Releasing O3 Ref App 3.0.0-rc

Dear O3 Community,

Namely: @AMPATH @Mekom @OHRI @PalladiumKenya @PIH @UWash

We would like to suggest a way forward to obtain a release candidate for O3 now that we have been through multiple alpha and beta releases.

The process would be the following:

  1. On the latest beta, identify the features/screens/widgets/etc that are not sufficiently stable to be part of the release.
  2. Create a new branch 3.1.x off main.
  3. On branch main: ideally remove (or alternatively disable) those not-yet-ready features/screens/widgets.
  4. Make a release candidate from main.
  5. Do an in-depth QA of the release candidate.
  6. Patch the release candidate as long as necessary on main.
  7. Produce release notes for the release candidate, make the necessary announcements, etc.
  8. Eventually, resume main from 3.1.x, and branch 3.0.x off main (for backporting further 3.0 patches when needed).

In the future we would just repeat this process when needing a release candidate for 3.1, 3.2, etc.
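To make the branch naming in the steps above concrete, here is a minimal sketch that derives the two branch names from a release version. The function name `releaseBranches` and the `ReleaseBranches` shape are hypothetical, introduced only for illustration; nothing like this exists in the O3 repos as far as this thread states.

```typescript
// Hypothetical helper: derive the branch names used in the proposed process
// from a release version such as "3.0.0" or "3.0.0-rc".
interface ReleaseBranches {
  maintenance: string; // receives backported patches, e.g. "3.0.x"
  nextMinor: string; // holds work deferred to the next minor, e.g. "3.1.x"
}

function releaseBranches(version: string): ReleaseBranches {
  // Accepts MAJOR.MINOR.PATCH with an optional pre-release suffix ("-rc").
  const match = /^(\d+)\.(\d+)\.\d+(?:-[0-9A-Za-z.-]+)?$/.exec(version);
  if (match === null) {
    throw new Error(`Not a recognisable release version: ${version}`);
  }
  const major = Number(match[1]);
  const minor = Number(match[2]);
  return {
    maintenance: `${major}.${minor}.x`,
    nextMinor: `${major}.${minor + 1}.x`,
  };
}
```

With this sketch, `releaseBranches("3.0.0-rc")` yields `maintenance: "3.0.x"` and `nextMinor: "3.1.x"`, and the same call with `"3.1.0"` yields `"3.1.x"` and `"3.2.x"`, which is the repetition described for later releases.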

I would like to hear your thoughts on the above. Presumably we would need to discuss this on a TAC call?

Cc @burke @dennis @dkayiwa @frederic.deniger @grace @ibacher @mogoodrich @mseaton


@mksd I think this is indeed needed, and I am looking forward to discussing this during the TAC call this week or next. @OHRI is in its implementation phase, and we often get the same question from different countries about a stable release of O3. I would like to suggest that we include the work currently happening in the #o3-performance “squad” in this release.

Sounds good to me… though I’m not well-versed enough in the overall structure to have a very informed opinion.

Note that I do know that the Dispensing app is definitely very much a work-in-progress, so I would strongly recommend removing it from the release.

Take care, Mark

Thanks @mogoodrich, and that’s fine of course, as long as we can all go through the exercise together of associating features with a target version for which we can produce:

  1. A stable reliable version to which patches can be backported.
  2. Official release notes that can be shared publicly.

I also discussed this briefly with @ibacher over a call yesterday, and he has a couple of good guesses about what can and cannot be released.

@eudson sure, let’s discuss this very soon at a TAC. @grace? I also requested this here on Slack.

Can we use this to also come up with some sort of roadmap that we can clearly document as we used to do here? Technical Road Map - Documentation - OpenMRS Wiki

@dkayiwa yes, absolutely. Once we have decided what goes into which version (here, basically separating what goes into 3.0 from what should be planned for 3.1), we will be in a position to start documenting this transparently on the wiki or wherever appropriate.

Good discussion on this on today’s TAC call.

I’ll try to summarize our general consensus (probably with mistakes):

  • Define a release manager (for initial 3.0 release, likely multiple people) to oversee & facilitate release
  • Document the release process. For our first release of 3.x, this would include not only public documentation of what is going into 3.0.0 (release notes and progress tracking pages) but also creating the first draft of a release process “recipe” for future 3.x releases (like we’ve done for 2.x and platform).
  • As a community, prioritize getting a 3.0.0 release
    • While we work to get the main branch ready to cut a 3.0.x branch, we defer merging broken or partially working features unless they are placed behind a development toggle (e.g., if (process.env.NODE_ENV === 'development') { ... })
    • Work as quickly as we can to get the feature set for 3.0.0 defined (and documented in a release planning wiki page), so we can cut a 3.0.x branch. This branch would define explicit versions of features/ESMs to be included in 3.0.0. Any additional changes needed for the 3.0.0 release would be cherry-picked from main
  • Work toward a culture of “ready for release” development, where development that’s not ready for release is limited to NODE_ENV == 'development' builds.
    • dev3 would run code built in development mode (to see bleeding edge changes)
    • test3 would run code built in production mode (to see app without features in development)
    • uat3 could be used to build from a pending release branch (if we want/need an environment that excludes features not specifically planned for an impending release)
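As an illustration of the development toggle and the dev3/test3 split described above, here is a minimal sketch. The `Feature` shape, the `developmentOnly` field, and the function names are hypothetical, and the example feature list is illustrative only; in a real build the mode would presumably come from `process.env.NODE_ENV`.

```typescript
// Build modes as named in the bullets above.
type BuildMode = "development" | "production";

// Hypothetical descriptor: `developmentOnly` marks work that is not yet
// considered ready for release.
interface Feature {
  name: string;
  developmentOnly: boolean;
}

// A feature is visible if it is release-ready, or if this is a development build.
function isFeatureVisible(feature: Feature, mode: BuildMode): boolean {
  return !feature.developmentOnly || mode === "development";
}

// Names of the features a given build mode would expose.
function visibleFeatures(features: Feature[], mode: BuildMode): string[] {
  return features.filter((f) => isFeatureVisible(f, mode)).map((f) => f.name);
}

// Illustrative data only: "dispensing" is described in this thread as a work
// in progress; "patient-chart" merely stands in for a release-ready feature.
const exampleFeatures: Feature[] = [
  { name: "patient-chart", developmentOnly: false },
  { name: "dispensing", developmentOnly: true },
];
```

Under these assumptions, a development build (as on dev3) would list both features, while a production build (as on test3) would list only `patient-chart`.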

The “exception” for 3.0.0 would be to delay (as briefly as we can) works in progress from being merged to main while we get our heads around cutting a 3.0.x branch unless that work is limited to development builds.

There has to be a lead/overall release manager, who is answerable for the general progress of the release process. Otherwise we shall fall into the trap of multiple people waiting on each other to do the thing.

I have not seen devs merging features which they know are broken. It has always been after merging and testing that we somehow figured out that this or that has got broken. So, they will not put features behind a development toggle, because they assume nothing is gonna get broken. The point I am driving at is that we should cut 3.0.x and not pretend that we shall safely restrict merges on main. We do not want to create a situation where people are scared of merging on main. We are still in need of lots of new features on main for a fully working EMR, and anything which will slow down main is something we should avoid. Going forward, we should improve our automated tests to the level where they fail on pull requests that break something, so that those pull requests do not get merged.

How brief is this going to be? Is it hours, days, weeks? And in what ways is this delay going to help us better than just cutting a 3.0.x branch? Isn’t it safer to just cut a 3.0.x branch than to leave it at the mercy of manually gatekeeping main?

One area where I have seen this in the past is where we have a well-designed set of mock-ups for a particular feature, and efforts are made to match these mock-ups (with dummy data and stubs), particularly if the front-end is waiting on back-end endpoints or designs. Hopefully we can avoid this practice moving forward, but I just wanted to highlight it as one way this has happened.

Fair point. Perhaps, for such a big release, we should say a release manager with a release management team/task force behind them.

Another fair point. Instead of “broken,” I could better have said “not planned to be part of 3.0.0.”

I agree that cutting a 3.0.x branch would avoid the pain of trying to prepare the release on the main branch. I believe the fundamental concern is that we don’t want the release of 3.0.0 to be a “side project” for the community handled by a few people. We need as many people as possible to help get this ship in the water.

Perhaps getting a release manager with a small team to start identifying & documenting what will and won’t be part of 3.0.0 will help answer these questions. If the bulk of the work needed is setting explicit versions of ESMs, then that can be done in a 3.0.x branch. If there is more work needed, then the time needed and a strategy for getting it done (whether in main or a 3.0.x branch) can be decided based on the specific work needed.
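To illustrate what “setting explicit versions of ESMs” could look like as a check, here is a minimal sketch that lists modules whose requested version is not an exact one. The manifest shape (module name mapped to a version string) and the name `findUnpinnedModules` are assumptions made for this example, not a description of the actual O3 build configuration.

```typescript
// Hypothetical manifest shape: module name -> requested version.
type ModuleVersions = Record<string, string>;

// Exact versions only, e.g. "4.0.1" or "4.0.1-rc.1". Ranges ("^4.0.0") and
// dist-tags ("next", "latest") are treated as not pinned.
const EXACT_VERSION = /^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?$/;

// Returns the names of modules that still need an explicit version,
// sorted alphabetically so the output is stable.
function findUnpinnedModules(modules: ModuleVersions): string[] {
  return Object.entries(modules)
    .filter(([, version]) => !EXACT_VERSION.test(version))
    .map(([name]) => name)
    .sort();
}
```

A release branch could then be considered ready, in this respect, once the function returns an empty list for its manifest.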

AFAIK these are gone.

This is why I suggested that the standard for keeping something under a feature flag is just that no e2e tests have been written to cover the feature yet. It’s not about the feature being “broken”; it’s about establishing that we have a way to check that the feature works as expected before releasing it.
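Stated as code, the rule could look something like the sketch below. The `FeatureStatus` shape and both function names are hypothetical, and how e2e coverage would actually be recorded is left open here.

```typescript
// Hypothetical record of whether a feature has e2e tests covering it.
interface FeatureStatus {
  name: string;
  hasE2eTests: boolean;
}

// The proposed criterion: no e2e coverage yet means the feature stays
// behind a feature flag, regardless of whether it is believed to work.
function needsFeatureFlag(feature: FeatureStatus): boolean {
  return !feature.hasE2eTests;
}

// Split a feature list into what could ship and what stays flagged.
function splitForRelease(features: FeatureStatus[]): { releasable: string[]; flagged: string[] } {
  const releasable: string[] = [];
  const flagged: string[] = [];
  for (const feature of features) {
    if (needsFeatureFlag(feature)) {
      flagged.push(feature.name);
    } else {
      releasable.push(feature.name);
    }
  }
  return { releasable, flagged };
}
```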

I don’t see that branching here really solves the problem of manual gatekeeping. A couple of points:

  1. For the most part, I’m comfortable that what’s in the 3.x branch of OpenMRS Distro Reference Application (basically some Docker files, metadata, and distribution setup) is in a state where we don’t need a freeze. We have a release process for this that, while manual, works reasonably well.
  2. Where we need the branching strategy is actually in the frontend modules themselves, so something like 8 or so different repos.
  3. The idea of the freeze (so what we would be trying to replace with branching) is essentially to allow us to mark certain features in the code that are determined to be out-of-scope such that they won’t be available in the production build (we still need to identify which features are in-scope and out-of-scope for the 3.0 release). Ultimately, this is a pattern we want adopted on main. So our options are: branch and maintain two branches per repo up until the point where we release, or have a brief pause to implement the feature flags and then allow main-line development to continue with new code adopting the new pattern.
  4. Realistically speaking, almost all frontend development (at least in the parts we’re likely to target for 3.0) is done by four organisations: Palladium, Mekom, Brown, and OpenMRS. These organisations account for 92% of the commits in the last version of esm-core, 85% of the commits to the last version of esm-patient-chart, and 92% of the commits to the last version of patient-management. Those are also, for the most part, the orgs that we’re asking to not implement features. To contextualise a bit: when I suggested having a “pause”, concretely this means that development continues as normal up until the point where we work out what features are out-of-scope and a strategy for implementing feature flags, then we ask these four organisations to implement the feature flags for those out-of-scope features, then we resume normal development (with a feature-flag pattern in place that we then expect new contributors to adopt).
  5. Maybe the language of “pause” is a bit of a misnomer: we want to direct community development efforts towards the concrete things we need to have happen to get the code base in a state where we can release it.
  6. I absolutely agree with the underlying principle that we don’t want to do a “code-freeze” process or that if a “code-freeze” were what we were talking about, branching would make sense. But this isn’t about “freezing” code. This is about taking the time necessary to implement the patterns that allow us to continue working without needing to freeze the code.