O3: Merging Monorepos

zacbutko · September 2, 2022, 7:35am

Hi Team,

We’ve been making good shifts to the codebase to help modernize it, clean cruft, and extend our testing coverage. One idea I’ve been floating in the community for months now is to merge our monorepos together. I believe that now is the time to start talking about whether, what, when, and how to do this.

Should we merge monorepos?

Having a monorepo structure instead of individual packages scattered across an open-source community was a big win for the project, as it generally is for all projects that use it. Just read Babel’s motivation on the topic. In fact, it’s so well stated I’ll just copy paste most of it here.

Juggling a multimodule project over multiple repos is like trying to teach a newborn baby how to ride a bike.

Babel follows a monorepo approach, all officially maintained modules are in the same repo.

This is quite taboo but let’s look at the pros and cons:

Pros:

Single lint, build, test and release process.

Easy to coordinate changes across modules.

Single place to report issues.

Easier to setup a development environment.

Tests across modules are run together which finds bugs that touch multiple modules more easily.

Cons:

Codebase looks more intimidating.

Repo is bigger in size.

Can’t npm install modules directly from GitHub

???

(That last point about not being able to install modules directly is actually an issue from 2012 which is now OBE thanks to Yarn’s workspaces)

OpenMRS has adopted a monorepo strategy for all of the reasons above, but somehow ended up with multiple monorepos, which reintroduces the exact complexity monorepos hope to eliminate. For developing peripheral applications this is appropriate, but for the core ‘reference application’ this unnecessary complexity is causing some real problems.

Large delays to work due to time spent keeping tooling and all dependent packages up to date. I estimate that, in my workflow, switching contexts costs on average a full hour more of every dev day than it should.
We want to start making centralized reusable components. Right now we don’t have a good way to export libraries from one monorepo for use in another monorepo. I once tried adding a build step to esm-patient-common-lib to use one of its functions in a different project and it took down dev3 for several days. Fixing that is, I think, not the right problem to be solving. Instead we should lean into the @openmrs/esm-framework core library we already have, but do away with having to develop in one monorepo while testing its effects in another.
We want to reduce loading time by removing the use of extensions where a direct component could be used. Currently this is hard to do because we don’t have full-scale integration tests, so if we remove an extension in one monorepo we don’t know if it will cause other modules to fail, since all of our tests just mock the interface. Having integration tests across multiple monorepos is an interesting project which is currently underway, but moving the core features to the same monorepo means we could run better tests on every push and know instantly when interdependent features are missing their dependency.

This list is starting to look like Babel’s list in favor of switching to monorepos so I will stop here.
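To make the third point above a bit more concrete, here is a minimal sketch of the kind of cross-package check a single monorepo could run on every push. Everything in it is an assumption for illustration: the `PackageManifest` shape, its field names, and the sample package and extension names are hypothetical and are not the real O3 extension or routes format.

```typescript
// Hypothetical manifest: what a package registers and what it expects others to register.
// This is NOT the real OpenMRS extension format, just an illustration.
interface PackageManifest {
  name: string;
  providesExtensions: string[];
  usesExtensions: string[];
}

// Returns, for each package, the extensions it uses that no package in the repo provides.
function findMissingExtensions(manifests: PackageManifest[]): Map<string, string[]> {
  // Collect every extension name provided anywhere in the repo.
  const provided = new Set<string>();
  for (const manifest of manifests) {
    for (const ext of manifest.providesExtensions) {
      provided.add(ext);
    }
  }

  // Report any used extension that nobody provides.
  const missing = new Map<string, string[]>();
  for (const manifest of manifests) {
    const absent = manifest.usesExtensions.filter((ext) => !provided.has(ext));
    if (absent.length > 0) {
      missing.set(manifest.name, absent);
    }
  }
  return missing;
}

// Example with made-up packages: "chart" depends on a widget that was removed.
const exampleManifests: PackageManifest[] = [
  { name: "banner", providesExtensions: ["patient-banner"], usesExtensions: [] },
  { name: "home", providesExtensions: [], usesExtensions: ["patient-banner"] },
  { name: "chart", providesExtensions: [], usesExtensions: ["vitals-widget"] },
];
const missingExtensions = findMissingExtensions(exampleManifests);
```

With all core packages in one repo, a check along these lines could see every manifest at once, which is exactly what mocked interfaces in separate monorepos cannot do.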

Which monorepos should be merged?

My hope is to combine all reference application packages into one monorepo. Currently the definition of which modules appear on dev3 is the list in the DigitalOcean importmap.json, plus those apps already in the esm-core repo, plus the form engine.

["@openmrs/esm-home-app","@openmrs/esm-form-entry-app","@openmrs/esm-patient-chart-app","@openmrs/esm-patient-registration-app","@openmrs/esm-patient-biometrics-app","@openmrs/esm-patient-banner-app","@openmrs/esm-patient-appointments-app","@openmrs/esm-patient-forms-app","@openmrs/esm-patient-vitals-app","@openmrs/esm-patient-immunizations-app","@openmrs/esm-patient-notes-app","@openmrs/esm-patient-medications-app","@openmrs/esm-patient-conditions-app","@openmrs/esm-patient-attachments-app","@openmrs/esm-patient-programs-app","@openmrs/esm-patient-allergies-app","@openmrs/esm-patient-test-results-app","@openmrs/esm-patient-clinical-view-app","@openmrs/esm-patient-search-app","@openmrs/esm-patient-list-app","@openmrs/esm-active-visits-app","@openmrs/esm-generic-patient-widgets-app","@openmrs/esm-outpatient-app","@openmrs/esm-fast-data-entry-app","@openmrs/esm-appointments-app","@openmrs/esm-cohort-builder-app","@openmrs/esm-openconceptlab-app","@openmrs/esm-devtools-app","@openmrs/esm-implementer-tools-app","@openmrs/esm-login-app","@openmrs/esm-offline-tools-app","@openmrs/esm-primary-navigation-app","@openmrs/openmrs-ngx-formentry"]

That translates into:

When should we take this on?

Selfishly I have some projects due at the end of September, so hopefully we can get started after that. October? November?

How should we do this?

Our recent campaign to upgrade the core libraries was a little rocky, and I think some lessons can be learned.

Dedicate time to do this. The 4.0 upgrade was a collaboration of several people (thanks again @ibacher @bistenes @dkigen @pirupius) who dedicated some of their free time to the upgrade. This part-time asynchronous effort meant that the bumps and hiccups dragged on and on and on. In the end innocent bystanders had their workflow impacted because we weren’t able to (and still haven’t) iron out all the kinks. Dedicating a week of concerted effort will make sure this gets done and tested with support from all sides. This means no meetings, no deliverables, no distractions for the team doing this until O3 is running as well as or better than pre-migration. Think of it as giving your devs a holiday, except they spend the holiday getting to tackle tech debt. And then give them a real holiday after.
Lean on testing. One of the things I was really hoping to have before the 4.0 upgrade was an implemented integrated test strategy - automated or manual. Luckily we had some unit tests which caught some potential bugs, but as stated above the complexity of O3 really requires integrated testing due to its extension system. A lot of failures fell through the cracks. If anyone from the QA team is willing to help step in here it would be greatly appreciated.
Plan ahead. Before we “pull the trigger” on this project I suggest using this talk post and some meetings of interested participants to define the monorepo structure, the dev tooling we will want to support, the ideal dev-x in this new system, impacts to CI, and how this affects implementation sites. I think we should have these questions answered, and potentially a road map for the order of actions in the actual migration, plus fallback options.

Going forward

Thanks for reading this missive. I’m interested to hear if this is a bad idea, will negatively impact a project, or is in any way antithetical to the OpenMRS project. Or if you think it’s a good idea that’s great to hear too.

ibacher · September 2, 2022, 4:21pm

I think this is a great idea! (As you probably already know). There are some obvious technical points that we should consider how to address ahead of time:

Versioning: This is probably the biggest thing. Right now the main functional benefit the multiple-monorepo setup gives us is consistent versioning for sets of apps (e.g., all the apps in patient-management have the same version numbers). We probably have to make a choice around whether we want to continue versioning apps in chunks, version everything at once, or adopt a more granular versioning scheme.
Publishing: (Related to the above) Right now we publish each app to npmjs.com and the hosted importmap on every push to the main branch. This seems like it would be unsustainable with 30+ apps at a time.
Weird corner cases: Right now, for example, the App Shell embeds a set of coreapps into a core import map. Maybe this can be done away with, maybe not. Similarly, esm-patient-common-lib is a bit of an oddity. Parts of it likely make sense in the framework, but parts of it are quite specific to the layout provided by patient-chart-app (e.g., dashboard links & workspaces).
Package commands: the patient-chart repo has a script that’s already a bad idea (start-all). This becomes an even worse idea with even more packages. This is something we’ll have to deal with in the ongoing CI pipeline and E2E testing work too (i.e., how do we start all and only the components we need to test).
Git Hooks: We use Git hooks pretty extensively for some verifications. This already means that pushing a branch ends up taking a while, primarily due to linting, type-checks, and running tests. Again, I think we’d want to see if we can come up with some way of isolating these (very necessary) checks a bit so we’re not creating an exponential drag on developer productivity.

We will probably need some pre-planning time to strategise on these points and others I may not have thought of.
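As a rough illustration of what “isolating” could mean for publishing, start scripts, and Git hooks alike, here is a minimal sketch that works out which packages are affected by a set of changed files, so that only those get linted, tested, or published. The `WorkspacePackage` shape, the directory layout, and the dependency lists in the example are all assumptions made up for this sketch; existing monorepo tools offer similar “affected” or “changed” filtering and would likely be a better choice than hand-rolling it.

```typescript
// Hypothetical description of one package in the merged monorepo.
interface WorkspacePackage {
  name: string;
  dir: string; // directory relative to the repo root (assumed layout)
  dependsOn: string[]; // names of other packages in the same repo
}

// Returns the names of packages that changed directly, plus every package
// that depends on them (directly or transitively), in the input order.
function affectedPackages(packages: WorkspacePackage[], changedFiles: string[]): string[] {
  const affected = new Set<string>();

  // Step 1: packages that contain at least one changed file.
  for (const pkg of packages) {
    const prefix = pkg.dir.endsWith("/") ? pkg.dir : pkg.dir + "/";
    if (changedFiles.some((file) => file.startsWith(prefix))) {
      affected.add(pkg.name);
    }
  }

  // Step 2: keep adding dependents until nothing new is found.
  let grew = true;
  while (grew) {
    grew = false;
    for (const pkg of packages) {
      if (!affected.has(pkg.name) && pkg.dependsOn.some((dep) => affected.has(dep))) {
        affected.add(pkg.name);
        grew = true;
      }
    }
  }

  return packages.map((pkg) => pkg.name).filter((name) => affected.has(name));
}

// Example with an assumed layout: a change to the framework touches the login
// app (which depends on it here) but not the unrelated devtools app.
const workspace: WorkspacePackage[] = [
  { name: "@openmrs/esm-framework", dir: "packages/framework", dependsOn: [] },
  { name: "@openmrs/esm-login-app", dir: "packages/apps/login", dependsOn: ["@openmrs/esm-framework"] },
  { name: "@openmrs/esm-devtools-app", dir: "packages/apps/devtools", dependsOn: [] },
];
const toCheck = affectedPackages(workspace, ["packages/framework/src/index.ts"]);
```

A pre-push hook or CI job could then run lint, type-checks, and tests only for the packages in `toCheck`, and a release job could publish only those.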

mseaton · September 6, 2022, 2:02pm

@zacbutko thanks for laying this out. There is a lot in the post(s) above, and I’m certainly not as knowledgeable about the MFE landscape as you or @ibacher, so would tend to defer to you and others who are working hard on O3 day-to-day on this. My only worry is that, in moving all of the core, supported packages into a single monorepo, we might fail to adequately test and support the frameworks that non-core modules will rely upon to plug into this framework, whether shared components, the extension system, or whatever.

So if we’d be making things easier for those who are developing in these core modules, but not forcing ourselves to tackle and improve those issues that will be faced by those developing packages outside of this sphere, then this might be something to think about.

zacbutko · September 29, 2022, 10:59am

Just wanted to provide an update reflecting recent conversations about this outside of talk.

I think that it is possible to move forward with this transition using a much more gradual approach than I was first imagining. The fact is that we already have “apps” inside the openmrs-esm-core repo (devtools, implementer-tools, login, offline-tools, and primary-navigation), so the final structure we hope to achieve is already partly accomplished. I think we can adjust the tooling and building of packages already in esm-core to solve the issues @ibacher brings up, and then gradually add packages in one at a time to slowly test the process. A good first package to test with is probably openmrs-esm-home.

@mseaton, your worry about non-core packages is a valid concern. At Mekom we are currently working on packages which are not designed to be a part of the core ref-app, so I am personally invested in a solution which will also work for those packages. Integration tests that cover any possible combination of external packages will always be tricky, and that is something we need to develop support for. I think the current effort from the QA team is a good example of how this can work.

The goal of the migration is not to disregard external packages but to improve relationships between the core components, allow us to more easily establish and adopt design patterns, and enable integrated CI testing of the core packages.