OpenMRS startup time analysis

Yes, will do when I have this in a stable state.

Yes. However, I’m not running in Docker and we (PIH) are not yet running in Docker. We don’t publish different war variants. And we don’t publish different module variants. I haven’t tried running the PIH EMR distro on a version of Java > 8 in a few months - that’s something we plan to do as a follow-up to this 2.8 upgrade work - but last I tried it failed.

Makes sense. Testing Java > 8 after the 2.8 upgrade sounds like a good next step.

As a quick update on this topic, I did some further digging into the startup time of the distributions we use at PIH. The headline finding: most implementations will likely see their most expensive performance issues from the modules they have installed, rather than from a specific core version. The biggest initial improvements will likely come from upgrading, investigating, and fixing slow modules, rather than from shaving a second or two off of the core startup process (though both should proceed for sure).

A quick summary of our most widely used distribution at PIH, as analyzed from a local SDK instance:

The exact impact that the event module will have on a given implementation will depend on the number of events that are being listened for. The more events are used in an implementation, or the more modules that subscribe to events, the bigger the event module impact.

Key takeaway:

  • Upgrading attachments and event modules will likely have the biggest positive impact on startup times for distributions

Thanks @mseaton. The attachments module didn’t result in that sluggish startup in the O3 setup. I see you have 52 modules in the PIH distro, whereas O3 has only 31 modules, which contributed to the effect. OpenMRS Core 3.0.x introduces a way to avoid creating OpenmrsServices with XML, so hopefully we will eventually get rid of the issue that we hit in the attachments module when mixing XML and annotations. It was an issue that was fixed in a few modules in the past…

We have TRUNK-6421 to make it easier to discover such issues.

I’ll test O3 startup after upgrading the attachment and the event module.


Part of the reason I’m hoping we can get some better logging, etc. to measure performance is precisely because these performance and startup issues are often distribution-specific (this is largely because startup issues in the RefApp itself tend to be noticed faster).

For example, as you say, the attachments module doesn’t really cause issues in the O3 RefApp, but we’ve frequently had cases like this where the Spring configuration of two modules causes Spring to do something catastrophically slow that isn’t reproducible with either module in isolation.

Similarly, I expect the overall effect of the event module update on the O3 startup time to be 0, but that’s because the O3 RefApp doesn’t actually have any event listeners registered and the catastrophic performance hit only comes if you have at least two.

Happy to report back that I’m done implementing performance tests in TRUNK-6420.

It is a very simple setup. It’s enabled for openmrs-core 2.9.x and 3.0.x. It’s a suite of 3 tests.

In each test we run the openmrs-core, openmrs-platform and openmrs-referenceapplication-3-backend docker images and measure the mean startup time of 3 runs. Prior to the 3 measured runs we do an unmeasured run to install the app, which we don’t include in the mean startup time.

For 2.9.x we run versions of the images that contain the openmrs-core 2.8.x war, which gives us a baseline startup time, and compare them to versions of the images with a war file from the current 2.9.x build. For 3.0.x we use openmrs-core 2.9.x as the baseline.

The tests report back the mean startup time and the difference in seconds between versions. They are also designed to fail if the difference exceeds a given threshold. We allow for 10s of difference between versions on top of that, so that we don’t see failures caused by hardware glitches…
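The comparison logic described above can be sketched roughly as follows. This is an illustrative sketch only: the class and method names, numbers, and the way runs are fed in are all hypothetical, not the actual TRUNK-6420 test code.

```java
import java.util.Arrays;

public class StartupTimingSketch {

    /** Mean of the given measured run times, rounded to whole seconds. */
    static long meanSeconds(double... runSeconds) {
        return Math.round(Arrays.stream(runSeconds).average().orElse(0));
    }

    /**
     * Compares a candidate build against a baseline build: the candidate
     * may be slower by at most allowanceSeconds before the test fails.
     */
    static boolean withinAllowance(double[] baselineRuns, double[] candidateRuns,
                                   long allowanceSeconds) {
        long diff = meanSeconds(candidateRuns) - meanSeconds(baselineRuns);
        return diff <= allowanceSeconds;
    }

    public static void main(String[] args) {
        // One unmeasured warm-up run would happen first (to install the app);
        // only the three measured runs feed into the mean.
        double[] baseline = {30.2, 29.8, 30.5};  // hypothetical 2.8.x times
        double[] candidate = {36.1, 35.7, 36.4}; // hypothetical 2.9.x times
        System.out.println("baseline mean: " + meanSeconds(baseline) + "s");
        System.out.println("candidate mean: " + meanSeconds(candidate) + "s");
        System.out.println("within 10s allowance: "
                + withinAllowance(baseline, candidate, 10));
    }
}
```

With the made-up numbers above, the candidate is about 6s slower, which is within the 10s allowance, so the test would pass.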

This is what the tests report back:

[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.openmrs.StartupPerformanceIT
INFO - StartupPerformanceIT.compareStartupPerformance(192) |2025-12-08T11:24:39,795| openmrs/openmrs-platform:2.8.x started up in 30s, while openmrs/openmrs-platform:2.9.x started up in 36s with the latter starting slower by 5s
INFO - StartupPerformanceIT.compareStartupPerformance(192) |2025-12-08T11:28:06,754| openmrs/openmrs-core:2.8.x started up in 21s, while openmrs/openmrs-core:2.9.x started up in 24s with the latter starting slower by 2s
INFO - StartupPerformanceIT.compareStartupPerformance(192) |2025-12-08T11:36:14,612| openmrs/openmrs-reference-application-3-backend:3.6.x-no-demo started up in 51s, while openmrs/openmrs-reference-application-3-backend:3.6.x-no-demo started up in 54s with the latter starting slower by 3s
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1004 s -- in org.openmrs.StartupPerformanceIT

Taken from TRUNK-6420: Do not run performane test on old java versions · openmrs/openmrs-core@3844520 · GitHub

Performance tests do not run by default, so if you want to try them locally you need to run:

mvn clean install -Pperformance-test -Pskip-default-test

If you want to test your own distro, you simply adjust the docker image used for testing.

Performance tests are run on GHA for master and 2.9.x branches. They do not run for PRs.

I also adjusted our GHA for master and 2.9.x to run unit tests, integration tests and performance tests in parallel jobs so they complete faster for all builds (including PRs).

We can now easily compare startup time improvements in 2.9.x versus 2.8.x.

We may decide to run these tests as part of the O3 build as well, to be able to detect performance issues if things change in O3 and not just in openmrs-core. They are available in the dev variant image of openmrs-core, so it should be easy to run them in other builds.


@mseaton @ibacher FYI I just ran the O3 startup performance test on my machine with the released version of attachments and events module and compared it to the latest build with SNAPSHOT versions. It reported:

INFO - StartupPerformanceIT.compareStartupPerformance(192) |2025-12-08T14:08:07,027| openmrs/openmrs-reference-application-3-backend:2.8.x-local started up in 43s, while openmrs/openmrs-reference-application-3-backend:3.6.x-no-demo started up in 40s with the latter starting faster by 4s

It is indeed a negligible difference.

Created TRUNK-6495. I think we already have some good ways to monitor and troubleshoot startup performance issues and it’s a matter of documenting them in one place.


Thank you so much for this awesome work @raff :+1:

Do I need to do anything else other than just running mvn clean test -Pperfromance-test -Pskip-default-test?

I am getting this: gist:4c932d9d2724b456ad65f3df9f4398cc · GitHub

Misspelled performance… :slight_smile:

mvn clean install -Pperformance-test -Pskip-default-test

[corrected in the original post as well]

Also note that in the 3.0.0-SNAPSHOT the platform and O3 tests are disabled, as most modules do not run on 3.0.x yet.

Awesome!

Now I get this: gist:c0f183d708a1bf9308cc17eaf9f23a0f · GitHub

Don’t you have Docker installed? Is Docker Desktop enabled? It’s an issue with Testcontainers not finding your Docker install. See some possible fixes: https://stackoverflow.com/questions/61108655/test-container-test-cases-are-failing-due-to-could-not-find-a-valid-docker-envi

FWIW, I am running Docker Desktop version 4.15.0 (93002)

@raff it was taking me a long time to resolve this error, so I took the shortcut of locally using Testcontainers version 1.21.3 instead of 2.0.2, which did the trick. :slight_smile:

@raff is this some sort of rounding off error?

`INFO - StartupPerformanceIT.compareStartupPerformance(195) |2025-12-09T23:04:04,867| openmrs/openmrs-core:2.9.x started up in 10s, while openmrs/openmrs-core:3.0.x started up in 15s with the latter starting slower by 4s`

Yes, it’s rounding. I’ll adjust.
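For the curious, this kind of inconsistency appears when each duration is truncated to whole seconds independently before the difference is computed. A minimal illustration (the numbers are made up to match the shape of the log above, not taken from the actual test):

```java
public class RoundingIllustration {
    public static void main(String[] args) {
        // Hypothetical actual startup times in seconds
        double baseline = 10.9;
        double candidate = 15.3;

        // Truncating each value independently suggests a 5s gap...
        long truncatedDiff = (long) candidate - (long) baseline; // 15 - 10 = 5

        // ...while the real difference rounds to 4s.
        long exactDiff = Math.round(candidate - baseline); // round(4.4) = 4

        System.out.println(truncatedDiff + "s vs " + exactDiff + "s");
    }
}
```

So displaying times as 10s and 15s while reporting a 4s difference is consistent with the underlying fractional values; the fix is to derive the reported difference and the displayed times from the same rounding.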

Interesting. I bet you tried upgrading Docker Desktop… I would think it’s relying on some older client version.

For how long should I wait for this GHA to complete? Here is an example of a pull request that I am looking at: TRUNK-6231: Align Person with common attribute model and fix PersonAttribute pers… by Harish-Kumar-2049 · Pull Request #5591 · openmrs/openmrs-core · GitHub

I don’t think it runs perf tests on PRs anymore (see here). The problem is that the branch rule no longer correctly matches the build because it’s sensitive to the order of properties in the build strategy matrix. So the build looks hung because the build it is looking for no longer exists. I think I fixed this.

The branch rule referred to is the ruleset we use to ensure that the build passes on CI before the PR gets marked as mergeable.

Correct, performance tests are not run for PRs. They take a bit longer to complete and I didn’t want to slow down PRs. @ibacher what was the fix? I don’t see anything committed. Is there anything else to be fixed?

@raff what is the expected normal time that these tests should take to run? In other words, a time beyond which we would take it as a signal that something is wrong?