O3: Improving O3 Performance

Created a PR with a 3.x E2E login test. Please review the PR. cc @jayasanka @ibacher

Usually backend performance problems are DB-related (in my experience): bad SQL queries, misuse of Hibernate, etc. These are relatively easy to detect and solve; no profiler needed. But I guess this is not the case here, right?

No, you’re right that many backend issues, likely including OpenMRS’s, are probably the result of inefficient database queries. That was just intended as an analogy to the type of knowledge necessary to troubleshoot frontend performance issues (which is really what this post is about), e.g., page loads that take >=1s.
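
For a rough check of those >=1s page loads, here is a minimal TypeScript sketch using the browser’s standard Performance API. It is not an O3-specific utility, just the generic navigation-timing entry, and for a single-page app it only captures the initial document load, not lazily loaded frontend modules:

```typescript
// Rough page-load timing via the standard Performance API (generic browser API,
// not an O3-specific helper). Drop the type cast to paste it straight into the
// devtools console.
const [nav] = performance.getEntriesByType(
  "navigation"
) as PerformanceNavigationTiming[];

if (nav && nav.loadEventEnd > 0) {
  // startTime is 0 for the navigation entry, so loadEventEnd is the total load time
  console.log(`Full page load: ${nav.loadEventEnd.toFixed(0)} ms`);
  console.log(`Time to first byte: ${nav.responseStart.toFixed(0)} ms`);
}
```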


P.S. Worth noting the resources and approach the Bahmni folks are following, documented here: Performance Benchmarking and Capacity Planning https://bahmni.atlassian.net/wiki/spaces/BAH/pages/3038445574/Performance+Benchmarking+and+Capacity+Planning#like-section

Update from connecting with @eudson today about this:

  • The OHRI team is finding that performance in the OHRI demo is suffering so much that it requires weekly server reboots and regular cache-clearing. (@larslemos @alaboso any additional detail you want to add?)
  • Surprisingly though, performance seems better in the ohri.o3.openmrs.org combo environment. @larslemos is investigating why this might be. @ibacher when you are back from vacation next week, could you help with troubleshooting?
  • @alaboso has been wondering why there’s a redirection to /spaa instead of /spa → Thread: _____
  • @larslemos has discovered a component that takes too long to load, which is what makes the location picker feel slow.

New concern today re: cloud performance: @michaelbontyes has been trying to set up the O3 RefApp distro on an Azure instance (since he couldn’t get it working on his M1, as mentioned here). It seems like the frontend might require >1 GB of memory. He is using only our default RefApp with the demo data, nothing extra.

Michael will try increasing memory above 1 GB and see if that resolves his issue. If it does… it means we have some clear memory performance issues to address. We’ll look forward to hearing your findings, @michaelbontyes!

CC @dkigen @raff @ibacher @dkayiwa @achachiez

I am happy to report that upgrading the Ubuntu 20.04 VM on Azure from 1 vCPU/1 GiB of memory (Standard B1s) to 2 vCPUs/4 GiB of memory (Standard B2s) resolved the issue. I summarized the cloud infra in a diagram below.

I was able to run the initial setup within 10 minutes, and the backend/frontend are now running and pretty stable. I didn’t do any load testing, though.
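
For anyone repeating this on a different VM size, here is a minimal sketch (assuming Node.js is available on the VM; purely illustrative, not part of the RefApp setup) to check memory headroom before and after resizing:

```typescript
// Quick VM memory headroom check, assuming Node.js is installed on the instance.
// Run with e.g. `npx ts-node check-memory.ts`. Note that on Linux os.freemem()
// excludes reclaimable page cache, so `free -h`'s "available" column will read higher.
import * as os from "node:os";

const toGiB = (bytes: number): string => (bytes / 1024 ** 3).toFixed(2);

console.log(`Total memory: ${toGiB(os.totalmem())} GiB`);
console.log(`Free memory:  ${toGiB(os.freemem())} GiB`);
```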

@ibacher, @grace, thank you for your help!


Thank you for this extremely helpful diagram & test @michaelbontyes!

Oy vey… @ibacher does this mean our frontend currently needs >1 GB of memory? :fearful: @raff @dkigen & @jayasanka any thoughts on how we could test/confirm this?

Reading the diagram, everything is being deployed on the same VM, thus sharing 4 GB of memory. I would say that is the minimum needed.

1 GB is definitely way too low to run the backend alone; it needs 2-3 GB at least. The frontend alone should be fine with 256-512 MB of RAM for thousands of active connections. At that scale, bandwidth will be more of an issue for the frontend than memory.


As @raff said, it’s actually the backend that consumes most of the memory, and actual usage depends on things like the number of concurrent users. E.g., on my laptop, the backend consumes a bit under 1 GB with at most one concurrent user; dev3 currently uses around 2.2 GB. On the other hand, the frontend on dev3 is currently using <10 MB.
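
In case it helps with the “how could we test/confirm this?” question above, here is a minimal sketch of how one could pull per-container memory numbers on an instance running the stack under Docker. It assumes the dockerode Node package and access to the Docker socket; container names and the exact memory_stats fields vary by setup, so treat it as illustrative rather than an official tool:

```typescript
// Per-container memory snapshot via the Docker API, assuming the stack runs under
// Docker and the "dockerode" package is installed. Illustrative only; field names
// in memory_stats can differ slightly by platform/cgroup version.
import Docker from "dockerode";

const docker = new Docker(); // defaults to the local Docker socket

async function main(): Promise<void> {
  const containers = await docker.listContainers();
  for (const info of containers) {
    // stream: false returns a single stats snapshot instead of a live stream
    const stats = await docker.getContainer(info.Id).stats({ stream: false });
    const usedMiB = stats.memory_stats.usage / (1024 * 1024);
    const limitMiB = stats.memory_stats.limit / (1024 * 1024);
    console.log(
      `${info.Names[0]}: ${usedMiB.toFixed(0)} MiB used / ${limitMiB.toFixed(0)} MiB limit`
    );
  }
}

main().catch(console.error);
```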


Thanks both! This is very reassuring, I think - what do you think @michaelbontyes?

Wow, 2.2 GB is so different from 10 MB - @ibacher why does your machine’s frontend require so much more memory on your dev3 instance?

The 2.2 GB number is for the backend on dev3… I was using it as an example of a much more heavily used instance.


Meaning… hundreds of users? Millions of patients?

Definitely many fewer patients (< 100). The real driver for memory consumption, though, will be the number of simultaneous users, which we currently don’t have a great way to capture. In any case, at any kind of scale like that, we’re hopefully talking about having a dedicated DB server and a beefier instance. To put it simply, to run something like AMPATH’s instance, you’d need a similarly beefy server.
