EMR getting terribly slow on opening patient queues page

sudhamsh · September 27, 2018, 5:50am

One of our deployments reported that the EMR becomes very slow and unusable for a few minutes. The issue persists until they restart MySQL.

On analysis, we found that the patient queues page causes heavy CPU spikes. If more than one user loads the page at the same time, CPU utilization climbs to 99.5% and sometimes to 100%.

As a result, the page does not load at all for a few users, and for others it is very slow.

Currently, we have 11 queues loading at the same time on the patient queues page when the programs page is opened.

Regards

pramidat · September 27, 2018, 6:59am

One thing we have observed is that all the queues get loaded after landing on the page, though we only need the counts for all queues and the results of the one highlighted queue. We can check whether returning only counts for the queues that are not highlighted improves performance.
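A rough TypeScript sketch of that idea, assuming the backend could answer a count-only request. `QueueConfig`, `QueueRequest`, `planQueueRequests`, and the `countOnly` mode are hypothetical names used for illustration, not part of the existing patient search API.

```typescript
// Hypothetical shapes: these names are illustrative, not Bahmni's API.
interface QueueConfig {
  name: string;
}

interface QueueRequest {
  queue: string;
  // "full" returns the patient rows, "countOnly" returns just the total.
  // A count-only mode on the server is an assumption of this sketch.
  mode: "full" | "countOnly";
}

// Decide, per queue, whether the page needs the rows or only the count.
function planQueueRequests(
  queues: QueueConfig[],
  highlighted: string
): QueueRequest[] {
  return queues.map(
    (q): QueueRequest => ({
      queue: q.name,
      mode: q.name === highlighted ? "full" : "countOnly",
    })
  );
}
```

With 11 queues this would still send 11 requests, but only one of them would return full results.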

snehabagri · December 3, 2018, 10:06am

Hi,

We did a good amount of analysis on the performance of the queues and discovered that too many requests going out from the programs tab lead to heavy CPU spikes.

To overcome this, we would like to propose the following solution:

Make the calls to the queues serially, so that the page does not send too many requests at the same time (this could be behind a feature toggle).
Debounce the calls to the patient search API when someone keeps clicking on the tabs. (A debounced function delays invoking func until after wait milliseconds have elapsed since the last time the debounced function was invoked.) Today, we keep making patient search API calls with the same params within a fraction of a second, which adds no value. Restricting them would improve performance and prevent unnecessary network calls.
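A minimal TypeScript sketch of the first proposal, assuming each queue load can be wrapped in a function that returns a promise. `Task`, `loadQueues`, and the `serial` flag (standing in for the feature toggle) are hypothetical names, not Bahmni's actual code.

```typescript
// A task starts one request when called and resolves with its result.
type Task<T> = () => Promise<T>;

// Load all queues either one at a time or all at once.
// `serial` plays the role of the proposed feature toggle (an assumption).
async function loadQueues<T>(tasks: Task<T>[], serial: boolean): Promise<T[]> {
  if (!serial) {
    // Current behaviour as described in the thread: fire every request together.
    return Promise.all(tasks.map((task) => task()));
  }
  const results: T[] = [];
  for (const task of tasks) {
    // The next request is not started until the previous one has resolved.
    results.push(await task());
  }
  return results;
}
```

In serial mode the server sees at most one of these requests in flight per page load, at the cost of a longer total load time for the page.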

@swathivarkala @binduak @pramidat @angshuonline

arjun · December 3, 2018, 10:30am

I remember I had pointed out this issue, and it was addressed some time back (at least a year ago). I hope you are on the latest Bahmni. If it’s happening again, maybe it’s a regression.

Having 11 queues showing up for each user is probably a design problem. Maybe, rather than fixing performance (which may take more effort for diminishing returns beyond a certain point), a feature to configure and show user-specific queues should be added to solve this problem.

snehabagri · December 3, 2018, 11:51am

Hi @arjun,

I see unnecessary calls being made multiple times, without the data from the previous call actually being used, as a code design issue. Using debounce solves that.

We already have privileges set for each queue; only a few highly privileged users see all the queues. But making unwanted calls to the server hurts performance.

angshuonline · December 4, 2018, 7:09pm

@snehabagri I thought we decided on multiple options for optimization. Is that articulated/captured somewhere?

snehabagri · December 5, 2018, 10:28am

Hi @angshuonline,

Yes, we thought of the following approaches:

Refactor Queries

Result: Most queries were already optimized; we were able to bring the processing time down from 5 sec to less than 2 sec for 3 queries. That did not improve the performance of the queues.

Use the YourKit trial version to profile the Java process and see which function takes the most time and causes the CPU spikes.

Result: OpenMRS handles the creation of connections, so we could not experiment with connection pooling and statement pooling configurations. But we discovered that ‘statement.executeQuery’ took the most CPU time. Too many calls to this method at the same time caused the CPU to spike to 100% utilization, and the application could not recover from it for some time.

Check whether we have binding issues, such as the use of bind twice or bind once.

Result:

Checked data binding; it was already one-way binding, so there was nothing to improve there.
Tried infinite scrolling: loading limited data first, with more data loaded on clicking “Load more”. But performance was the same.

Try UI-side caching: not viable, since we need live data on every click.

The final conclusion was to reduce the number of concurrent calls to the patient search API, so we implemented that using debounce.
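For readers unfamiliar with the technique, here is a minimal trailing-edge debounce in TypeScript. It is an illustrative sketch, not the code that was committed; the `debounce` helper below is written from scratch, and libraries such as lodash provide an equivalent `_.debounce(func, wait)`.

```typescript
// Returns a wrapper that postpones `fn` until `waitMs` milliseconds have
// passed without another call. Only the arguments of the last call are used.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A): void => {
    // A new call cancels the pending one, so rapid clicks collapse into one.
    if (timer !== undefined) {
      clearTimeout(timer);
    }
    timer = setTimeout(() => {
      timer = undefined;
      fn(...args);
    }, waitMs);
  };
}
```

Wrapping the tab-click handler this way means that several quick clicks on the queue tabs trigger a single patient search request, for the tab that was clicked last. The wait value is a trade-off: a longer wait suppresses more duplicate calls but delays the first visible result by the same amount.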