Hi Bahmni team!
While setting up Bahmni on a remote CentOS server, I’ve noticed that the performance is not so great.
It takes very long to load pretty much any page.
For instance, loading the patient search page /bahmni/clinical/index.html#/default/patient/search takes between 30 and 40 seconds.
The server I’ve set up matches the official requirements:
In order to find out what happens in more detail, I’ve compared the performance of an OpenMRS page running from a Bahmni server vs. the same page on a server that runs the OpenMRS Reference Application only.
Let’s load /openmrs/admin/index.htm on the two different servers and see what happens…
Bahmni server: Total load time is 8s
And below the specific case of retrieving the index.htm document (1.64s)
OpenMRS server: Total load time is 2s
And below the specific case of retrieving the index.htm document (700ms)
I see two problems:
#### 1/ All data loaded from the Bahmni server is SSL-encrypted.
#### 2/ 530ms is spent in ‘Stalled’ status.
This applies to every resource as well. But it is specific to Chrome (I am running version 51.0.2704.84). In Firefox or Safari, this ‘Stalled’ time doesn’t exist:
And as a result, the page takes only 4s to load in Firefox!
Can you please share more details? For instance, are the client and server geographically in the same location or country?
Also, I don’t think it’s an issue to load resources over HTTPS. In fact, it’s common and recommended for security to load as many resources as possible over SSL (to avoid injection). Gmail, for example, does the same.
I also notice that all JS assets seem to be fingerprinted (someone from the dev team can confirm this), and hence the cache time for these resources should ideally be set to a large value (30 days / 1 year), since any change to the resource changes its fingerprint anyway. But from my analysis, it seems max-age is set to 1. If we change this, we should see an improvement in loading time.
We should also fingerprint image assets, and then set their cache times to longer periods. Right now icons / images don’t seem to be fingerprinted, and again max-age=1 (second!).
Read more here:
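On the Apache side, here is a sketch of what longer cache lifetimes could look like with mod_expires (the match patterns and durations are assumptions to adapt, and this is only safe for assets that are actually fingerprinted):

```apache
# Long cache lifetimes are safe only for fingerprinted assets:
# the filename changes whenever the content does, so stale caches
# can never serve an outdated version.
<IfModule mod_expires.c>
    ExpiresActive On
    # Fingerprinted JS/CSS: cache for a year
    <FilesMatch "\.(js|css)$">
        ExpiresDefault "access plus 1 year"
    </FilesMatch>
    # Fingerprinted images: cache for 30 days
    <FilesMatch "\.(png|jpe?g|gif|svg|ico)$">
        ExpiresDefault "access plus 30 days"
    </FilesMatch>
</IfModule>
```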
I guess Bahmni dev team can respond better on this.
I am not an expert in this field I must say.
However, what worries me is that this ‘Orange and Purple’ time (whatever it really is) exists for every resource, while on Gmail (or Facebook) the resources just don’t have this phase at all (or only once or twice).
Sure. I am currently in Cambodia, with a stable 10Mb/s connection, and the server is located on the US West coast.
Ping to the server is around 245ms.
I know this is not great.
The performance improves when relocating the server to Singapore.
Ping to the server in Singapore is 90ms, and the same page takes 5s to load. Still not good.
For info, the patient search page in Bahmni /bahmni/clinical/index.html#/default/patient/search takes 23s to load from the server in Singapore, and more than 40s from the US server.
Thanks for your research. This is interesting (Orange and Purple for each Bahmni call).
I did some more research. Turns out that Orange/Purple represents Connection Initiation & SSL Handshake time for a resource. More details here: https://developers.google.com/web/tools/chrome-devtools/profile/network-performance/understanding-resource-timing?hl=en#viewing-in-devtools
(Yes, the color in the image is shown as orange instead of purple; it’s a mistake in the documentation.)
Then I researched why Bahmni shows this color for each resource, and came across these two links:
When I try the command suggested by the first link on a Bahmni resource, it turns out KeepAlive is OFF. See the second line for the Bahmni URL: the tcp values are not 0 (like they are for Facebook).
This means that for each resource a new connection is being initiated (since the previous connection isn’t kept alive)
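The mechanism can be demonstrated locally with a quick Python sketch (nothing Bahmni-specific here; it just shows what connection reuse means when keep-alive is enabled):

```python
# Minimal sketch of HTTP keep-alive: with HTTP/1.1 (keep-alive on by
# default), several requests travel over ONE TCP connection, so only the
# first request pays the connection-setup cost.
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # enables keep-alive on the server side
    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):
        pass  # silence request logging

server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
conn.request("GET", "/a")
conn.getresponse().read()
sock1 = conn.sock          # socket used for the first request
conn.request("GET", "/b")  # no new TCP handshake: connection is reused
conn.getresponse().read()
sock2 = conn.sock

print("connection reused:", sock1 is sock2)  # True when keep-alive works
server.shutdown()
```

With `KeepAlive Off`, the server closes the connection after each response, and every resource repeats the TCP (and, over HTTPS, SSL) handshake from scratch.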
Then I realized that the Apache httpd.conf has KeepAlive Off (as mentioned in the article, this is the default on CentOS).
So, I did the following:
Made KeepAlive On in the Apache config file:
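For reference, the change looks roughly like this (on CentOS the file is typically /etc/httpd/conf/httpd.conf; verify the path on your install):

```apache
# Was: KeepAlive Off (the CentOS default)
KeepAlive On
```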
`sudo service httpd restart`
Now, I don’t see the Orange & Purple for each resource!
BEFORE (KeepAlive Off)
AFTER (KeepAlive On)
Can you try this out?
In Summary, I think Bahmni Team should do the following:
- Change KeepAlive to ON in their installation for Apache httpd server.
- Fix the Cache-Control header for JS resources to 30 days at least.
- Fingerprint images, and fix the cache header for those to 30 days at least.
- Check other static resources similarly.
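For the fingerprinting item, here is a minimal sketch of the idea (the naming scheme is hypothetical, not what Bahmni’s build actually does):

```python
# Hypothetical asset-fingerprinting sketch: embed a hash of the file's
# content in its name, so the name changes whenever the content does and
# a long Cache-Control max-age becomes safe.
import hashlib
import pathlib
import tempfile

def fingerprint(path: pathlib.Path) -> str:
    """Return e.g. 'logo.<8-hex-digest>.png' for the given file."""
    digest = hashlib.md5(path.read_bytes()).hexdigest()[:8]
    return f"{path.stem}.{digest}{path.suffix}"

# Demo on a throwaway file
asset = pathlib.Path(tempfile.mkdtemp()) / "logo.png"
asset.write_bytes(b"fake image bytes")
print(fingerprint(asset))
asset.write_bytes(b"changed image bytes")
print(fingerprint(asset))  # different name, so browsers fetch the new file
```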
I’ve tried it and the loading time is way better now!
Reloading the page /bahmni/clinical/index.html#/default/patient/search now takes 8 to 10 seconds, which is about 25% of the time it took before KeepAlive On. Thanks @gsluthra!
Now, I notice that loading the page for the first time takes slightly more time than reloading it. I guess this is because of the KeepAliveTimeout parameter, which is set to 15s, plus the SSL handshake needed for the first resource.
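The relevant httpd.conf directives look roughly like this (15 is the value observed here; MaxKeepAliveRequests 100 is Apache’s documented default, shown only for context):

```apache
# Idle keep-alive connections are closed after this many seconds; the
# first request after that pays a fresh TCP + SSL handshake again
KeepAliveTimeout 15
# Upper bound on requests served over one connection (0 = unlimited)
MaxKeepAliveRequests 100
```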
Here is more info:
### Loading time with Chrome:
### Loading time with Firefox:
All is much better now!
In my original message, I also mentioned that there is a ‘Stalled’ time spent in Chrome during which nothing happens, and that this doesn’t happen in Firefox/Safari.
Now, I am not sure whether Firefox and Safari simply include this time in the ‘Connecting’ and ‘Waiting’ durations and therefore don’t show it in their consoles, but in the end, the loading time is still longer in Chrome.
We can switch back to KeepAlive Off to accentuate the problem and try to find out what happens.
Loading time for the first resource (the index.htm document only, not the full page):
- Chrome: 1.5s
- Firefox/Safari: 1.1s
Any idea what that could be?
I created a Mingle card to track the items we identified. Here is the link:
I believe that if we fix the cache headers on the static assets, browsers will start caching more resources, and overall load times should improve.
Someone from the team will need to investigate your final question on the stalling / load time difference between Chrome and Firefox. But for now we only support Chrome (and maybe Firefox is fast because it skips some steps that Chrome doesn’t).
I’ve done work in the past on international websites and wanted to share data around how distance plays a role in latency and speed.
It’s not an exact science, and there are many things that can impact performance, but it may help when communicating why things are slower over larger distances.
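The effect of distance can be sketched with simple arithmetic, using the ping times reported earlier in this thread (the 3-RTT setup cost is a rough assumption, not a measurement):

```python
# Back-of-envelope sketch of why distance dominates when connections
# aren't reused. Assumption: ~1 round trip for the TCP handshake and
# ~2 round trips for a full TLS handshake, so ~3 RTTs per new connection.
def setup_cost(rtt_s, handshake_rtts=3):
    """Seconds spent on connection setup before any data is transferred."""
    return rtt_s * handshake_rtts

rtt_us_west = 0.245    # ping from Cambodia to US West (from this thread)
rtt_singapore = 0.090  # ping from Cambodia to Singapore (from this thread)

print(f"US West:   ~{setup_cost(rtt_us_west) * 1000:.0f} ms per new connection")
print(f"Singapore: ~{setup_cost(rtt_singapore) * 1000:.0f} ms per new connection")
```

With KeepAlive off, that setup cost is paid per resource, which is roughly consistent with the per-resource ‘Orange and Purple’ time observed above.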
Thanks, guys, for looking into this and for being very responsive!
@tomgriffin, the production server will be located in Singapore. From there, the performance is much better.