As part of the OpenHIE initiative we are trying to create a Shared Health Record out of OpenMRS. Basically what we need to do is break down CDA documents and create obs out of them. This becomes very obs insert intensive. Currently 50% of documents insert in <800ms but our 90th percentile is around 3s and the longest request is around 6s. We are trying to improve our performance, preferably to around the 200 - 500ms range.
We seem to be database constrained, based on the profiling we have done so far. Here are the queries taking the most time with our performance improvements applied:
We seem to be severely insert constrained. We have tried a few things to get to this point; these include the following, in order of biggest performance gain:
Set GP search.caseSensitiveDatabaseStringComparison to false
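For reference, that global property can be flipped with a one-line update against the OpenMRS `global_property` table (table and property names as in the standard OpenMRS schema; adjust if your install differs, and restart or flush caches so the setting takes effect):

```sql
-- Hedged sketch: switch OpenMRS string comparisons to the database's
-- (case-insensitive) collation instead of forcing BINARY comparisons.
UPDATE global_property
   SET property_value = 'false'
 WHERE property = 'search.caseSensitiveDatabaseStringComparison';
```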
We have also tried a few things that had no effect for us:
Try to get rid of the CIEL concepts we don’t need and see if that makes a difference - no difference
Try MariaDB - no difference
Try MariaDB using the TokuDB engine - no difference
Our next step is to try to make sure each document insert happens in one big transaction, but after that we are out of ideas. We aren’t SQL performance tuning experts, so we wanted to ask: does anyone have any other ideas for what we can try next?
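To sketch the one-transaction-per-document idea: wrap all of a document’s obs in a single transaction and use multi-row INSERTs so MySQL pays the per-statement overhead once. (Column list abbreviated and values illustrative; the real obs table has more required columns.)

```sql
-- Sketch: insert all obs for one CDA document in a single transaction,
-- batching rows into one multi-row INSERT statement.
START TRANSACTION;
INSERT INTO obs
  (person_id, concept_id, encounter_id, obs_datetime,
   value_numeric, creator, date_created, uuid)
VALUES
  (123, 5089, 456, NOW(), 70.5, 1, NOW(), UUID()),
  (123, 5090, 456, NOW(), 172.0, 1, NOW(), UUID());
COMMIT;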
A couple of ideas. Is the cardinality of the obs table very high? At some point (even with state of the art databases) you must partition very big tables to get decent performance. I don’t know if MySQL supports partitioned tables though.
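For what it’s worth, MySQL has supported table partitioning since 5.1. A hedged sketch of what that could look like for obs (MySQL requires the partitioning column to appear in every unique key, so hashing on the `obs_id` primary key is the path of least resistance; note that MySQL does not allow foreign keys on partitioned tables, which may rule this out for a stock OpenMRS schema without first dropping the obs table’s FK constraints):

```sql
-- Sketch: split obs into 8 hash partitions on its primary key.
-- Caveat: MySQL forbids foreign keys on partitioned InnoDB tables,
-- so the obs table's FK constraints would have to be dropped first.
ALTER TABLE obs
  PARTITION BY HASH (obs_id)
  PARTITIONS 8;
```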
Are you loading thousands of obs? Try dropping the indexes on the obs table, performing the load, and recreating them at the end. Depending on the number of obs before and after the load, this can sometimes be slower.
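The drop-load-recreate pattern looks roughly like this (the index name below is hypothetical; list the real ones with `SHOW INDEX FROM obs` first):

```sql
-- Sketch: remove a secondary index before a bulk load, recreate it after.
SHOW INDEX FROM obs;                          -- find the real index names
ALTER TABLE obs DROP INDEX obs_concept_idx;   -- hypothetical index name
-- ... perform the bulk load here ...
ALTER TABLE obs ADD INDEX obs_concept_idx (concept_id);
```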
Using one big transaction won’t make it faster, because the operations still have to be stored in the rollback segment. But smaller ones (loading the obs in chunks) will probably help. The drawback is that if one insert fails you lose the whole chunk.
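The chunked pattern is simply a commit every few hundred rows instead of one giant transaction, which keeps rollback-segment growth bounded (the chunk size of ~500 is an illustrative starting point, not a tuned value):

```sql
-- Sketch: commit in chunks of ~500 rows rather than one huge transaction.
START TRANSACTION;
-- ... first ~500 obs INSERT rows ...
COMMIT;

START TRANSACTION;
-- ... next ~500 obs INSERT rows ...
COMMIT;
```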
Out of curiosity, are you using the latest version of MySQL? (PIH got an absolutely massive speedup when upgrading Mirebalais from MySQL 5.5 to 5.6, though that was for many-outer-joined reporting queries.)
@rcrichton, for insert-intensive work, disabling indexes might be a good idea. InnoDB is infamous for updating its indexes on every insert, which is why bulk imports using transactions or LOAD DATA INFILE techniques run much faster. If you can wait a bit, write the rows to a file and then do LOAD DATA INFILE; you will get better insert performance. But if writing to the file is just as slow, or you need live/real-time data, then this doesn’t help.
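A minimal LOAD DATA INFILE sketch (the file path and column list are illustrative, and the server’s `secure_file_priv` setting may restrict where it is allowed to read files from):

```sql
-- Sketch: bulk-load pre-extracted obs rows from a tab-separated file.
LOAD DATA INFILE '/tmp/obs_batch.tsv'
  INTO TABLE obs
  FIELDS TERMINATED BY '\t'
  (person_id, concept_id, encounter_id, obs_datetime,
   value_numeric, creator, date_created, uuid);
```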
On the hardware front, there might be a surprise hidden in SSD drives. Although you might expect them to improve performance, cheaper SSD controllers perform poorly under database-heavy write loads.
Thanks for the tips, all. Just to note that what we are trying to achieve is good insert performance for a transactional system, so I don’t think we will be able to disable indexes. This isn’t a one-time import but load that the system should expect on a day-to-day basis.
Partitioning tables is an interesting idea that we could try; I don’t have much experience with that.
I did my benchmarking on MySQL 5.5.43, so perhaps I should try again with 5.6. It sounds like there may be some good improvements to be had.
Thanks for the advice; if anyone has anything more, keep the comments flowing!