Case Study: Big Data Implementation for OpenMRS

Proposed Big Data Implementation for CHITS
Project of the National Telehealth Center, Philippines

Problem
The current implementation of CHITS relies heavily on the obs (observation) table to store all patient data, mapping each value to the concept dictionary. Given the database design of the observation table, searching for a record in it performs at O(n): a table scan has to inspect every record, so the cost scales with the size of the table.
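To make the table-scan point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `obs` table below is a simplified, hypothetical stand-in (not the real OpenMRS schema), and the example assumes the searched column has no usable index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified, hypothetical stand-in for the obs table (not the real OpenMRS schema).
conn.execute(
    "CREATE TABLE obs (obs_id INTEGER PRIMARY KEY, person_id INTEGER, "
    "concept_id INTEGER, value_text TEXT)"
)
conn.executemany(
    "INSERT INTO obs (person_id, concept_id, value_text) VALUES (?, ?, ?)",
    [(i % 100, i % 17, f"value-{i}") for i in range(1000)],
)

# Ask SQLite how it would execute a search on a column that has no index.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM obs WHERE value_text = ?", ("value-42",)
).fetchall()

# The last column of each plan row is a human-readable description of the step.
plan_details = [row[3] for row in plan]
print(plan_details)
```

On a setup like this, the reported plan is a scan of the whole table, which is the O(n) behaviour described above. Other database engines report their plans differently.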

So for 1,000 visits in a month with an average of 17 fields per form, a specific health center will generate 119K+ rows. Multiplied by the number of health centers one city services (72), this comes to around 8.5M rows per month.

Proposed Solution
Since the current design grows the database vertically, with the row count multiplying by the number of fields on every form, we want to minimize that growth by growing it horizontally instead, applying ideas and concepts (not the actual technology) used in NoSQL and NewSQL.

Applying the horizontal approach means spreading the form fields into their own separate tables. Again, the goal of the horizontal approach is to minimize the vertical growth of the database.
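As a rough illustration of the difference between the two layouts, the sketch below stores the same visits both ways and counts the rows. The table names, column names, and concept IDs are all invented for this example and are not taken from CHITS or OpenMRS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Vertical layout: one row per (visit, field), in the style of an obs table.
conn.execute(
    "CREATE TABLE obs_vertical (visit_id INTEGER, concept_id INTEGER, value TEXT)"
)

# Horizontal layout: one row per visit, one column per form field.
# The form and its column names are hypothetical.
conn.execute(
    "CREATE TABLE form_vitals (visit_id INTEGER PRIMARY KEY, "
    "weight_kg TEXT, height_cm TEXT, temperature_c TEXT, pulse_bpm TEXT)"
)

# Hypothetical concept IDs mapped to the values captured on one form.
fields = {101: "60", 102: "165", 103: "36.8", 104: "72"}

visits = 3
for visit_id in range(1, visits + 1):
    # Vertical: every field on the form becomes its own row.
    conn.executemany(
        "INSERT INTO obs_vertical VALUES (?, ?, ?)",
        [(visit_id, concept_id, value) for concept_id, value in fields.items()],
    )
    # Horizontal: the whole form becomes a single row.
    conn.execute(
        "INSERT INTO form_vitals VALUES (?, ?, ?, ?, ?)",
        (visit_id, *fields.values()),
    )

vertical_rows = conn.execute("SELECT COUNT(*) FROM obs_vertical").fetchone()[0]
horizontal_rows = conn.execute("SELECT COUNT(*) FROM form_vitals").fetchone()[0]
print(vertical_rows, horizontal_rows)  # 12 rows versus 3 rows
```

With 3 visits and 4 fields, the vertical table holds visits × fields rows while the horizontal table holds one row per visit, so the number of fields no longer enters the row count.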

To compare the vertical growth over six months, we ran the simulation again on the same example data (see the link below for the full report), comparing the OpenMRS obs table with the proposed solution. As observed, we were able to limit the vertical growth of the database by removing the number of fields from the computation, since the fields are already spread horizontally.

The bulk of the work now lies in querying the data: the SQL statements become more complex because they need to access several tables instead of a single one.
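A small sketch of what that extra query complexity can look like: once fields live in separate per-form tables, a question that spans two forms needs a join. The two tables and their columns here are hypothetical, chosen only to illustrate the shape of the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two hypothetical per-form tables, each keyed by the visit they belong to.
conn.executescript("""
    CREATE TABLE form_vitals (
        visit_id INTEGER PRIMARY KEY, weight_kg REAL, temperature_c REAL
    );
    CREATE TABLE form_consult (
        visit_id INTEGER PRIMARY KEY, diagnosis TEXT
    );
    INSERT INTO form_vitals VALUES (1, 60.0, 36.8), (2, 72.5, 38.9);
    INSERT INTO form_consult VALUES (1, 'checkup'), (2, 'fever');
""")

# Combining fields from both forms requires joining the tables on visit_id.
rows = conn.execute("""
    SELECT v.visit_id, v.temperature_c, c.diagnosis
    FROM form_vitals AS v
    JOIN form_consult AS c ON c.visit_id = v.visit_id
    WHERE v.temperature_c > 38.0
""").fetchall()
print(rows)  # [(2, 38.9, 'fever')]
```

Each additional form involved in a question adds another join, which is the trade-off for keeping the individual tables short.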

The OpenMRS obs table relies on the concept dictionary to define the unique concepts for both questions and answers. In CHITS, it is used to formalize all the concepts so that interoperability functionality can be applied easily.

Applying the horizontal approach to obs will have an impact on the concepts, as they will become the column names of the new tables that will be created. To manage the concept mapping, a lookup table will be created to map each column label to its corresponding concept ID.
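One possible shape for such a lookup table is sketched below. The table name `concept_column_map`, its columns, the helper `concept_id_for`, and the concept IDs are all assumptions made for illustration; the actual design is described in the linked paper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical lookup table: one row per (table, column) pair, pointing at
# the concept that the column represents in the concept dictionary.
conn.executescript("""
    CREATE TABLE concept_column_map (
        table_name  TEXT NOT NULL,
        column_name TEXT NOT NULL,
        concept_id  INTEGER NOT NULL,
        PRIMARY KEY (table_name, column_name)
    );
    INSERT INTO concept_column_map VALUES
        ('form_vitals', 'weight_kg', 101),
        ('form_vitals', 'temperature_c', 103);
""")


def concept_id_for(table_name, column_name):
    """Return the concept ID mapped to a column, or None if it is unmapped."""
    row = conn.execute(
        "SELECT concept_id FROM concept_column_map "
        "WHERE table_name = ? AND column_name = ?",
        (table_name, column_name),
    ).fetchone()
    return row[0] if row else None


print(concept_id_for("form_vitals", "weight_kg"))  # 101
```

Keeping the mapping in its own table means the column labels can stay readable while interoperability code still resolves them to concept IDs.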


Read the complete paper here: http://bit.ly/CHITSBigData

We appreciate any feedback and any questions on our proposed solution.

Thanks @rvregalado for sharing! :slight_smile:

Is this proposal a model for reporting purposes only? That is, would you extract data from the transactional database into this new one so that reporting and analysis tools can run off it?

If not, in other words if you are proposing to use the same structure for both, did you get a chance to evaluate the effect on the API that saves, deletes, and queries observations?

Did you build some sort of prototype or proof of concept, actually trying to implement this to get a feel for the details involved, beyond just the proposal?

Will give more feedback after reading the paper that you have linked. :slight_smile:

Hi @dkayiwa thanks for the feedback.

This is ongoing work, so we appreciate any feedback on our proposed model and, moving forward, on how we can apply these big data technologies to the current OpenMRS framework.

The model is not only for reporting purposes but for the entire process of capturing, saving and retrieving data. We already made a proof of concept of the proposed model.

Our next steps are as follows:
1] Build a module in the current OpenMRS structure
2] Build a module following the proposed model
3] Compare the response times of the two modules through a database growth simulation.

We’ll try to come up with a report that details our implementation.

@rvregalado This is very interesting to me because we are considering OpenMRS or Bahmni for a very large hospital that will eventually serve over 10,000 patients per day in person, and possibly another 10,000 per day via telehealth. Did you ever come up with a report that detailed your implementation?
