Proposed Big Data Implementation for CHITS
Project of the National Telehealth Center, Philippines
Problem
The current implementation of CHITS relies heavily on the observation (obs) table to store all patient data, mapping each value to the concept dictionary. Given the database design of the observation table, searching for data in it performs at O(n): a table scan has to inspect every record, so the cost scales with the size of the table.
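The table-scan behaviour can be observed on a small stand-in table. This is a minimal sketch using SQLite from the Python standard library, not the CHITS database itself. The three-column obs table is a simplified assumption, and the exact wording of the plan differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified, hypothetical stand-in for an obs-style table (not the real schema).
conn.execute(
    "CREATE TABLE obs ("
    "obs_id INTEGER PRIMARY KEY, concept_id INTEGER, value_text TEXT)"
)

# Ask SQLite how it would run a search on a column that has no index.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM obs WHERE value_text = ?", ("fever",)
).fetchall()

# The last column of each plan row is the human-readable step description.
plan_detail = plan[0][-1]
print(plan_detail)  # the description reports a full SCAN of obs
```

A plan that reports a scan means every row is visited, which is the O(n) behaviour described above.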
So for 1,000 visits in a month with an average of 17 fields per form, a single health center will generate 119K+ rows. Multiply this by the number of health centers that one city serves (72 health centers) and the result is around 8.5M rows per month.
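The city-wide estimate is a simple multiplication. The sketch below takes the per-center figure of 119K rows and the 72 health centers from the text as given. The variable names are illustrative only.

```python
# Figures taken from the text above; the names are illustrative.
rows_per_center_per_month = 119_000   # "119K+" rows for one health center
health_centers_in_city = 72

rows_per_city_per_month = rows_per_center_per_month * health_centers_in_city
print(f"{rows_per_city_per_month:,} rows per month")  # 8,568,000 rows per month
```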
Proposed Solution
Because the current design grows the database vertically, adding one row for every field of every form, we want to minimize that growth by growing the database horizontally instead, applying ideas and concepts (not the actual technologies) used in NoSQL and NewSQL.
Applying the horizontal approach means spreading the form fields into their own separate tables, where they are stored as columns rather than as rows. Again, the goal of the horizontal approach is to minimize the vertical growth of the database.
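The difference between the two layouts can be sketched with SQLite from the Python standard library. Everything named here is an assumption for illustration: the simplified obs table, the form_vitals table, its columns, and the concept IDs are not taken from CHITS or OpenMRS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified stand-in for an obs-style table: one row per form field.
    CREATE TABLE obs (
        obs_id       INTEGER PRIMARY KEY,
        encounter_id INTEGER NOT NULL,
        concept_id   INTEGER NOT NULL,
        value_text   TEXT
    );
    -- Hypothetical per-form table: one row per visit, one column per field.
    CREATE TABLE form_vitals (
        encounter_id  INTEGER PRIMARY KEY,
        weight_kg     REAL,
        height_cm     REAL,
        temperature_c REAL
    );
""")

# One visit with three fields; the concept IDs 101-103 are made up.
fields = {101: "60.5", 102: "165.0", 103: "36.8"}
conn.executemany(
    "INSERT INTO obs (encounter_id, concept_id, value_text) VALUES (1, ?, ?)",
    list(fields.items()),
)
conn.execute("INSERT INTO form_vitals VALUES (1, 60.5, 165.0, 36.8)")

obs_rows = conn.execute("SELECT COUNT(*) FROM obs").fetchone()[0]
wide_rows = conn.execute("SELECT COUNT(*) FROM form_vitals").fetchone()[0]
print(obs_rows, wide_rows)  # 3 1
```

The same visit costs three rows in the vertical layout and one row in the horizontal layout, so the number of fields no longer multiplies the row count.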
To compare vertical growth over six months, we ran the simulation again on the same example data (see the link below for the full report), comparing the OpenMRS obs table with the proposed solution. As observed, we were able to limit the growth of the database by removing the number of fields from the computation, since the fields are now spread horizontally.
The bulk of the work now shifts to querying the data: the SQL statements become more complex because they need to access several tables instead of a single one.
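As a rough illustration of the extra query complexity, the sketch below joins two per-form tables to assemble one view of each visit. The table names, columns, and sample values are hypothetical. With a single obs table the same data would come from one table filtered by concept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two hypothetical per-form tables keyed by the same visit.
    CREATE TABLE form_vitals (
        encounter_id INTEGER PRIMARY KEY,
        weight_kg    REAL
    );
    CREATE TABLE form_prenatal (
        encounter_id    INTEGER PRIMARY KEY,
        gestation_weeks INTEGER
    );
    INSERT INTO form_vitals VALUES (1, 60.5), (2, 72.0);
    INSERT INTO form_prenatal VALUES (1, 24);
""")

# A LEFT JOIN keeps visits that have vitals but no prenatal form.
rows = conn.execute("""
    SELECT v.encounter_id, v.weight_kg, p.gestation_weeks
    FROM form_vitals AS v
    LEFT JOIN form_prenatal AS p ON p.encounter_id = v.encounter_id
    ORDER BY v.encounter_id
""").fetchall()
print(rows)  # [(1, 60.5, 24), (2, 72.0, None)]
```

Each additional form involved in a report adds another join, which is where most of the query-writing effort is expected to go.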
The OpenMRS obs table relies on the concept dictionary to define the unique concepts for both questions and answers. In CHITS, the dictionary is used to formalize all concepts so that interoperability features can be applied easily.
Applying the horizontal approach to obs will affect the concepts, as they will become the column names of the new tables. To manage the concept mapping, a lookup table will be created that maps each column label to its corresponding concept ID.
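One possible shape for such a lookup table is sketched below. The name concept_column_map, its columns, the helper concept_id_for, and the concept IDs are all assumptions made for illustration, not the design from the full paper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical lookup table: one row per (table, column) pair.
    CREATE TABLE concept_column_map (
        table_name  TEXT NOT NULL,
        column_name TEXT NOT NULL,
        concept_id  INTEGER NOT NULL,
        PRIMARY KEY (table_name, column_name)
    );
    -- Made-up concept IDs for two columns of a made-up form table.
    INSERT INTO concept_column_map VALUES
        ('form_vitals', 'weight_kg', 101),
        ('form_vitals', 'height_cm', 102);
""")

def concept_id_for(table_name, column_name):
    """Return the concept ID mapped to a column, or None if unmapped."""
    row = conn.execute(
        "SELECT concept_id FROM concept_column_map "
        "WHERE table_name = ? AND column_name = ?",
        (table_name, column_name),
    ).fetchone()
    return row[0] if row else None

print(concept_id_for("form_vitals", "weight_kg"))  # 101
```

Keeping the mapping in a table rather than in code means a new form only needs new rows in the lookup table to stay linked to the concept dictionary.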
Read the complete paper here: http://bit.ly/CHITSBigData
We appreciate any feedback or questions on our proposed solution.