Hi everyone,
There is interest in building a shared/unified solution for the ETL/Analytics needs of OpenMRS (e.g., see this post by @burke, the thread following it, and the two meetings following it).
As some of you know, I have been looking into building this solution on top of FHIR. I wrote this doc and shared it with some folks last week for early feedback. Now I would like to hear from everyone. I find the doc format easier for commenting on minor details, but please feel free to use this thread for higher-level discussions.
Here is a copy of the list of pros/cons as I understand them; this is not a replacement for reading everyone’s comments and other details in the doc (so please do read that doc if you are interested in this topic):
Pros:
- The main benefit of using FHIR for analytics is standardization. This makes it easier to integrate OpenMRS data with other systems that can speak FHIR.
- Another side effect of standardization is that data scientists do not need to understand the OpenMRS data model. In general, to work on OpenMRS analytics, one only needs to understand FHIR, which is well documented.
- Again as a standardization side effect, analytics tools developed to work off of FHIR can be applied to OpenMRS analytics workloads without too much effort [with caveats].
- To be able to do analytics on FHIR, we need to develop pipelines that translate OpenMRS changes to FHIR resources (both in batch and streaming modes). These pipelines are useful beyond the analytics use cases. For example, if we want to export OpenMRS data to a FHIR store (e.g., for a Shared Health Record system), we can leverage the same pipelines and mechanisms built for Analytics on FHIR.
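To make the last point a bit more concrete, here is a minimal sketch of the kind of translation step such a pipeline needs. It is only an illustration: the input row layout (`uuid`, `patient_uuid`, `concept_system`, `concept_code`, `value_numeric`, `unit`, `obs_datetime`) is a simplified, hypothetical stand-in for OpenMRS data, and the function names are made up for this example. The output uses standard FHIR Observation fields.

```python
from datetime import datetime, timezone


def obs_row_to_fhir_observation(row: dict) -> dict:
    """Map one simplified, hypothetical obs row to a FHIR Observation dict."""
    return {
        "resourceType": "Observation",
        "id": row["uuid"],
        # Assumption: every exported observation is treated as final.
        "status": "final",
        "code": {
            "coding": [
                {"system": row["concept_system"], "code": row["concept_code"]}
            ]
        },
        "subject": {"reference": f"Patient/{row['patient_uuid']}"},
        "effectiveDateTime": row["obs_datetime"].isoformat(),
        "valueQuantity": {"value": row["value_numeric"], "unit": row["unit"]},
    }


def to_fhir_batch(rows: list) -> list:
    """Batch mode: convert many rows at once with the same per-row mapping."""
    return [obs_row_to_fhir_observation(r) for r in rows]


# Hypothetical example row; the identifiers are placeholders.
example_row = {
    "uuid": "obs-1",
    "patient_uuid": "patient-1",
    "concept_system": "http://loinc.org",
    "concept_code": "8867-4",
    "value_numeric": 72,
    "unit": "beats/min",
    "obs_datetime": datetime(2020, 5, 1, 10, 30, tzinfo=timezone.utc),
}

resource = obs_row_to_fhir_observation(example_row)
```

The same per-row function can serve both modes: a streaming pipeline would call it for each change event, while a batch pipeline would apply it to a full table scan.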
Cons:
- The main disadvantage is more complexity. It is true that FHIR is well documented, but it is still a huge standard and OpenMRS only uses a tiny portion of it.
- Another angle to the complexity issue is more complex queries because of the presence of ARRAY and STRUCT column types.
- There are already analytics solutions in the OpenMRS community and none of them are based on FHIR (AFAIK). Note that this is not entirely a disadvantage of using FHIR: our goal is to develop a unified solution for OpenMRS Analytics, so we need to unify those custom schemas anyway. The benefit of FHIR is that we simply rely on a standard as the unifying schema.
- There is definitely extra overhead in converting the OpenMRS data model into FHIR. This can make the batch pipeline for exporting OpenMRS data to the data warehouse especially expensive.
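To illustrate the point above about ARRAY and STRUCT columns, here is a small sketch in plain Python (not tied to any particular warehouse or query engine). In a FHIR Observation, `code` is a struct that holds an array of `coding` structs, so even a simple "average value for a given code" question needs an explode/flatten step first. The sample data, the local code system URL, and the helper names are hypothetical.

```python
# Three hypothetical FHIR-shaped Observations; codes are illustrative only.
observations = [
    {
        "resourceType": "Observation",
        "subject": {"reference": "Patient/p1"},
        "code": {"coding": [
            {"system": "http://loinc.org", "code": "8867-4"},
            {"system": "https://example.org/local", "code": "5087"},
        ]},
        "valueQuantity": {"value": 72, "unit": "beats/min"},
    },
    {
        "resourceType": "Observation",
        "subject": {"reference": "Patient/p2"},
        "code": {"coding": [{"system": "http://loinc.org", "code": "8867-4"}]},
        "valueQuantity": {"value": 80, "unit": "beats/min"},
    },
    {
        "resourceType": "Observation",
        "subject": {"reference": "Patient/p1"},
        "code": {"coding": [{"system": "http://loinc.org", "code": "8310-5"}]},
        "valueQuantity": {"value": 37.0, "unit": "Cel"},
    },
]


def flatten_codings(obs_list):
    """Explode code.coding (an array of structs) into one flat row per coding."""
    rows = []
    for obs in obs_list:
        for coding in obs.get("code", {}).get("coding", []):
            rows.append({
                "patient": obs["subject"]["reference"],
                "system": coding["system"],
                "code": coding["code"],
                "value": obs["valueQuantity"]["value"],
            })
    return rows


def mean_value_for_code(obs_list, system, code):
    """Average the numeric value over observations carrying the given coding."""
    values = [r["value"] for r in flatten_codings(obs_list)
              if r["system"] == system and r["code"] == code]
    return sum(values) / len(values) if values else None


heart_rate_mean = mean_value_for_code(observations, "http://loinc.org", "8867-4")
```

With a flat custom schema the same question would be a single filter and aggregate over one table, which is the extra query complexity this con refers to.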
Adding a few people who have commented or shown interest in this work before (as an FYI): @akimaina, @aojwang, @burke, @ccwhite23, @dkayiwa, @grace, @ibacher, @jennifer, @mksd, @mseaton, @pmanko, @wyclif