Designing the OpenMRS OHDSI module

I wanted to share with you some work that @maurya and I have been doing on connecting OpenMRS with OHDSI: http://www.ohdsi.org/. OHDSI is a collaborative that aims to bring out the value of observational health data through large-scale analytics. In practice, it's a way of using a common data model (CDM) to perform analytics on OpenMRS clinical data… so yes, a bit similar to DHIS2. You can find more about OHDSI on:

Now, to get OpenMRS working with OHDSI, we need to export OpenMRS data into a CDM-compliant PostgreSQL database, where the analytics are run. We’ve already hacked our way into supporting this for Achilles (an OHDSI app). A demo of this work can be seen at: http://45.55.236.234:8080/AchillesWeb-master/#/SAMPLE/dashboard

We’re currently trying to transform our “hack” into a more production-worthy application. For this purpose, we plan to build an OpenMRS module that can either:

  • connect to a remote CDM-compliant PostgreSQL database and post data into it, or
  • generate a dump file that can be used to populate the CDM-compliant PostgreSQL database.
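
One way to picture the two options is a small interface with one implementation per target. What follows is only a rough sketch, not existing module code: `CdmSink` and `DumpFileSink` are hypothetical names, and the `person` insert is deliberately simplified (the real CDM `person` table has more required columns). A JDBC-backed class for the remote database option could implement the same interface.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical abstraction: a destination for exported CDM rows.
interface CdmSink {
    void writePerson(long personId, int yearOfBirth);
}

// Dump-file option: collect INSERT statements to be replayed later.
class DumpFileSink implements CdmSink {
    private final List<String> statements = new ArrayList<>();

    @Override
    public void writePerson(long personId, int yearOfBirth) {
        // Only numeric values are written here; text values would need escaping.
        StringBuilder sql = new StringBuilder("INSERT INTO person (person_id, year_of_birth) VALUES (");
        sql.append(personId).append(", ").append(yearOfBirth).append(");");
        statements.add(sql.toString());
    }

    // Returns the dump contents, one statement per line.
    String dump() {
        return String.join("\n", statements);
    }
}
```

Keeping the export logic behind one interface would let the user pick the target without the extraction code knowing which one is in use.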

Initially, we didn’t want to run a resource-intensive process on the production system, and were in favor of performing a manual MySQL dump to move data out of the production db. But having a module significantly reduces the complexity, and is a cleaner way to do things.

But anyway, happy to hear any suggestions on our design as we plan to start building this module over the next few days…

Suranga, if I have understood you correctly, you mean the module will export data into a dump file and upload it to some server, and then someone will be in charge of running the dump manually (or maybe create a cron job that checks for new files in some folder and runs them)?

If that is correct, I think the module won’t be so helpful, as one would also need to perform several other setup steps external to OpenMRS to make things run. Why not create an entirely separate application that connects to the OpenMRS database, retrieves changes, and updates the PostgreSQL database periodically? Wouldn’t it be nice to configure one system and let it handle the whole thing?

Hi @willa,

Well, the dump file would be a secondary option. Our primary attempt would be to post data directly into a CDM-compliant db that is already set up and waiting for data. Another advantage of having this setup is that we can allow users to export data based on what tables they want to populate, and what OHDSI apps they want to use. So in a way, we can ask users for simple input, and perform complicated tasks under the hood.

It looks like you need some kind of ETL tool. I’ve heard good things about Talend (open source), but I don’t know if it can be integrated into a Java webapp.

Hi there,

Sorry, this response is way overdue. So OHDSI does provide an in-house ETL mapping tool named WhiteRabbit. Unfortunately, all it does is give users an easier way to map two schemas together. We’ve used it, and it only outputs a Word document describing how the mapping should be carried out. Right now, we’ve been trying to do the mappings ourselves using this document. We haven’t really looked at any other tool such as Talend…

@surangak or others,

Has any further work happened on the OHDSI module? I’m interested in playing around with this and maybe contributing a bit.

Hi @darius, I blush to say that we haven’t done any work on this module as of late. We’ve been meaning to get back to it eventually, but with the amount of work on everyone’s plates, it’s just been too hard. But if we can’t work on it any more, then I plan to make it a GSoC project for 2016…

Right now, if you want to look into contributing / organizing contributions, you should check out https://wiki.openmrs.org/display/projects/OpenMRS+Support+for+the+OHDSI+Project

As you’ll see, we currently populate the OHDSI person, provider, observation_period, visit_occurrence and condition_occurrence tables. (Refer to the image on the wiki page for the logical flow of why these OHDSI tables were picked and populated.)

Potential contribution ideas for you:

  1. Check the code, and validate our modeling decisions.
  2. Try to make the module more configurable, and less hard coded.
  3. Add support for new OHDSI tables.

Mappings between the OpenMRS database and the OHDSI CDM schema were completed by experts during a face-to-face meeting. You can find them at: https://wiki.openmrs.org/download/attachments/84476300/OpenMRS%20ETL.docx?version=1&modificationDate=1436743360000&api=v2

I really appreciate seeing the work and design that has gone into this. OHDSI is potentially very exciting for OpenMRS, and looking at what you all have done so far really helps frame my thinking about this! And it saves me from some background reading.

Various thoughts:

It’s important to figure out the realistic use case for OpenMRS+OHDSI. My gut says that the target should be sophisticated implementations, who care a lot about research, have a lot of data, and are already managing multiple servers for analysis purposes. So I’m not convinced that focusing on a module is helpful, and I would lean towards creating an ETL process or feed-based process that runs outside of OpenMRS.

In particular, the “generate a dump file” use case seems like it might be helpful for playing around, but any real implementation that cares about OHDSI will have a dataset large enough that this will break.

(If you’re going to be using a module to generate the dumps, I recommend pulling data via the reporting module, rather than raw SQL queries written from scratch. For example, the download servlet doesn’t exclude voided rows in any of its queries. Tsk, tsk.)

BTW, an important use case will be for implementations that run multiple servers in different hospitals/clinics and want to aggregate their data into a central OHDSI instance, so whatever solution we choose is going to need to handle unreliable internet connections.
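
To make the unreliable-connection point concrete, here is a minimal sketch of one common approach, retrying a failed upload with exponential backoff. `RetryingUploader` and `withRetry` are hypothetical names, not part of any existing module, and the sketch assumes the upload is safe to repeat (for example, because rows are keyed so that a second attempt does not create duplicates).

```java
import java.util.concurrent.Callable;

class RetryingUploader {
    // Runs the task, waiting twice as long after each failure.
    // Rethrows the last failure once maxAttempts is exhausted.
    static <T> T withRetry(Callable<T> task, int maxAttempts, long initialDelayMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMs); // back off before the next attempt
                    delayMs *= 2;
                }
            }
        }
        throw last;
    }
}
```

A real deployment would probably also need to queue data locally while a site is offline for long periods, which this sketch does not attempt.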

This is gold!

It seems that you’re writing this to presume the CIEL dictionary (so far). I don’t know if that’s good enough, but at the very least, you should hardcode against metadata by name or uuid, not by its PK. (E.g. don’t code against concept_class with id=4, but rather with name=“Diagnosis”, or its UUID.)
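
A tiny illustration of why the primary key is fragile. This is a sketch only: `ConceptClassLookup` is a hypothetical helper, and the map stands in for the rows of the concept_class table on a given server (name to primary key). In a real module the lookup would go through the OpenMRS API rather than a map.

```java
import java.util.Map;

class ConceptClassLookup {
    static final String DIAGNOSIS_CLASS_NAME = "Diagnosis";

    // idByName stands in for concept_class on one server (name -> primary key).
    // The check works whatever primary key "Diagnosis" happens to have there.
    static boolean isDiagnosis(int conceptClassId, Map<String, Integer> idByName) {
        Integer diagnosisId = idByName.get(DIAGNOSIS_CLASS_NAME);
        return diagnosisId != null && diagnosisId == conceptClassId;
    }
}
```

On a server where “Diagnosis” has id 4, `isDiagnosis(4, ...)` is true; on a server where it was created with a different id, a hardcoded `== 4` would silently pick the wrong class, while the name-based check still works.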

Generally speaking, I feel like the work we’re going to have to do to “generalize” this is going to have very significant overlap with what we have to do to make the FHIR module work against a general OpenMRS implementation. So…might it make sense to have the conversion from OpenMRS to OHDSI actually go through FHIR?

We definitely can’t sidestep the need to map OpenMRS dictionaries to the OMOP vocabulary, and this is where we can have a communal effort to map CIEL. Non-CIEL implementations should be able to leverage this by mapping to CIEL (or else they’ll have to do the mappings themselves).

A few specific code comments:

  • Don’t build long strings by doing “string” + “concatenation”; always use a StringBuilder.
  • Get in the habit of writing much shorter methods.
  • Unit tests! TDD!
  • If you’re going to hardcode something, use a well-named constant for it. E.g. don’t ever do this, even in throwaway code: valueCodedClassCheck!=4 && valueCodedClassCheck!=12 && valueCodedClassCheck!=13
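
To illustrate the StringBuilder and named-constant points together, here is a rough sketch. `CodedObsQuery` is a hypothetical class. Only 4 is identified in this thread (as “Diagnosis”); the names given to 12 and 13 are assumptions based on the default OpenMRS concept_class data and should be checked against your own table. The ids are kept only to mirror the expression quoted above; looking the classes up by name or UUID is still the better option.

```java
import java.util.Arrays;
import java.util.List;

class CodedObsQuery {
    static final int DIAGNOSIS_CLASS_ID = 4;
    static final int SYMPTOM_CLASS_ID = 12;         // assumed meaning, verify locally
    static final int SYMPTOM_FINDING_CLASS_ID = 13; // assumed meaning, verify locally

    static final List<Integer> CONDITION_CLASS_IDS =
            Arrays.asList(DIAGNOSIS_CLASS_ID, SYMPTOM_CLASS_ID, SYMPTOM_FINDING_CLASS_ID);

    // Replaces the chain of != comparisons against bare numbers.
    static boolean isConditionClass(int conceptClassId) {
        return CONDITION_CLASS_IDS.contains(conceptClassId);
    }

    // Builds the query with a StringBuilder and filters out voided obs.
    static String build() {
        StringBuilder sql = new StringBuilder();
        sql.append("SELECT o.obs_id FROM obs o ");
        sql.append("JOIN concept c ON c.concept_id = o.value_coded ");
        sql.append("WHERE o.voided = 0 AND c.class_id IN (");
        for (int i = 0; i < CONDITION_CLASS_IDS.size(); i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append(CONDITION_CLASS_IDS.get(i));
        }
        sql.append(")");
        return sql.toString();
    }
}
```

With the constants in one place, the original condition becomes `!CodedObsQuery.isConditionClass(valueCodedClassCheck)`, and both methods are small enough to unit test.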

Btw (in the spirit of you saving me from having to do real research) am I reading right that OMOP has a fixed concept dictionary? What happens when people want to do analysis on some implementation-specific concepts? This seems like a very common OpenMRS use case, and I would be afraid that the central OMOP vocabulary won’t have good coverage of developing country needs.

Can I ask if anyone is still working on OHDSI/OMOP? I have taken on the CIEL mapping to OHDSI and there are now AWS instances that can be spun up with all the tools and databases. If there is a way that we can configure the ETL discussed here with the CIEL concept table, I would like to explore standing up a reference site. Anyone interested?

@akanter: I’d reach out to Lee Evans (evans@ohdsi.org), who I think would be delighted to have a strong collaborator like you to work with.