Interoperability Layer for OpenMRS

This is truly an outstanding discussion!

My impression is that, although convenient, the OpenMRS events module suffers from two challenges. First, via a customized approach, it’s taking on the complicated task of event tracking. Second, it is home-grown and therefore depends on our community’s ability to keep it maintained.

I believe that we would be better off delegating this responsibility of change detection to a third party library, like Debezium, which is dedicated to solving this problem and robustly maintained by a large community.

Integrating a Debezium workflow poses (at least) two challenges. First, many OpenMRS implementations depend on the WAR-based deployment strategy rather than a stack/microservice/container-based approach. This provides simplicity but lacks flexibility. Second, even if one were to use Debezium, we lack an existing tool to convert the Debezium data stream of table-based changes into domain objects (e.g. a patient or an encounter), something the OpenMRS events module makes easy. Given how data is distributed across tables in OpenMRS, this is solvable, but it is not as simple as: change detected → create new FHIR object.
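To make problem 2 concrete, here is a minimal sketch of the first step: taking a Debezium change event and deciding which domain object it touches. The `payload`, `source.table`, `op`, `before` and `after` fields follow Debezium's standard change event envelope. The `TABLE_TO_DOMAIN` mapping and the `route_change_event` function are hypothetical names invented for illustration, and the mapping itself is a simplification (for example, not every `person` row belongs to a patient).

```python
# Hypothetical mapping: table name -> (domain object, column holding the owning object's id).
# This is an illustrative assumption, not an existing OpenMRS or Debezium artifact.
TABLE_TO_DOMAIN = {
    "patient": ("Patient", "patient_id"),
    "person": ("Patient", "person_id"),
    "person_name": ("Patient", "person_id"),
    "patient_identifier": ("Patient", "patient_id"),
    "encounter": ("Encounter", "encounter_id"),
    "obs": ("Observation", "obs_id"),
}

# Debezium operation codes: create, update, delete, read (snapshot).
ROW_OPERATIONS = {"c": "insert", "u": "update", "d": "delete", "r": "snapshot"}


def route_change_event(event):
    """Return which domain object a table-level change event affects, or None."""
    payload = event.get("payload", event)  # the envelope may or may not carry a schema wrapper
    table = payload["source"]["table"]
    mapping = TABLE_TO_DOMAIN.get(table)
    if mapping is None:
        return None  # a table we do not translate into a domain event
    domain, key_column = mapping
    op = payload["op"]
    # Deletes only carry the old row state; everything else carries the new state.
    row = payload["before"] if op == "d" else payload["after"]
    return {
        "domain": domain,
        "domain_id": row[key_column],
        "table": table,
        "row_operation": ROW_OPERATIONS.get(op, op),
    }


sample_event = {
    "payload": {
        "before": None,
        "after": {"person_name_id": 7, "person_id": 42, "given_name": "Jane"},
        "source": {"table": "person_name"},
        "op": "c",
    }
}
routed = route_change_event(sample_event)
```

Note that an insert into `person_name` is, at the domain level, an update to a patient. That gap between row operations and domain operations is exactly why a lookup like this is only the first step.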

I applaud the effort made by @mseaton to create an OpenMRS module with Debezium. This may solve problem 1 above for many implementations. For those like AMPATH which use a container-based deployment strategy, we could simultaneously pursue that option for deploying Debezium.

Now, if we could solve problem 2, it may become much easier for us, as a community, to embrace the Debezium approach. I think we could begin solving this problem independently of which software is chosen to receive the Debezium messages (Apache Camel vs OpenHIM vs ??), as all these tools seem to allow you to write the transformer layer in whatever code you want.

I’ve discussed this with @ibacher and he is interested in spiking on it in the coming weeks. Our initial use case is essentially receiving Debezium messages and converting them to FHIR objects via the FHIR2 module. Perhaps others have code already achieving this? Please do share if that’s the case. @bashir, thanks for providing this link. Could you speak a bit more about your approach to taking a table name and then determining which object needs to be created?

Again thanks to all for this great discussion.

@jdick see this for example: GitHub - ozone-his/eip-openmrs-senaite: Camel routes that integrate OpenMRS and SENAITE.

@ruhanga can provide more information on the inner workings.

I guess @rcrichton is best placed to answer, but I can try.
OpenHIM and Apache Kafka are both systems that can be used to manage the flow of data between different systems. However, they are designed to serve different purposes and have different strengths and weaknesses.

OpenHIM provides a centralized and flexible way of integrating health information systems. It provides a range of features for managing health information exchange, including support for security, auditing, and routing.

Apache Kafka, on the other hand, is designed to handle high volume, high throughput, and real-time data streams. It can be used to process and analyze data in real-time and to publish and subscribe to streams of records.

OpenHIM and Apache Kafka can complement each other. OpenHIM can provide the central coordination and management for health information exchange, while Apache Kafka can provide the underlying event streaming and processing capabilities. The two systems can be integrated to provide a complete solution for health information exchange that combines the strengths of both systems.

It’s worth noting that while OpenHIM can support message queuing, it is not primarily a message queuing system in and of itself.

Great to see folks share their work here.

In my opinion, the main thing folks have in common is that they need to track changes in OpenMRS and react to them. How they react to those changes differs somewhat between implementations, since they have different integration needs and workflows.

The event module clearly has holes in it because it is based on a Hibernate interceptor approach, which wrongly assumes that every change goes through the JPA layer in OpenMRS. The Debezium-based approaches are clearly the proven way to reliably track changes in the OpenMRS database, and this is what we need to bring together into a reusable component. This is why I think @mseaton and @mksd make sense: we should have a Debezium-based events module or third-party solution that emits DB events. It needs to be kept simple, without cluttering it with other features and tools that not everyone needs. Bringing together a suite of many ‘optional’ technologies can be overwhelming and can actually put some implementations off adopting it. #letsKeepThingsSimple.

While I would love to see a robust and scalable data warehousing solution delivered out of the box with OpenMRS, a realistic initial step in that direction would be to build toward alignment on the first piece needed: getting data out of OpenMRS.

Leveraging a tool like debezium has a couple of benefits: (1) more robust than Hibernate interceptors as @wyclif pointed out and (2) the OpenMRS community can focus our efforts on adapting widely-used tools rather than building, owning, and having to sustain more bespoke tooling (i.e., leverage the work of other communities already addressing the low-level issues of connections, pooling, queuing, etc.). The challenge with debezium is doing the work to provide hooks for clients to listen not only for database-level events, but also domain-level events (e.g., patient changed). We also need an approach that doesn’t require clients to call the OpenMRS API to fetch every object that has changed (either by streaming out the data – not just references – or coming up with more efficient ways to access resources in bulk), since asking the API to marshal all observations for even a hundred patients can produce a denial of service attack on our FHIR API module. :slight_smile:

We’re getting closer to being able to deliver the OpenMRS Platform as a stack rather than a WAR file, which will make it easier to share functionality that goes beyond a single Java virtual machine. It would be awesome if we could create a proof of concept where a container or two brought up alongside OpenMRS could provide secure endpoints to listen for high-value data- or domain-level events, along with a demonstration of the potential showing how it can be used to generate and maintain near real-time flattened tables (e.g., a hello world of flattened patient and flattened encounter tables) that don’t have to live in the same database as OpenMRS.

It sounds like this will be the topic of today’s TAC call, with Andy trying to fit in another agenda item. Unfortunately, I have a conflict and will miss at least the first half of the call, but I’ll be looking forward to reviewing the recording.

The OpenHIM does a good job of routing messages synchronously, which is what it was built for, i.e. the case where a response is needed right away. Over time we have found that for data where we don’t need a real-time response (data for secondary use), it is more scalable and reliable to stream this data to a queue and process it in near real time. This allows Kafka to act as a buffer against traffic spikes. It also avoids pushing load onto downstream services, since Kafka lets services pull data as and when they are able to process more.

So, as was already mentioned in this thread, we believe the two technologies are complementary.

Well put, @burke. Your view is the reason I have several times suggested that we modify the events module to depend on a Debezium-based module, or use Debezium directly, so that integrators also have the option to stream ‘entity level’ events from OpenMRS. It should be possible for a developer to transform Debezium change events into domain-level events.

Recording from today’s TAC call on creating an interoperability layer for OpenMRS:

I’ve also added notes from today’s meeting here: Technical Action Committee (TAC) Meeting Notes - Resources - OpenMRS Wiki

Thanks for an interesting topic and sharing your solutions!

Given the different struggles we had with events based on Hibernate interceptors and AOP around our API service methods, I would say that Debezium is the way to go these days. In fact, the struggles were not only ours, which is why Debezium was created. It can be embedded, standalone or run in a Kafka cluster. It’s up to an implementation to decide on the deployment that works best in a specific scenario. It comes down to scale and reliability vs complexity.

One discussion topic was how to aggregate events (e.g. the creation of an encounter emits many events for the encounter and obs tables) and whether this can be a shared solution.

One common approach is to have a time-based window, i.e. wait for related events over a specific period of time and transform at the end. However, it’s also possible to have Debezium emit transaction events and do the aggregation at the transaction level. With Kafka Streams this can be done with the built-in session windows, which close after a gap of inactivity and become transaction based when events are keyed by transaction ID.
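As a rough illustration of the transaction-level option, here is a minimal sketch that buffers row events per transaction and releases them when the transaction ends. It assumes Debezium is configured with `provide.transaction.metadata=true`, so that each change event carries a `transaction` block (`id`, `total_order`) and separate boundary events carry `status` (`BEGIN`/`END`) and `id`. The `TransactionAggregator` class is a hypothetical name, and the sketch assumes all events arrive in order on a single stream, which holds more easily for an embedded engine than across separate Kafka topics.

```python
from collections import defaultdict


class TransactionAggregator:
    """Hypothetical helper: groups Debezium row events by transaction id."""

    def __init__(self):
        # transaction id -> list of change event payloads seen so far
        self._buffers = defaultdict(list)

    def on_change(self, payload):
        """Buffer one row-level change event under its transaction id."""
        tx_id = payload["transaction"]["id"]
        self._buffers[tx_id].append(payload)

    def on_boundary(self, boundary):
        """Return the buffered events when a transaction ends, else None."""
        if boundary["status"] != "END":
            return None  # BEGIN events need no action in this sketch
        events = self._buffers.pop(boundary["id"], [])
        # Restore the order in which the changes happened inside the transaction.
        events.sort(key=lambda e: e["transaction"]["total_order"])
        return events


aggregator = TransactionAggregator()
aggregator.on_boundary({"status": "BEGIN", "id": "tx-1"})
aggregator.on_change({"source": {"table": "obs"}, "op": "c",
                      "transaction": {"id": "tx-1", "total_order": 2}})
aggregator.on_change({"source": {"table": "encounter"}, "op": "c",
                      "transaction": {"id": "tx-1", "total_order": 1}})
batch = aggregator.on_boundary({"status": "END", "id": "tx-1"})
tables_in_batch = [e["source"]["table"] for e in batch]
```

The batch returned at `END` is the unit that a transformer would turn into one domain-level message, for example an encounter together with its obs.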

Even though the idea of writing a general-purpose aggregation from OpenMRS schema-based events to FHIR events is very tempting, the implementation is entirely different in different deployment scenarios.

In Kafka deployments one would use Kafka Streams to do the aggregation with session windows keyed by transaction ID and emit FHIR events.

In the embedded scenario I would use Camel to do the aggregation based on transaction events, with e.g. an embedded Infinispan store holding the DB events for the lifespan of a transaction, until the FHIR message is emitted.

In the standalone scenario one would send Debezium events to a message broker of choice and do aggregation afterwards.

In all three scenarios we achieve pretty much the same thing from the business logic perspective. We process DB events asynchronously and emit aggregated domain-level events (e.g. FHIR messages). The embedded scenario has the lowest scalability and reliability as it’s tightly coupled with an OpenMRS instance, whereas the standalone and Kafka scenarios can be scaled to any load of data and provide better data consistency protection, which of course comes at the cost of more complex deployment and maintenance.

As @burke noted, in no scenario should we be going back to the DB to retrieve the domain model representation upon receiving an event. All the data we need is readily available in the DB events.

One part that we could all benefit from and that could be re-used across different scenarios is:

  1. Having a single implementation of the OpenMRS event model (domain model classes)
  2. Having a conversion algorithm from the OpenMRS event model to the FHIR model.

The OpenMRS-specific event model wouldn’t be exactly the way our API model is implemented; rather, it would be POJOs that can be extracted blindly from Debezium events. I think it is the mapping from the OpenMRS model to the FHIR model that requires the most work and thought, and it would be great to just plug it in at the end of our aggregation regardless of the deployment scenario. I’m not sure whether that part can already be sourced from the FHIR2 module.
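A minimal sketch of those two reusable pieces might look like the following. It is written in Python for brevity, although in practice these would more likely be Java POJOs. `PersonRow`, `PersonNameRow`, `from_after` and `to_fhir_patient` are hypothetical names. The column names come from the OpenMRS `person` and `person_name` tables. The sketch assumes Debezium's default temporal handling, where a `DATE` column arrives as the number of days since 1970-01-01, and it assumes the OpenMRS UUID is used as the FHIR resource id. It is not the FHIR2 module's actual translation logic.

```python
from dataclasses import dataclass, fields
from datetime import date, timedelta
from typing import Optional


def from_after(cls, after):
    """Build a row object 'blindly': copy only the columns the class declares."""
    return cls(**{f.name: after.get(f.name) for f in fields(cls)})


@dataclass
class PersonRow:
    person_id: int
    uuid: str
    gender: Optional[str]
    birthdate: Optional[int]  # assumed: days since epoch, as Debezium emits DATE by default


@dataclass
class PersonNameRow:
    person_id: int
    given_name: Optional[str]
    family_name: Optional[str]


# OpenMRS stores gender as a short code; FHIR expects an administrative gender code.
GENDER_TO_FHIR = {"M": "male", "F": "female"}


def to_fhir_patient(person, names):
    """Convert aggregated person rows into a FHIR Patient resource (as a dict)."""
    resource = {
        "resourceType": "Patient",
        "id": person.uuid,
        "gender": GENDER_TO_FHIR.get(person.gender, "unknown"),
        "name": [
            {"family": n.family_name, "given": [n.given_name]}
            for n in names
            if n.person_id == person.person_id
        ],
    }
    if person.birthdate is not None:
        resource["birthDate"] = (date(1970, 1, 1) + timedelta(days=person.birthdate)).isoformat()
    return resource


person = from_after(PersonRow, {"person_id": 42, "uuid": "abc-123", "gender": "F",
                                "birthdate": 7305, "dead": 0})
name = from_after(PersonNameRow, {"person_id": 42, "given_name": "Jane",
                                  "family_name": "Doe", "preferred": 1})
fhir_patient = to_fhir_patient(person, [name])
```

Because the converter only depends on plain row objects, the same function could sit at the end of a Kafka Streams topology, a Camel route, or a standalone consumer.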

It would also be great to have the embedded and Kafka scenarios open-sourced and shared, as I think those will be the most common ones and the code will not differ between implementations.

Hi all,

This is a super interesting discussion, and I think there is a lot to chew on here. In the current UW DIGI interoperability stack we are still working on our pipelines for secondary data use, but it’s super encouraging that most of our current interop setup seems to match what others have shared. I’ve very slightly adapted this diagram from a specific presentation, and I’ve put the reference tools for certain things (like OCL for terminology and GOFR for facilities) where appropriate.

Given the similarities in the underlying technology, I’m looking forward to ensuring that OpenELIS Global 2+ and OpenMRS continue native interoperability. We have a video up showing the out-of-the-box connectivity with OMRS3 and OE Global 2.
