Sync 2.0 Discussion about pull/push architecture

craigappl · June 13, 2017, 9:12pm

The technical section posted by @pgesek says that the child will read the atom feed published by master and there will be a push from the slave to the master.

I want to get your opinions on these approaches and discuss the architecture. I’m focusing primarily on the exchange of patient information and encounters.

Slave Pull From Master

In the current architecture, the master OpenMRS system would have to post an atom feed that is cohort specific, and each slave system would read that atom feed every time it came online. When a clinic reads that atom feed, it would kick off a series of FHIR REST calls that retrieve the patient’s information from the master; the most recent version of the patient record would be returned and imported into the slave OpenMRS database. Each record in the atom feed would act as a transaction, and the feed reader in the slave OpenMRS would adjust the marker for each successful transaction.
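To make the marker behaviour concrete, here is a minimal sketch of the pull loop described above. It is an illustration only, not existing OpenMRS code: `FeedEntry`, `process_feed`, `fetch_patient` and `import_patient` are hypothetical names, and the numeric `entry_id` is an assumed way of ordering feed entries.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class FeedEntry:
    entry_id: int      # assumed: monotonically increasing position in the feed
    patient_uuid: str  # which patient record changed on the master


def process_feed(
    entries: Iterable[FeedEntry],
    marker: int,
    fetch_patient: Callable[[str], dict],
    import_patient: Callable[[dict], None],
) -> int:
    """Process entries newer than `marker` and return the new marker position."""
    for entry in sorted(entries, key=lambda e: e.entry_id):
        if entry.entry_id <= marker:
            continue  # already handled in an earlier run
        try:
            # fetch_patient stands in for the FHIR REST call to the master
            record = fetch_patient(entry.patient_uuid)
            import_patient(record)
        except Exception:
            # Leave the marker untouched so this entry is retried next time
            break
        marker = entry.entry_id  # advance only after a successful import
    return marker
```

Stopping at the first failure keeps entries in order, at the cost of one bad record blocking everything behind it. Whether that trade-off is acceptable is part of the exception-handling design.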

To accomplish this, we would need to build the business logic in the slave OpenMRS to try to read the atom feed from the master OpenMRS and act when there is an update in the feed. (I haven’t seen evidence that this business logic is built in OpenMRS.)

Slave Push to Master

The push to master could be done in a few ways, and this is where I think we have overlap with the MPI project.

First, I’d like to make sure we want to do a push of messages from slave to master. We could theoretically have a multi-feed reader on the master OpenMRS that reads the atom feeds from each slave OpenMRS on a regular schedule and queries the FHIR REST API to get updates to the patient record.

Assuming we do want to push messages from slave to master, we get into the area of message queuing at each clinic.

We need to discuss how we create the message that needs to be pushed from slave to master. My current thinking is to raise an event for each database transaction like we currently do with the event module. That event could either A) generate an entry in an atom feed, B) create a FHIR message for that particular message type and post it to master or C) create a FHIR message for that particular message type and store it in an OpenMRS table that could be read by a third party tool.

In scenario A, we generate an entry in the atom feed. We would then need to build the business logic to locally read the atom feed, have it query the REST API and create the FHIR message(s). This would likely be done with a third party tool like Mirth.

In scenario B, we create a FHIR message for each message type and try to push that message to the master OpenMRS. If successful, great. If not, we would need a mechanism to retry on a regular schedule, audit failed transactions and provide administrator visibility into the process. We could build this queuing mechanism natively in OpenMRS or we could use a third party tool like Mirth to manage the interaction with the master OpenMRS.
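As a rough sketch of what a native queuing mechanism for scenario B might involve, the following keeps pending messages in memory, retries them on each scheduled flush, and moves messages that exceed a retry limit onto a list an administrator could inspect. All names (`PushQueue`, `OutboundMessage`, `send`, `max_attempts`) are hypothetical, and a real implementation would persist the queue rather than hold it in memory.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List


@dataclass
class OutboundMessage:
    message_id: str
    payload: dict
    attempts: int = 0


@dataclass
class PushQueue:
    send: Callable[[dict], None]  # assumed: posts one message to master, raises on failure
    max_attempts: int = 3
    pending: Deque[OutboundMessage] = field(default_factory=deque)
    failed: List[OutboundMessage] = field(default_factory=list)  # kept for audit

    def enqueue(self, message_id: str, payload: dict) -> None:
        self.pending.append(OutboundMessage(message_id, payload))

    def flush(self) -> int:
        """Try each pending message once and return how many were delivered."""
        delivered = 0
        for _ in range(len(self.pending)):
            msg = self.pending.popleft()
            try:
                self.send(msg.payload)
                delivered += 1
            except Exception:
                msg.attempts += 1
                if msg.attempts >= self.max_attempts:
                    self.failed.append(msg)   # give up, keep for administrators
                else:
                    self.pending.append(msg)  # retry on the next scheduled flush
        return delivered
```

Note that this sketch does not preserve ordering when one message fails and later ones succeed. If updates to the same patient must arrive in order, the flush would need to stop at the first failure instead.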

In scenario C, we generate the FHIR message and locally store it within OpenMRS. In this scenario, we offload the transportation and auditing to a third party tool like Mirth.

I’ve been thinking about this problem for some time. If we go with scenario B and we post these messages to a local third party tool like Mirth, we may be in a situation where the local third party tool is not available. In that event, we would still need visibility into failed transactions within OpenMRS. That’s why scenario C is attractive. It allows a third party tool to do the transportation and mark a message as successfully transported in the event of success.
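For illustration, scenario C could look something like the sketch below: messages are written to a local table, and a separate transport tool reads the pending rows and marks each one once it has been delivered. The table name `sync_outbox`, its columns, and the `PENDING`/`SENT` status values are assumptions made for this example, and SQLite is used only so the sketch is self-contained.

```python
import json
import sqlite3


def create_outbox(conn: sqlite3.Connection) -> None:
    # Hypothetical table holding generated FHIR messages until they are transported
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS sync_outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            resource_type TEXT NOT NULL,
            payload TEXT NOT NULL,
            status TEXT NOT NULL DEFAULT 'PENDING'
        )
        """
    )


def store_message(conn: sqlite3.Connection, resource_type: str, payload: dict) -> int:
    """Called when an event fires: store the message locally and return its row id."""
    cur = conn.execute(
        "INSERT INTO sync_outbox (resource_type, payload) VALUES (?, ?)",
        (resource_type, json.dumps(payload)),
    )
    conn.commit()
    return cur.lastrowid


def pending_messages(conn: sqlite3.Connection) -> list:
    """What the transport tool would read: untransported messages, oldest first."""
    rows = conn.execute(
        "SELECT id, resource_type, payload FROM sync_outbox "
        "WHERE status = 'PENDING' ORDER BY id"
    ).fetchall()
    return [(row[0], row[1], json.loads(row[2])) for row in rows]


def mark_transported(conn: sqlite3.Connection, message_id: int) -> None:
    """Called by the transport tool after a successful delivery."""
    conn.execute("UPDATE sync_outbox SET status = 'SENT' WHERE id = ?", (message_id,))
    conn.commit()
```

Because undelivered messages simply stay `PENDING`, failed transactions remain visible inside OpenMRS even when the transport tool is down.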

What do you think about these different architectures?

Craig

FYI @mogoodrich, @mseaton

darius · June 15, 2017, 9:03pm

@craigappl, I tip my hat to you, for writing talk posts as long and detailed as mine!

My first thought is that we only need one feed on the server, as long as each feed element includes enough info for a client to tell whether it is interested in it. (E.g. a sync client would see 100% of the feed, but only request full details for 10% of the items.) I believe that Bahmni has taken this approach for the Bahmni Connect offline app, and we should see how it is working for them.
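A small sketch of the single-feed idea: every client reads the whole feed, but only asks for full details on the entries it cares about. The `location` field is an assumption about what "enough info" might be; the real hint could just as well be a cohort, a program, or something else. `FeedItem` and `items_to_fetch` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class FeedItem:
    resource_uuid: str
    category: str  # e.g. "patient" or "encounter"
    location: str  # assumed routing hint carried in the feed entry itself


def items_to_fetch(feed: Iterable[FeedItem], my_locations: Set[str]) -> List[FeedItem]:
    """Return only the items this client is interested in.

    Everything else is skipped without making a REST call for full details.
    """
    return [item for item in feed if item.location in my_locations]
```

The client would still move its marker past the skipped items, since it has seen them and decided it does not need them.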

Actually Bahmni has already implemented most of the feed publishing and reading business logic (for synchronizing OpenMRS, OpenERP, and OpenELIS). That code is mature and used at scale (and also packaged as an OpenMRS module), and we should leverage it here. Here’s a wiki page with some details and links to some code.

Yes, we should absolutely push from slave to master, and master shouldn’t even need to know about all its slaves for the basic algorithm to work. (We might still keep this so that each slave can have its own secret key that can be invalidated if it’s compromised. And in PIH’s experience with sync v1 it’s very helpful to centrally gather enough info so you can debug slaves from the central location.)
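One way the per-slave secret key could work is sketched below, assuming each push is signed with an HMAC over the message body. The HMAC scheme itself and the names `SlaveKeyRegistry`, `sign`, `register`, `revoke` and `verify` are assumptions for illustration; the thread does not specify a mechanism.

```python
import hashlib
import hmac
from typing import Dict


def sign(key: bytes, body: bytes) -> str:
    """What a slave would compute over the message body before pushing."""
    return hmac.new(key, body, hashlib.sha256).hexdigest()


class SlaveKeyRegistry:
    """Kept on the master: one secret key per known slave."""

    def __init__(self) -> None:
        self._keys: Dict[str, bytes] = {}

    def register(self, slave_id: str, key: bytes) -> None:
        self._keys[slave_id] = key

    def revoke(self, slave_id: str) -> None:
        # Invalidate a compromised slave without touching the others
        self._keys.pop(slave_id, None)

    def verify(self, slave_id: str, body: bytes, signature: str) -> bool:
        key = self._keys.get(slave_id)
        if key is None:
            return False  # unknown or revoked slave
        # compare_digest avoids leaking information through timing differences
        return hmac.compare_digest(sign(key, body), signature)
```

With something like this, the master only needs the registry for authentication and debugging; the sync algorithm itself does not depend on knowing every slave.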

We have a choice here. The event module uses a hibernate interceptor to be notified of all db changes. Bahmni’s atom feed module uses AOP to be notified of API calls. I don’t think either choice here affects the rest of the sync design though.

In my opinion, we should treat it as a requirement that you can do OpenMRS-to-OpenMRS sync without needing Mirth.

I think that the slave should produce an atom feed in exactly the way that master does. The slave can read its own feed, but it only moves the marker on a successful push to master. (As a side effect this can allow multi-level sync networks.)

But this is simpler than your scenario A, because the FHIR module should make it easy to “generate a FHIR representation of resource X with uuid Y”.

Fyi @angshuonline. Also, let’s make sure @pgesek has a chance to wrap his head around the project before we over-design everything.

craigappl · June 16, 2017, 1:57pm

Thanks @darius,

This raises a few more points that we want to consider in the design:

The atom feed may contain sensitive information
To mitigate this, we would need to create access controls for systems to read the atom feed.
Should we focus on database changes and/or API calls?
Are there scenarios where we would want module developers to be able to add events to the atom feed?
We need to clearly work through exception handling, logging and auditing. I expect there will be many edge cases that we will need to account for in this process.

The wiki link isn’t working. I see the work done to generate the atom feed and manipulate the marker. I also see the offline-sync module that reads a feed using a client, raises events and posts those to the third party system. Is this the module?

Thanks, Craig