Thanks @ayeung and @aramirez - we are looking forward to working with FGH Mozambique & Global Brigades during the development of Sync 2.0. Your testing and input on the design/development of Sync 2.0 will definitely help us build a solution that will meet the needs of the community.
@raff and @jthomas - I agree that the next step will be documenting on the wiki (in the Sync space) and organizing a design call to discuss further.
I would also consider the option that although this will be a replacement for the sync module, we might choose to give it a new name. And since the codebase will be 100% new, there is no clear value in doing this in the old sync module's git repo.
(thought I posted this in response to Mike, but just seeing now that I didn't…)
Since it is going to be a completely new code base, doesn't it make sense to start a new repo? I guess from a naming convention it does make it easier. Is there any kind of best practice around this type of situation?
That being said, if the general consensus is to use the same repo, I'm fine with that…
So, it seems like large-scale code rewrites prefer to create new repos. (E.g. you would never want a developer to do a git pull and inadvertently go from sync 1.x to sync 2.x.)
I recommend that we delay on deciding the module id and the exact naming/marketing as we do design, figure out what different independent components are involved, what might overlap with a connect-to-MPI/HIE project, etc. (We can continue to call the project Sync 2.0 in the interim.)
I don't see an issue with doing a new git repo - it has some advantages over branching.
Did we decide on using the existing Jira and Wiki? I noticed that there are about 50 unresolved issues for sync - that doesn't necessarily mean we need a new project; we can just use the version fields to keep track of that.
If we are using the existing sync wiki, I can start putting together a doc based on the initial requirements and discussions on talk.
I don't believe FHIR makes use of ATOM feeds anymore; rather, they have moved to using a Bundle resource to list multiple resources along with metadata.
Typically a JIRA project is linked with one repo, so I'm not sure we should reuse the JIRA project but not the repo, unless the idea is to retire the old repo.
Thanks @pgesek I was just wondering whether there was an update on this.
Comments about "Atomfeed for Sync 2.0":
If we're going to refactor this, maybe we can move away from having a lot of global properties, and instead just have a single feed config that an implementation can specify with a JSON or YAML file
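To make the idea concrete, such a single config file might look something like this (a hedged sketch; every key and value below is invented for illustration, not an agreed format):

```yaml
# Hypothetical single-file feed configuration replacing scattered
# global properties. All key names here are illustrative only.
feeds:
  patient:
    enabled: true
    pageSize: 20
  encounter:
    enabled: true
    pageSize: 50
sync:
  pushEnabled: true
  pullEnabled: false
  parentUrl: https://parent.example.org/openmrs
```

An implementation would then override one file per environment instead of maintaining dozens of individual global properties.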
I like the idea of using an event-based hibernate interceptor instead of an advice class for every OpenMRS class
"Change the default urls to point to FHIR resources" => I would hope that Sync 2.0 can also be run with Bahmni, so I hope there's a way that a single set of feeds can work for both master-slave sync and Bahmni's MRS/ERP/LIS communication. Maybe we can introduce a new format where we provide multiple possible links to the event, e.g. both the FHIR one and the REST one.
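For example (purely illustrative; the URLs and media types below are invented), a feed entry could advertise the same event in more than one representation and let each consumer pick the link it understands:

```xml
<entry>
  <title>Patient updated</title>
  <id>tag:example.org,2017:event/12345</id>
  <!-- hypothetical: one link per available representation of the same event -->
  <link rel="alternate" type="application/fhir+json"
        href="https://parent.example.org/openmrs/ws/fhir/Patient/abc-123"/>
  <link rel="alternate" type="application/json"
        href="https://parent.example.org/openmrs/ws/rest/v1/patient/abc-123"/>
</entry>
```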
Comments about "Auditing and error handling":
I wonder if this is going to be an excessive amount of data to be storing
It feels more natural to log this to a file (at least for successful sync) rather than to a DB table
PIH Rwanda found that for admin/management purposes it's really valuable to synchronize status data about each slave up to the master, so that some level of system debugging can be done centrally
"Catchment Strategies":
I guess we should get some feedback from potential implementations about what are the right strategies to focus on
"FHIR Resource List for Sync":
Naively I would think Drug => Medication (instead of Substance)
Some of these may be very imperfect matches, and after analysis we might end up wanting to do some of these via OpenMRS REST representations.
Thereās definitely a prioritization to be done here. E.g. some implementation might start using sync with just patient. Others would use it if you add visit, encounter, and obs. Etc.
(I will try to comment on the rest of the pages tomorrow, but I'll post this now, in case I get delayed on the rest.)
We should not call it "master-slave". IMO, it is not about replication - it's about synchronizing relevant information between a server and its clients. Terminology matters. So I would refrain from using "master-slave" and just use "server and client" everywhere. In our original paper, we talked about how each client node can potentially be a master for its own clients in the hierarchy. Essentially, each client decides what to process (push or pull) with the server.
I would advise keeping the FHIR resource handling as open as possible, simply because no OpenMRS data storage (blame the hierarchical obs structure for it) is exactly the same. The same goes for medication orders or statements.
While we may never be able to come to universal agreement about data structures across OpenMRS implementations and their mapping to FHIR resources, we should have a series of discussions about the mapping mechanisms for the essential entities, e.g. how do we represent an OpenMRS Encounter as a FHIR Encounter? Does it make sense to construct this as a FHIR Composition? Agreement at a broad level would help design the mapping and processing, and also event ordering and processing.
In Bangladesh, we did all of this for integrating with an HIE, using SimpleFeed as a protocol for event notification! It has served the purpose very well, but we needed to do a whole lotta work to tackle problems (including ongoing support and monitoring) - many of which are relevant to out-of-order event processing.
The reason I say this repeatedly is that I think there cannot be a single prescribed mechanism for addressing such distributed-system synchronization needs. All you can have is a broad framework that will probably deal with 80% of the cases, while allowing for extensibility and customization. If we fail to accommodate extensibility, it will not serve the general purpose (though it will serve specific purposes very well, for sure).
Do you have some time set aside in the coming days to finalize planning, and/or any estimate of when development could start?
I'm less available these days due to another project I'm involved in, but I will help as much as I can. Let me know if there are any gaps I could help you fill.
I've tried to skim through everything, a few pieces of feedback:
+1 to using Hibernate Interceptors instead of AOP to figure out what to publish to the feed, as AOP is way too brittle. Maybe I'm forgetting, but is there a reason we wouldn't be using the existing Event module for this? @raff @darius
The feed mechanism should be robust enough to handle a slave being down on the order of days, or perhaps even months
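One way to satisfy that requirement (a sketch under assumed names, not the actual Sync 2.0 design) is for the client to persist the id of the last event it processed and, on reconnect, walk the archived feed pages forward from that marker, however long it was offline:

```python
# Hypothetical sketch: a client resumes an archived atom-style feed after
# arbitrary downtime by persisting the last processed entry id.
# FeedPage / catch_up are illustrative names, not the real Sync 2.0 API.

class FeedPage:
    def __init__(self, entries, next_page):
        self.entries = entries      # ordered event ids on this page
        self.next_page = next_page  # id of the next archived page, or None

def catch_up(pages, start_page, last_processed):
    """Replay every event after `last_processed`, across archived pages."""
    processed = []
    page_id = start_page
    seen_marker = last_processed is None  # no marker: replay everything
    while page_id is not None:
        page = pages[page_id]
        for entry in page.entries:
            if seen_marker:
                processed.append(entry)  # the real client would apply it here
            elif entry == last_processed:
                seen_marker = True
        page_id = page.next_page
    return processed
```

Because the marker is durable, a slave that was down for months simply resumes paging from where it stopped; the cost of the outage is only the backlog size, not lost events.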
Darius may be right that storing all the audit information to a DB table might be excessive, but from experience debugging Sync 1.0 issues, I do feel that having good admin tools for resolving issues, etc., will be critical, so I'd err on the side of whatever design you feel will allow these tools to be easiest to design, develop, and tweak.
I see @angshuonline's point about "master/slave" not being the correct terminology (https://en.wikipedia.org/wiki/Master/slave_(technology)). Not sure if I like client/server though, as it might not be clear enough. Is there a reason not to use parent/child, as was used in the initial Sync module?
Yes, that makes sense and should be cleaner; I'll update the feed doc.
An option is to use multiple link tags. It would be best, IMO, to include the URL configuration in this new atomfeed configuration file.
Yes, error handling is probably the biggest gripe users have with Sync 1.0, hence the amount of logging. I agree that the success logs are probably not as important as the failure ones - we can log success messages to a file by default. Perhaps it would be best to allow users to configure which logger implementation to use for event types like success and failure. Sync could provide some default implementations like DB, File, and No-op, and implementers could inject their own if need be.
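A minimal sketch of that pluggable-logger idea, with invented class names (the real module would presumably wire these through Spring rather than by hand):

```python
# Illustrative strategy-pattern sketch: one logger registered per event
# outcome (success/failure). All names here are hypothetical.

class NoOpLogger:
    """Default: drop the event (e.g. for successes nobody wants to keep)."""
    def log(self, event):
        pass

class MemoryLogger:
    """Stands in for a DB- or file-backed logger in this sketch."""
    def __init__(self):
        self.records = []
    def log(self, event):
        self.records.append(event)

class SyncAuditor:
    def __init__(self):
        self.loggers = {}
    def register(self, outcome, logger):
        self.loggers[outcome] = logger
    def record(self, outcome, event):
        # Unregistered outcomes fall through to the no-op default.
        self.loggers.get(outcome, NoOpLogger()).log(event)
```

An implementation could then keep failures in a DB-backed logger for the admin tools while leaving successes on the cheap default.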
I've done some basic ordering of the list, and I've also added a FHIR maturity level column.
On that page I listed the catchment strategies I got from implementers - feel free to add any additional ones.
Yes, my bad.
My hope is that on the sending/receiving end, we will have strategies that should allow controlling the mapping. We should also make it always possible for an implementation to inject their own client implementation for a given OpenMRS class (FHIR or not). It should also be possible to register additional providers on the receiving side in the FHIR module. I'll start an extension section in the FHIR doc to gather information on extending how it works.
I don't have a strong opinion on the terminology; I used what was in the project description. If Sync 1.0 uses parent/child then perhaps it's best to stick to it.
Thanks, I'll talk with Jakub and come back to you on this. Just note I'll be on vacation next week (18.09 - 22.09).
One other thing that occurred to me… have there been thoughts on how to handle conflicts? Apologies if I missed the details in the documentation…
Currently Sync 1.0 doesn't really have a strategy for resolving conflicts (beyond "last in wins")… it would be great if we could at least flag conflicts (perhaps based on dateCreated and dateChanged?). Conflict resolution can certainly become complex, so I'd understand keeping it simple for Phase 1, but if we could have something better than the current "last in wins" (without any notification) that would be great.
(More comments, which I wrote over several days, so are probably chaotic)
"Metadata resource list" comments:
The word metadata is used incorrectly throughout these wiki pages, to mean "stuff that can't be represented in FHIR". A lot of our metadata can't be represented in FHIR, so there's overlap, but really the distinction is things that CAN vs CANNOT be suitably represented using FHIR.
Order is important clinical data that really needs to be using FHIR (e.g. DrugOrder => MedicationRequest)
GlobalProperty needs some mechanism to decide whether to sync things at the row level (i.e. some of these are system settings that you wouldn't want to sync; I'm not sure if there are any GPs that you do want to sync)
I would sequence the work so that metadata sync happens later on. E.g. if an implementation can get all their metadata to be consistent through some other deployment process, they can adopt sync early, even without this function.
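One simple shape for that row-level GlobalProperty decision (names and property keys here are made up for illustration) is a configurable include/exclude filter evaluated before an event is published to the feed:

```python
# Hypothetical row-level filter: a prefix allow-list plus an explicit
# exclude set decide which global properties get synced.

def should_sync_property(name, include_prefixes, exclude):
    """Return True if the property should be published to the sync feed."""
    if name in exclude:
        return False
    return any(name.startswith(p) for p in include_prefixes)
```

The allow-list/exclude data would live in the same sync configuration file as everything else, so implementers can keep machine-local settings (ports, passwords, scheduler state) out of the feed.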
"Sync 2.0 Architecture Overview" comments:
This paragraph is confusing and I don't understand it:
synchronization can be done in two directions … support both independently, as one-way sync is being used in the field currently by implementations adopting Sync 1.0. … All synchronization will be initiated by slaves, independent from the direction that data is transmitted
I think it's clearer to say that:
synchronization is always initiated by a client connecting to its server
once the connection is made, the client may pull data down, push data up, or both (depending on the configuration), to support multiple use cases
I agree with @angshuonline's comment to avoid "master/slave" terminology.
Since weāll support multi-level sync, this should be shown in the first diagram
"The master will expose a complete feed of events - slaves will be in charge of filtering and pulling only the resources they are interested in" => consider individual feeds for different catchment areas?
"Metadata retrieved through the REST API will have to be inserted using conventional methods - best if it directly interfaces with OpenMRS repository classes to insert the metadata into the db." If we're going to write a mechanism for turning OpenMRS REST responses into OpenMRS domain objects, we should do this in a reusable way, e.g. producing a REST client library as part of the REST module.
" it should be also possible to make the slave proceed through the feed even if a record fails to synchronize" => yes, but we also need a default behavior where a failed sync of a metadata item doesnāt lead to a huge cascade of failed syncs of data items that depend on it
I found the big architecture diagram to be confusing. Maybe it would be clearer if you explicitly separate the pull and push workflows
Why is ClientFacade called that? (Maybe this is more clearly described elsewhere, but I'm not sure how the Facade pattern is relevant.)
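To illustrate the per-catchment-feed alternative mentioned above (a sketch with invented names; real catchment strategies would be richer than a set of location ids), the server would filter the complete event stream once, instead of every client downloading and filtering the full feed:

```python
# Hypothetical server-side catchment filter: each catchment gets its own
# feed view over the complete event stream. All names are illustrative.

def catchment_feed(events, catchment):
    """Return only the events whose location falls inside the catchment."""
    return [e for e in events if e["location"] in catchment]
```

Whether the saved client bandwidth justifies the extra server-side bookkeeping (one feed marker per catchment) is exactly the kind of trade-off implementer feedback should settle.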
"Sync 2.0 configuration variables" comments:
I would default the "enabled" variables to false (since things won't actually work without some configuration)
Is the idea that changes would be captured and written to the atom feed always, regardless of whether push and/or pull are enabled? We might also want a setting like sync.enabled that controls this.
About error handling: are we going to preserve the downloaded REST/FHIR response to replay again? Or would we re-fetch it from its url?
The event module uses ActiveMQ, which has been occasionally buggy for us. (Maybe we're just using a 5-year-old version of it.) We should definitely consider either modernizing the event module or writing a from-scratch replacement that can be used for Sync 2.0 and other things too. But for Sync 2.0 the real requirement is to have something that goes from hibernate interceptor events to atom feed events. We could go via a message broker like in the event module, which could be more reusable, but also adds (unnecessary?) complexity.
Good point. So, yes, it's fair to prioritize logging. I wouldn't over-complicate by giving too much configuration about which loggers to use. Just pick a good default, at least in the first pass.
I was also thinking about this. Personally I would prioritize "quicker to develop" above "conflict resolution". (Ultimately, it would be nice to capture a complete change history in openmrs-core, or in a multi-purpose module, not just for sync.)
Even to do something simple like comparing based on dateCreated/dateChanged requires that we persist an extra piece of data on each update (i.e. what was its original created/changed timestamp) so we can do a 3-way comparison when merging. If this can be done while generating the atom feed, without adding lots of complexity, it could be worthwhile.
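A sketch of that 3-way check, assuming (hypothetically) that each incoming change carries the timestamp of the version it was based on:

```python
# Illustrative conflict flagging: an incoming edit records the dateChanged
# of the version it was derived from. If the local record has changed since
# then, both sides edited concurrently and the update should be flagged
# rather than silently applied last-in-wins.

def is_conflict(local_date_changed, incoming_base_date_changed):
    """True when the incoming edit was based on an older version than local."""
    if incoming_base_date_changed is None or local_date_changed is None:
        return False  # no history to compare; fall back to last-in-wins
    return incoming_base_date_changed < local_date_changed
```

Flagging alone doesn't resolve anything, but it would give admins the notification that "last in wins" currently hides.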
I wonder if we can leave a hook to add conflict resolution in at a later date, once a separate module has been built that captures a full change log.
Agree that "conflict resolution" could be a bit of a beast and we shouldn't get too bogged down with doing it perfectly. However, I think we certainly should consider having at least some sort of conflict notification in Phase 1, or at least some sort of explicit warning to prevent people from doing "dangerous" things.
Sync 1.0 has no conflict resolution or warnings, and the general rule of thumb with Sync 1.0 is that "you should never be editing the same patient on both the parent and the child" (or two children, for that matter)… but I find myself forgetting that rule a lot, never mind an everyday end user. It's kind of scary because it can silently lead to data corruption (and I have no way of saying whether that has actually happened in our current implementations).
If we can't use the event module for this… what exactly is the purpose of the event module? If we think the implementation of the event module is buggy, then we should fix the implementation, right? It would be nice for us to deal with hibernate interceptor messiness once, and do it properly, in the event module, and then for us to rely on that for a way to publish/subscribe to changes.
Presumably, there could be an Event listener that we support which would simply hit the REST endpoint, get the JSON, and log this - maybe to an external document database, along with the other information from the event (event type, data type, uuid, etc). Then, implementations that want a full audit log and have the terabytes to spare could simply enable this. This would seem like it would be pretty low-effort to add in.