OpenMRS Architecture Proposal: Modules and Bundles

core
Tags: #<Tag:0x00007fdb26820998>

(Daniel Kayiwa) #6

@mksd ping :slight_smile:

@bistenes do you think you can walk us through this proposal on a design call?


(Dimitri R) #7

Sorry for my late reply: @bistenes’s proposal deserves a lot of credit. Furthermore he has shared further details with me in private that I would like to attach to this thread for reference. I’ll try to do this after this post.

I had concerns about how Iniz would fit in all this, I realise now that it all hangs to the question: what should be stored in memory and what should be persisted? Iniz should remain and keep being extended for the latter, at the end of the day it is only a fairly dumb (meta)data loader, and it does the job.

I would be fully supportive of deprecating the scattered admin UIs by transitioning to having modules require their config upon “booting”. It is clear that this is an easy and big winner.

Then there will be the debate of what should become in-memory configuration that is currently stored in database, and how implementations will transition to the new state of things.

Anyway, of course we should have a design call about this.


(Brandon Istenes) #8

Hi Dimitri,

Thanks for the thoughtful response. That makes a lot of sense. I especially appreciate the note about CSV being useful for working with non-techs, that seems very salient.

I guess the tension between Iniz and my proposal illuminates a deeper tension in OpenMRS: that between DB-stored config (“metadata”) and in-memory config (“Settings”). Iniz is really good at the former, for things that have good reason to be DB-stored config, e.g. drugs. (I’m hoping OCL-for-OpenMRS takes over the Concepts space, and does so justly; I don’t think anyone has figured out Concepts yet). (Does AddressHierarchy use Iniz to pull in config from the App Data Dir? Seems like it should…)

But there are a lot of things that we presently store in-DB that really ought to be stored in-memory. So I guess at the heart of my proposal is a new in-memory store, a Config tree. Things that fit the bill for this include…

  • idgen
  • locations
  • personAttributeTypes
  • programs
  • encounters
  • forms/fields
  • logic rules
  • programs
  • providermanagement
  • relationships
  • roles

(Do you know how dashboards & patientSummaryWidgets are stored? I think these should also be in-memory, probably YAML-configurable…)

How have you been handling those things above that don’t have Readers in Iniz?

The idea of hierarchy and overriding is very interesting. I haven’t thought about it at all, really. I’ve been imagining that having inconsistent configurations should cause a fast failure. How will a client of Iniz define what is supposed to override what?

I imagine a declarative config-defining API that is, in theory, not format specific. It would define specifically the shape of the config tree. So, e.g., the code

import org.openmrs.config

// define a relationship config object
ConfigObject relationshipConf = new ConfigObject();
relationshipConf.addKey(required=True, auto=False)
relationshipConf.addStringField('aToB', runtimeMutable=True, required=True);
relationshipConf.addStringField('bToA', runtimeMutable=True, required=True);
relationshipConf.addStringField('description', runtimeMutable=True);
relationshipConf.addBooleanField('allowSelf', runtimeMutable=True, default=True);

ConfigEnum personTypes = new ConfigEnum<String>('patient', 'provider', 'nonProviderUser');
ConfigArray allowedPersonATypes = relationshipConf.addArray('allowedPersonATypes', personTypes);
ConfigArray allowedPersonBTypes = relationshipConf.addArray('allowedPersonBTypes', personTypes);

// tell config module to expect a top-level array of relationshipConfs
config.addArray('relationships', relationshipConf);

This might be satisfied by some YAML that looks like

relationships:
  - key: DOC_PT
    aToB: doctor
    bToA: patient 
    allowedPersonATypes:
      - provider
    allowedPersonBTypes:
      - patient
      - nonProvideruser
  - key: SIBLINGS
    aToB: sibling
    bToA: sibling

It could also be satisfied with a CSV that looks like

key,        aToB,    bToA,      allowedPersonATypes, allowedPersonBTypes
DOC_TO_PT,  doctor,  patient,   provider,            "patient; nonProviderUser"
SIBLINGS,   sibling, sibling,,

Let me know what you think of all that. I won’t be in Boston at that time; I’m based in Mexico. Looking forward to talking more about this though.


(Daniel Kayiwa) #9

For things that fit the bill, why would you want to store them in memory instead of DB?


(Mark Goodrich) #10

First off at a high level, this sounds great. Plus +1 for a design call on this…

@bistenes can you request a design call:

https://wiki.openmrs.org/display/RES/Design+Forum

To answer a few questions:

  • dashboards and patientSummaryWidgets are configured as apps and extensions (in memory):

https://wiki.openmrs.org/display/docs/App+Framework+Developer+Documentation

  • it would definitely be good to support a hierarchy and overriding… we have rudimentary support for this in our PIH Config and take advantage of it to define certain properties at the country-level shared across multiple instances, and then specific properties for different sites, staging vs production servers, etc; @bistenes, you are obviously already familiar with the PIH Config… worth keeping what works from than and jettisoning what doesn’t in a new design

  • Definitely interested more about the pluses and minuses of moving metadata that currently are first-order objects, like Locations, Programs, etc, from the DB to in-memory…

Take care, Mark


(Brandon Istenes) #11

Having information stored in two places (e.g., config files and a database) means that they can come out of sync. For metadata this is acceptable, since the database confers performance and scalability benefits over working with CSVs. For config, this is just one more thing that can go wrong. It also introduces the complexity of having to map large configuration spaces into flat K-V stores.


(Brandon Istenes) #12

My concern with hierarchy and overriding is that it would probably make configurations very hard to read and reason about. In any case, this is probably pretty far down the roadmap and we can think about it more later.

Thanks for the links @mogoodrich! I’ll go ahead and request a design call.


(Mike Seaton) #13

@bistenes, am I understanding that your proposal might be to move metadata out of the database? This would be a pretty radical change well beyond standardizing on an improved configuration mechanism, as the entire relational model is built upon foreign keys to metadata tables.

The approach we have used with metadatadeploy is that configuration is defined in code (or config files that can be interpreted by code), and the metadatadeploy module updates the metadata in the database to match it every time the system is started. I would assume we’d continue with this type of a model.

There remains some value in user interfaces that enable configuration of metadata, as these enable relatively quick addition and validation and are more user-friendly to a range of users than raw json or yaml. What we should make more explicit is that these interfaces are intended to be used on a development or test server, and there should be features added to export data from these into a compatible configuration-file format for sharing and version control. Where we have fallen short until now is that we have not provided a good model and set of tools for going from database configuration into configuration files that can be shared and version controlled, and so there are distributions in use in which the full configuration is only accessible on the running production server or backups of it. These administrative UIs could be moved into an entirely separate project. They could even be made agnostic of the underlying OpenMRS instance they are connecting to. This is basically what OCL is doing for Concepts.

I think there is still a conversation to be had as to where the code enabling configuration of various features should live. Should “iniz” be made “awareof” every module, and have the code built into it to configure each as needed? Should modules own their own configuration tools, and depend on “iniz” or implement certain interfaces, use certain annotations, or follow certain patterns to plug into a broader configuration mechanism?

Looking forward to more discussions and a design call around this.

Best, Mike


(Dimitri R) #14

I remember that you and I chatted about this back in 2017, and in the case of AH we settled on having it being standalone. It’s not a big deal (as everything works fine since AH 2.11.0 when this was released) but I kind of regret this decision. Having Iniz to be the one place has proven quite practical in the end. It’s sort of clear that this configuration folder is that one folder processed by Iniz. But again, the fact that the ‘addresshierarchy’ domain is handled by another module (AH then) is not much of a problem, it’s more a consistency issue.

Actually that’s remotely on my to do list as right now Iniz still requires other modules, which is wrong but hasn’t proven problematic enough in our use cases (because Iniz handles Core metadata for the most part, except for patient identifier types from the top of my head.)


(Brandon Istenes) #15

I want to draw a line between “metadata” and “configuration.” If there’s a lot of it (e.g. concepts, AH), then it probably makes sense to store in a DB, and it’s metadata. If there’s not, then there’s no good reason to keep it in a DB, and it’s probably awkward or restricting to represent in the relational model anyway.

Metadatadeploy is fine for PIH because we have a bunch of engineers. I think that OpenMRS more broadly relying on in-code configuration isn’t compatible with a vision of it being the platform of choice for low-resource settings.

The Admin config UIs you describe would indeed be useful. But there’s a lot of hedging around that. Making config elements runtimeMutable would satisfy the quick addition/validation needs. If there are configs that could really use to be generated from a GUI, it would be simpler to generate YAML directly than to find some mapping from in-memory config objects --> relational model --> config files. Implementation-agnostic metadata & config management tools sound like a great idea.

Something Dimitri mentioned in a private thread was that CSV-based configuration is useful for very-non-tech folks, who can then edit config in Excel. This seems like a good reason to have the YAML-CSV polymorphism. This would be a much easier way to allow configuration by non-tech folks than writing and maintaining Admin UIs, for config schemata that are sufficiently simple to admit a reasonable CSV representation.

I had been imagining modules depending on a config library and managing their own schemata. I could certainly be convinced that we should start by building configurability for other modules into Iniz. Perhaps we could roadmap how we then distribute responsibility for configuration back out to the modules themselves.

Actually, building config for other modules into Iniz would allow for config tree resolution and “verticals” prior to having core support for a “config phase” (c.f. my roadmap above). I like this.


(Mike Seaton) #16

Just to highlight this point, metadatadeploy does not necessarily require engineers, it’s just that the initial implementation of most of the deployment bundles have been Java-programming-heavy. There are examples of CSV sources and the module has examples of these in use. That said, I’m not trying to advocate for MetadataDeploy vs. Iniz (though one could argue that we should not have both of these independent initiatives that aim to solve similar things).

Mike


(Brandon Istenes) #17

A few asides before the main thing:

  • After the call yesterday and a good night’s sleep, I’ve become sympathetic to the idea that the config dir should be a subdir of the app data dir. The SDK setup step can create it and symlink it from the working directory into app data. Does that sound right?

  • I think the biggest technical hurdle in proliferating this is getting existing modules to use this new config mechanism. I don’t really understand Spring, but I think what would be necessary for the module to read config is to use a different implementation of the db/...DAO interfaces, which presently (mostly? universally?) use a db/Hibernate...DAO implementation, which is wired up with Spring. Can that be overridden from outside the module, or would we have to add these new implementations to the modules themselves? The new implementation would, for calls having to do with “config” rather than metadata or data, make calls against the config library’s in-memory config store, rather than against Hibernate. Does that sound like a good approach?

  • I might propose a config file tree like

config/
  |--HIV/                     # vertical
  |    |--htmlformentry/      # module
  |    |    |--intakeForm.xml
  |    |    `--config.yml
  |    `--laborderentry.yml
  |--htmlformentry.yml        # overrides
  |--laborderentry.yml
  `--production/              # overrides enabled at command line or by env var
      `--htmlformentry.yml

i.e. where at the top level in the config directory you have vertical packages containing module-namespaced files, module-namespaced files that override those in the verticals, and packages of module-namespaced files that will override both of the former two when enabled. Though maybe these three categories should each have a top-level directory, like

config/
  |--verticals/           # or "packages"
  |   `--hiv/
  |--main/                # or whatever
  |   `--htmlformentry.yml
  `--overrides/           # or "environments"
      `--production/

where order of precedence goes overrides -> main -> verticals. Verticals could be managed through the SDK.


(Brandon Istenes) #18

The main thing, which none of the above points are relevant to (I think):

We should figure out what next steps are (again, apologies for not directing the meeting more efficiently around this goal).

While this is very spiritually aligned with Iniz, I don’t think there’s much of the machinery that Iniz provides that will actually be useful in implementing this. I would like to build toward the API being in alignment with both Iniz and App Framework, so that migration is as seamless as possible for people using those.

So I think the right thing to do is to start a new module. I’d definitely need to talk to Mike and Mark about this, but… I would first build it to replace PIH’s own JSON configuration mechanism, which just configures the PIH module. This would be sort of a proof-of-concept.

Then I’d use it to configure some other module. Open to suggestions about what that should be. I propose Address Hierarchy, as an option.

There’s a lot of places we can go from there, but I think those would be good first steps.


(Darius Jazayeri) #19

Sorry for being very late to chime in, and sorry if this has already been covered. (And I have not actually read your architecture proposal, because I need to dash this off quickly.)

Two thoughts based on the talk thread:

  1. There’s huge value in having config be defined in text files in a separate folder, which can be version-controlled also. Bahmni does this already. Can we align with what they’re doing?

  2. “in-memory” has some advantages but it can be problematic to scale this horizontally to multiple web servers. Although nobody is doing this today, I think we should avoid any architecture design choices that will prevent us from scaling that way in the future. The main mitigation for this is to ensure that any in-memory config is just a cache of something more persistent and/or to ensure there’s a proper ConfigService API that would let us someday also run the config as a separate microservice, e.g. with multiple web servers talking to it.


(Brandon Istenes) #20

Oh interesting note about scaling in-memory across servers. I think Burke said something along these lines on the call but I didn’t fully understand him then. Two straightforward (I think) possibilities:

  1. the configuration directory gets shared across the servers somehow. Since configuration is read-only (with the exception of “mutable” properties, which should not be used in production), there’s no distributed-systems type problems that would need a database to solve

  2. have the config module load YAML strings into the database. This evades the significant technical pain of mapping the expressive space of YAML into SQL by accepting a little ugliness. The database is essentially treated as a dumb store for sharing data between servers.

I’m more in favor of the former, but not sure how I’d implement it (SDK tooling?), and figured I’d throw both ideas out there.


(Mark Goodrich) #21

Spent some time brainstorming about this and thinking about your points from yesterday @bistenes… makes me realize I probably need to spend an extended amount of time to really think things through and pull my thoughts together, but rather than taking longer than I’d like to get that done, here’s a laundry list of thoughts:

  • In your point that starts “I think the biggest technical hurdle in proliferating…” I don’t think that the Hibernate DAO’s really have anything to do with it? I’m thinking there’d likely be a new bean or beans created that would be wired into modules. Each module would have to be updated manually to use this new bean. Beyond the technical issue of trying to override things outside the module, each module tends to do configuration in a different way, part of this project would be about reworking each module to use a standard config format… tedious, but I don’t see a way around it.

  • There’s definitely metadata, like encounter types, visit types, concepts, etc, that will still need to be persisted in the database and that modules would still access via the existing services, and this new architecture would simply be responsible for making sure the appropriate types installed in the DB and/or configured properly

  • You mention using the SDK in a few places… currently at PIH we use the SDK only used to set up a development environment; we don’t use it at all to set up production servers. I believe it does have some features related to production, but I haven’t explored them (just glancing at them now, I’m interested in the docker features @mseaton). Definitely something to consider (the SDK has become a great tool) and I’d support expanding it, but just wanted to clarify since you mentioned the SDK in the context of a production deploy.

  • In perhaps a “file-under-unhelpful” comment, your different config tree examples makes me realize that there are many ways we could config things and nothing strikes me yet as a “best” option… yet it would be easy to come up with one that we look back on as “bad”

  • I’m still not in love with the term “verticals”, though if others like it and there’s no better suggestion, I’m fine with it. What we are really talking about it configuration “bundles”, right? Vertical seems too tied to actual clinical programs. For instance, I could see there being an “HIV” bundle and a “Haiti” bundle installed in the PIH EMR, but those two things are really perpendicular so they both can’t be verticals, right? Or am I getting bogged down in semantics? :slight_smile:

  • It might make sense to distinguish between configuration and content? For instance, in terms of configuration, the EMR-API module needs to be told what encounter type to be used for check-in, which provider to be used a the “unknown provider”, etc (see EmrApiProperties–these all used to be handled via Global Properties, but in EMR-API they’ve been switched to use Metadata Mappings… this migration is a work-in-progress, but I think we seriously want to consider focusing on using Metadata Mapping going forward); In terms of content, an HIV bundle may provide a new HIV intake form and associated concepts and encounter type

  • For configuration, using the EMR-API example, for a fully seamless deploy, EMR-API would need to be bundled, say, with a default encounter type to use for Check-In, a default provider to use as Unknown Provider, etc, and then via configuration the implementer could tell EMR-API not to install that metadata but use existing metadata

  • For content, one big challenge is how to share a form with all the concepts it requires. Metadata Sharing and Html Form Entry actually provide a technical way to do this that works pretty well (full disclosure, I wrote it 9 years ago :slight_smile: )… when you share a form via Metadata Sharing it bundles all the concepts it uses with it, but for various reasons this hasn’t been particularly useful for us–MDS packages are rather opaque as you have noted. Also, this can lead to real smorgsabord of concepts… ie, what if the HIV vertical and the TB vertical provide a different concept for “pregnant”? When importing a package you’d like to be able to either map to an existing concept in your dictionary or use the default one that the package/bundle/vertical provides. This is a bit of a nasty problem, and the solution probably revolves around some combination of trying to use a standard concept dictionary (CIEL) as much as possible, and using Concept Mappings (after I write this I realize that this is a similar problem as with configuration, but I still think there’s a potential distinction between the two)

  • Agree with your point about the “main thing”… we need to find a path that allows us to move forward iteratively (which my brain dump of thoughts here may not help in achieving). At a high level, I think your idea of starting with a new module makes sense, but interested in other’s thoughts. The new module could then use the Iniz module (or perhaps the Metadata Deploy module) as it’s method of installing metadata into the database.

That’s all for now… thanks for working on this!

Take care, Mark


(Darius Jazayeri) #22

Continuing my late replies with only partial context…

These are some far-reaching, interrelated, and delicate changes. So +100 to the idea of doing things incrementally.

Mark’s point that just the XML part of an HTML form is not necessarily enough, because the form itself requires metadata is important. This includes mainly concepts, but also encounter types, and perhaps other things. So a “vertical” may also need to package metadata along with config.

I also like “bundle” more than “vertical” as a name.

Mark already mentioned it, but the metadata mapping module took a stab at part of this, i.e. it allows the EMR API module to

+1 to distinguishing between content and config (even if they may be technically similar). Also +1 to looking at metadata mapping, which already does provide a way for the emrapi module to say “I need to be configured with an encounter type that’s used for check-ins”.


(Brandon Istenes) #23

Okay, so there’s definitely some difficulty around config vs metadata (vs content? I’m a little confused about what the line between the content and metadata is supposed to be, @mogoodrich ).

I’ve been envisioning three separate mechanisms. The config loader loads configuration into memory. Iniz loads non-concept metadata into the database. And something else (eventually OCL) loads concepts into the database.

I had been thinking that users would continue to be responsible for ensuring that all the metadata that their config refers to is present, though all references to metadata would be validated by the config loader.

But actually, it seems like it should be quite feasible for bundles to contain both config and Iniz-style non-concept metadata. And it would seem reasonable for Iniz and config loader to work together to validate that the config and metadata provided work together.

Mekom also uses Iniz for Concepts. Were we not headed toward an OCL world, I might advocate going sort of in that direction, which would allow us to easily bundle concepts as well.

This raises the question: does OCL-for-OpenMRS intersect with this problem? How do we imagine bundling concepts in some way that cooperates with OCL?

I need to go learn more about MDS and Metadata Mapping both.

What I can say about Metadata Mapping at this point is that, though I’ve never worked with it, it seems like a great idea and something that the config loader and Iniz should know how to handle, or maybe even adopt as the official way to name and refer to metadata.

“Bundle” is a fine name :slight_smile:


(Brandon Istenes) #24

Somewhat as an aside, I recently found out about HCL, which seems simply better than YAML, and now advocate using that instead. Hopefully that doesn’t seem wrong to anyone.


(Mark Goodrich) #25

Will follow up on the other points, but re: JSON vs YAML vs HCL, I’d err on the side of something that is more well-known and widely used, which seems to favor JSON, but, of course, that’s not the only factor. I haven’t heard of HCL before (which may not mean much :slight_smile: )… do we know how widely adopted it is?

Take care, Mark