Way back in 2010, OpenMRS switched boolean concepts from being numeric values to being coded values using the true and false concepts, set using the global properties concept.true and concept.false which store the database id of the concept for true and false. These are defined by default in core to be 1 and 2 respectively. This has usually been a non-issue, since by default we create the concepts 1 and 2as part of core.
Somewhere along the line (concept dictionary history is a little harder to track), we added two new concepts for “Yes” (CIEL:1065) and “No” (CIEL:1066) with “True” and “False” as synonyms. Also somewhere along the line, CIEL added two concepts: “Anemia due to blood loss” (CIEL:1) and “Anemia, hemolysis” (CIEL:2), which have the database ids of 1 and 2 in any distribution that uses the CIEL SQL files as a starting point.
The upshot of this is that we have, for a number years, been storing “false” observations as “Anemia, hemolysis” and true observations as “Anemia due to blood loss”, at least in distributions that have used the full CIEL SQL dump as a basis for their concept dictionary (this does not affect any version of the RefApp, AFAICT). This is sort of fine as long as the values are interpreted contextually in the backend (since the name attached to a concept is basically just some string value which the backend doesn’t care about), but it becomes problematic from the perspective of:
Having coded data in the database refer to the right concepts
Observations returned from the REST API, since Boolean concepts appear to be returned as a standard coded concept via the REST API.
Clearly, it would be great if we could not just resolve this issue, but do so in a way that builds towards what OpenMRS “should” do. I think this comes down to a few questions:
Should we (as a community) align on “special” concepts for boolean in values in core, essentially saying that concept 1 must be the “true” concept and concept 2 must be the false concept? (To function Core does need to have a concept defined for “true” and a concept defined for “false”).
Alternatively, should we align on using CIEL:1065 and CIEL:1066 for boolean concepts? (In this case, we probably need to add these CIEL concepts into Core, which will need some careful handling).
Should we avoid aligning as a community and instead provide a migration path for existing distributions to update their data to use CIEL concepts instead of 1 and 2?
Any other options I have missed.
@burke What points have I missed? I feel like I didn’t capture everything.
Thanks to the Palladium team, and especially @dkibet for discovering this issue.
Generally I think we should avoid trying to make assumptions about assignment of database primary keys in core if we can avoid it.
Using mappings is probably the right approach. I’m not sure about using the CIEL source, or whether it would be better to have some sort of “OpenMRS” terminology source/code/namespace, which could then be associated with these CIEL concepts.
That said, OpenMRS already allows these true/false concepts to be configurable, via the global property you indicated, though I agree GPs are not ideal. This problem could be solved simply by ensuring that the CIEL SQL files have the appropriate insert and/or update statements for setting these global properties appropriately based on what the actual concept ids are in that database.
It seems that a migration path is needed for any implementation that has this bug, regardless of what we decide to do in OpenMRS going forward, right?
Thanks for this topic @ibacher . I don’t have much to add.
An aside about booleans and OpenMRS concepts: For many reasons and for the past few years, Partners In Health has avoided booleans in our concept dictionary. They still exist. The best reason to avoid is that true/false might seen excellent when first used, but sooner or later, users will want the same question to capture “unknown” or “not applicable” or something else.
Yes. In general, it would be nice if core could make as few assumptions about metadata as possible. But given that we have a Boolean obs type, it’s probably reasonable to make some effort to ensure that concepts for this exist.
I actually hadn’t considered that. I was using CIEL:1065 since core itself (AFAIK) only installs these two concepts and CIEL:1065 is actually a concrete concept (I think, in point of fact CIEL itself inherits it from some version of the OpenMRS concept dictionary). If we were going to use a mapping OPENMRS:TRUE and OPENMRS:FALSE or something along those lines seems more sensible to me, since these are pieces of metadata core needs.
Yes. The question is really “where do we want to end up” so we direct these efforts in the the right place.
@ball That’s an excellent point and I agree that the boolean datatype is probably mostly redundant… which is probably partially why this hasn’t come up before.
This is precisely why we changed from using booleans under the hood and, instead, modeled boolean as a coded concept, making the transition from YES/NO to something like YES/NO/UNKNOWN much simpler.
If, instead of GPs pointing to internal IDs, we defined TRUE as “the concept mapped SAME-AS to OPENMRS:TRUE” (and similar for false), then…
we could ship with true & false concepts with these mappings and SAME-AS mappings to CIEL:1065 & CIEL:1066
anyone with an existing dictionary could add the OPENMRS:TRUE & OPENMRS:FALSE mappings to make things work
If CIEL concepts are imported to a system that already contains the out-of-the-box TRUE & FALSE concepts as the first two concepts, then CIEL’s TRUE & FALSE concepts should be skipped, because the SAME-AS mappings would already exist, right?
Not exactly. I committed a change in the last week, due to a bug we encountered in which existing concepts were not getting updated on an initial OCL import, that somewhat modifies this. Before, these OCL concepts would be skipped as duplicates. Now, if you have a concept in your existing system with mapping CIEL:1066, and you import a Concept from OCL that is identified as exactly {..."source": "CIEL", "id": "1066",...}, then the incoming import will match on (and update) your existing concept, whether or not the UUID and external id match.
So…you won’t end up with 2 concepts, but the OCL concept won’t be ignored as a “duplicate” either, but it will be considered the same concept and will update your dictionary accordingly.
Does this mean if you have TRUE and FALSE as concepts 1 and 2 with SAME-AS mapping to CIEL’s 1065 and 1066 and import the CIEL concepts, the CIEL concepts will overwrite your concepts 1 and 2? Will the UUIDs be changed to the CIEL UUIDs?
And would any concepts being imported that have answers (Q-AND-A mappings) to CIEL:1065 and CIEL:1066 get the appropriate answers (local concept 1 and 2) assigned?
I’m pretty sure (though would value someone verifying through tests) that:
Yes, it means the CIEL concepts will overwrite your local concepts if you import them directly (not just as references but the actual CIEL/1065 AND CIEL/1066 concepts.
No, the uuids will be unchanged
Yes, and this would work even if you didn’t import CIEL/1065 AND CIEL/1066 at all, as it just looks for the concept with those same-as mappings.
Ok. Discussed this again on today’s Platform Team call.
We think the way forward is to add mappings to OpenMRS:True and OpenMRS:False into our default dataset and then migrate toward using these mappings and deprecate the existing concept.true and concept.false global properties.
We also discussed a way to fix existing systems that might have coded concepts using concepts 1 & 2 for True and False, while concepts 1 & 2 are something else (not True and False). Initially, I thought it’d be nice to have a liquibase changeset to apply this fix automatically, but after creating Create a SQL query to fix systems where TRUE and FALSE are pointing to concepts 1 and 2, while concepts 1 and 2 are NOT actually TRUE and FALSE (TRUNK-6233), I think it’s probably better not to apply changesets that could alter clinical data “automatically” and, instead, we’d want to help implementations detect this condition and point them to a wiki page that describes how to fix their data.