Talked with @jamlung about this issue this week; we’re hoping Joe + @paynejd + @burke + @sny can identify a solution in their usual meeting together this Friday. Here are the details:
There’s an error, usually a blocker, that folks like @michaelbontyes and I keep running into: the children of Coded-type concepts often don’t import successfully from OMRS into OCL via CSV Bulk Import, and often don’t import successfully from OCL into OpenMRS. Michael and I identified the cause today: the External ID requirement on the Name field. When this is blank, problems arise, as described below:
1) OCL Workflow: From OCL to OpenMRS
Steps:
Have a Coded-type concept in your collection, with children (e.g. a Question concept with Answers mapped)
Try subscribing to your collection
See the following error message: "Cannot save mapping XXX [CAUSE]: Column ‘uuid’ cannot be null"
Tested and reproduced with: Question, Coded. I need to test with the other Set-types like LabSet, MedSet, and ConvSet.
Mappings with external IDs that are bulk imported through CSV work, and the answers to coded questions also come through in OpenMRS, so long as you add a UUID for the question names. So @michaelbontyes will use that workaround until generation through the UI is solved.
Solution Idea
Burke’s previous post on a related topic about external IDs (quoted below) makes me think we do want to continue enforcing non-null values of external_id. If this is the case, could OCL auto-generate a UUID for names, so we stop accidentally running into this blank name ID problem?
This is also solvable within the OpenMRS OCL module, which would have the benefit of working for older, currently-broken dictionaries. Although, that’s less useful if a UUID is later added to the name in OCL, since it may result in duplicate names.
A feature was added to OCL over the last year or so to enable auto-generation of both identifiers and external IDs using either a “sequential” or “uuid” scheme. From what I recall, this can be configured on both Concepts and Mappings (e.g. set members, answers, reference terms).
It would seem to make sense to support the same thing for names - either by applying the same scheme chosen for the Concept “AutoID Concept External ID” setting or by having explicit “AutoID Concept Name ID” and “AutoID Concept Name External ID” settings.
There is one more scheme that is used by Initializer in cases where it needs to identify whether a concept name has changed vs. when a different concept name is added, which is to have a pseudo-uuid (a uuid that is generated predictably based on the combination of name, locale, and name-type, so if these 3 match between 2 names within a concept then the uuid will match). Not sure if we want to support that here as an option or not… Code for that can be seen here. @mksd FYI.
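To illustrate the idea of a predictable pseudo-uuid, here is a minimal Python sketch. It is not the Initializer code: the function name, the separator, and the choice of `uuid.NAMESPACE_URL` as the namespace are all assumptions made for illustration. It only shows the property described above, that two names with the same name, locale, and name-type get the same uuid.

```python
import uuid

# Assumption: a stand-in namespace for this sketch, not the one Initializer uses.
NAME_NAMESPACE = uuid.NAMESPACE_URL


def pseudo_uuid_for_name(name: str, locale: str, name_type: str) -> str:
    """Return a UUID that depends only on name, locale, and name type."""
    # Join with a separator that should not occur in the fields, so that
    # ("ab", "c", ...) and ("a", "bc", ...) cannot produce the same key.
    key = "\x1f".join([name, locale, name_type])
    # uuid5 is a name-based (SHA-1) UUID: same inputs always give the same output.
    return str(uuid.uuid5(NAME_NAMESPACE, key))


first = pseudo_uuid_for_name("Peer Counselor", "en", "FULLY_SPECIFIED")
second = pseudo_uuid_for_name("Peer Counselor", "en", "FULLY_SPECIFIED")
other = pseudo_uuid_for_name("Peer Counselor", "fr", "FULLY_SPECIFIED")
```

Here `first` and `second` are identical, while `other` differs because the locale changed, which is how a changed name can be told apart from a newly added one.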
I think we have finally managed to “identify” the problem.
For this discussion, it’s important to understand use of the terms “UUID” and “External ID”:
UUID: Universally Unique Identifier. OpenMRS uses UUID to uniquely identify resources. Ideally, these would all be valid Version 4 Universally Unique Identifiers; however, for historic reasons, there’s a lot of CIEL content that uses non-standard UUIDs.
External ID: A property OCL uses to store an externally-managed identifier.
Since OCL doesn’t (currently) have a “UUID” property that can be externally managed, we are using OCL’s External ID for storing OpenMRS UUIDs.
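For anyone wanting to check which of their identifiers are valid Version 4 UUIDs and which are not, a small Python sketch along these lines may help. The helper name `is_version4_uuid` and the non-standard sample string are hypothetical; only the standard library `uuid` module is used.

```python
import uuid


def is_version4_uuid(value) -> bool:
    """True only for canonical, hyphenated Version 4 UUID strings."""
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        # Wrong length, non-hex characters, or not a string at all.
        return False
    # .version is None unless the RFC 4122 variant bits are set.
    return parsed.version == 4 and str(parsed) == value.lower()


# A freshly generated Version 4 UUID passes the check.
fresh_ok = is_version4_uuid(str(uuid.uuid4()))

# Hypothetical non-standard identifier: 36 characters, no hyphens.
non_standard_ok = is_version4_uuid("123" + "A" * 33)
```

In this sketch `fresh_ok` is `True` and `non_standard_ok` is `False`. Note that a name-based (Version 5) UUID is also rejected, since the check is specifically for Version 4.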
It appears that novel (new) content created in OCL is not getting auto-assigned external IDs.
Could you try manually populating these external ID fields with UUIDs for content you’ve created within OCL? You can generate Version 4 UUIDs for this purpose using www.uuidgenerator.net. Once you’ve assigned UUIDs to some/all of them, could you confirm whether it solves your problem?
Assuming this solves your problem, there will be two phases to a solution:
Short term: we can introduce auto-assignment of external ID on OCL for concept names as we have done for mappings.
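If there are many names to fix, the UUIDs can also be generated locally rather than one at a time on a website. The following Python sketch assumes each concept name is represented as a dict with an `external_id` key; that shape and the helper name are assumptions for illustration, not OCL’s export format.

```python
import uuid


def fill_blank_external_ids(names):
    """Give every name dict with a blank external_id a fresh Version 4 UUID."""
    for name in names:
        current = name.get("external_id") or ""
        if not current.strip():
            # uuid4() generates a random (Version 4) UUID.
            name["external_id"] = str(uuid.uuid4())
    return names


# Hypothetical sample data: one blank, one missing, one already populated.
sample_names = [
    {"name": "Peer Counselor", "locale": "en", "external_id": ""},
    {"name": "Pair conseiller", "locale": "fr"},
    {"name": "Existing", "locale": "en", "external_id": "keep-me"},
]
fill_blank_external_ids(sample_names)
```

Existing values are left untouched, so IDs that were already assigned are respected.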
Long term: The OCL team is considering introducing support for a true “UUID” property for OCL content (one that would respect existing UUIDs and auto-generate a Version 4 UUID for new content), so we wouldn’t need to use “External ID” for OpenMRS UUIDs and issues like this would be resolved.
One thing brought up on the squad call was whether names should have standard external IDs (not just a UUID). It is possible that that UUID could replace the need for an external ID, but I draw people’s attention to the OpenMRS, SNOMED and UMLS strategies for dealing with names of concepts.
The use case is that I want to be sure that the name I select for the concept is the name which is shown to me when I look at the instance data again. For example, I entered Lou Gehrig’s disease. It is one of the names on the concept Amyotrophic Lateral Sclerosis. When I reopen the diagnosis list, I want to see Lou Gehrig’s disease, not ALS.
Also, there is a rarer use case: sometimes names get put on a concept and then need to be moved to another concept. Instance data captured in OpenMRS collects both the concept_name_id and the concept_id. We mostly use concept_id, but there is code for the name.
I don’t think OCL handles external IDs for names. SNOMED has a description ID for the name in addition to the concept ID for the concept. UMLS has three IDs (probably overkill): concept (CUI), lexical (LUI), and string (SUI). I believe LUI does not care about character sets, casing, etc., but SUI is like an ASCII hash which includes all the formatting.
I think we need to capture the name ID (and the definition ID as Ellen mentioned), particularly if we are going to host SNOMED.
We already have such a scheme in the OCL module. The key component to interoperating with OCL, I think, is ensuring that we appropriately use OCL’s URL scheme to generate the predictable UUID, which is what we do for sources, map types, concepts and mappings that we encounter without a defined external_id.
There’s a modest rationale for those: they all have unique identifiers from OCL to ensure we’re operating on the same concept. For example, to pick on a non-CIEL concept which would have this issue, MSF defines (in OCL) a concept for a “Peer Counselor”, which has the internal OCL ID of orgs/MSF/sources/MSF/concepts/999/. Since this concept has no UUID assigned, it would have the generated UUID of 4f9a4b44-3d3b-5702-b6c9-57f954d9e34e (an advantage of the generator we have in the OCL module over the one in Iniz is that the OCL one actually generates a UUID according to RFC 4122, whereas the ones generated by Iniz are not exactly spec-compliant).
Names and descriptions in OCL are sub-properties of a concept and OCL assigns them an internal UID, so we could use something like the scheme orgs/MSF/sources/MSF/concepts/999/name/6793451 for the FSN of the “Peer Counselor” concept in English, assuming that OCL’s UID for that concept remains stable across versions. That would give us a stable way of referring to the “same” concept name across different versions.
This is what I meant by saying that this is solvable via the OCL module.
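As a rough illustration of deriving a predictable UUID from an OCL path, here is a Python sketch using a name-based (Version 5) UUID from RFC 4122. The namespace the OCL module actually uses is not stated here, so `uuid.NAMESPACE_URL` is a stand-in assumption and the output should not be expected to match the UUID quoted above; the function name is hypothetical too.

```python
import uuid


def uuid_from_ocl_path(ocl_path: str) -> str:
    """Derive a repeatable RFC 4122 Version 5 UUID from an OCL path."""
    # Assumption: NAMESPACE_URL stands in for whatever namespace the
    # OCL module really uses, so values will differ from the module's.
    return str(uuid.uuid5(uuid.NAMESPACE_URL, ocl_path))


# Paths taken from the example above.
concept_path = "orgs/MSF/sources/MSF/concepts/999/"
name_path = "orgs/MSF/sources/MSF/concepts/999/name/6793451"

concept_uuid = uuid_from_ocl_path(concept_path)
name_uuid = uuid_from_ocl_path(name_path)
```

Because the UUID depends only on the path, the same concept name would get the same UUID on every import, as long as the path itself stays stable.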
The Concept External ID is the UUID that goes on the concept, not on the name itself. That’s a separate thing, so unfortunately we haven’t implemented anything quite yet.
I did want to note that we are having this discussion in OCL community sessions to do a holistic approach and tackle this the “right” way that incorporates this better into OCL’s data model.
It still feels to me like a super quick UI change (basically what Grace was thinking) would at least provide a temporary fix on the OCL side that shouldn’t conflict with the “right” approach that we’re taking in the future. We could just add a way for UUIDs to automatically populate on concept names in the External ID field, which could be selected when creating/editing a source. We might also need a way to fix existing concepts, since this only helps with newly created/edited concepts. Here’s an image of what I’m thinking: