Based in part on this conversation, we need a strategy & consensus on how we identify concepts within OCL and as we pull them into OpenMRS. Up until the PIH dictionary import, we’ve been using the OpenMRS Concept ID when importing into OCL and storing concept UUIDs as an “external identifier” within OCL.
I’ve tried to list some of the key questions and problems we’re facing, along with potential solutions, below. I would love to hear others’ opinions on these issues, or on any issues I’ve overlooked.
For the sake of discussion, here are some definitions:
Term | Description |
---|---|
Concept ID | Internal identifiers of concepts in OpenMRS. Have also been used as Code, but that gets tricky when using multiple servers, since these internal identifiers can vary across multiple servers. |
UUID | True Version 4 (random) UUIDs allow creation of universally unique identifiers without a single authority. They are difficult for humans to use (directly editing form or report definitions with UUIDs is painful). |
Code | The “official” identifier of a concept within a terminology or dictionary (sometimes called a “Gold” Concept ID) – i.e., the code you would use if mapping to the concept like LOINC’s 14682-9, CIEL’s 790, or PIH’s 790. In OpenMRS, we have been using Concept ID as a Code; however, this becomes harder when you grow beyond a single server. Some implementations (e.g., PIH) have used SAME-AS mappings to declare a Code for their concepts. |
The requirements
- Each concept in a dictionary within OCL needs a Code so it can be mapped or referenced (from a form, report, module, or from other dictionaries).
- Implementers need to be able to clone or import content from other dictionaries into their own.
- Implementers need to be able to use CIEL concepts within their local dictionary.
How should we declare the Code for concepts?
Burke has long advocated for adding Code directly to concepts (e.g., `concept.code` in the database) as a unique Code to be separated from Concept ID (the database’s internal id).
Existing workarounds for managing an official Code for concepts when implementations grow beyond a single server:
- Use a “Gold” OpenMRS server to manage the official dictionary, where the Concept ID in this server is considered the official Code.
- Create a SAME-AS mapping to your own dictionary – e.g., a “Gold” mapping – to declare the Code for a concept by effectively mapping it to itself.
- Maintain a list of official concepts separately (e.g., in a spreadsheet) and use custom scripting or the Initializer module to update concepts.
Proposal: Add `concept.code` to our data model and refactor any code checking mappings to treat `concept.code` as an implicit “gold” mapping (equivalent to having a SAME-AS mapping to the implementation’s dictionary). In the meantime, use gold mappings (i.e., only allow one SAME-AS mapping to the implementation’s dictionary and treat this as the “official” Code for the concept).
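To make the proposal concrete, here is a minimal sketch of how a lookup might resolve a concept’s official Code: prefer `concept.code` when it is set, otherwise fall back to the single gold SAME-AS mapping. The `Concept` and `Mapping` classes and the `official_code` function are hypothetical names for illustration only, not part of the OpenMRS or OCL APIs.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Mapping:
    map_type: str  # e.g. "SAME-AS"
    source: str    # e.g. "PIH" or "CIEL"
    code: str


@dataclass
class Concept:
    concept_id: int              # internal database id, may differ per server
    code: Optional[str] = None   # hypothetical concept.code column
    mappings: List[Mapping] = field(default_factory=list)


def official_code(concept: Concept, own_source: str) -> Optional[str]:
    """Return the official Code of a concept within the implementation's dictionary."""
    # Proposed behaviour: concept.code acts as an implicit gold mapping.
    if concept.code is not None:
        return concept.code
    # Interim behaviour: the single SAME-AS mapping to our own dictionary.
    gold = [
        m.code
        for m in concept.mappings
        if m.map_type == "SAME-AS" and m.source == own_source
    ]
    if len(gold) > 1:
        raise ValueError("only one gold SAME-AS mapping is allowed per concept")
    return gold[0] if gold else None
```

With this shape, code that currently checks mappings would keep working during the transition, since a concept with only a gold mapping and a concept with `concept.code` set resolve the same way.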
If you import concepts from different sources into your dictionary, what should happen if two or more of the concepts being imported have the same code, given that codes in a dictionary must be unique?
One could argue that concept IDs don’t matter and we can just use UUIDs to reference concepts; however, UUIDs are not human-friendly. There’s a reason standard terminologies use human-friendly codes instead of UUIDs to identify their terms. When we first introduced UUIDs in OpenMRS, implementations tried using them everywhere and found that mappings (using source + Code) were far preferable in situations where humans needed to work with them.
Proposal: if you are importing concepts with a conflicting Code, you must provide a new, unique Code for your dictionary before the concept can be imported.
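As a rough sketch of that rule, an import step could refuse a conflicting Code unless the caller supplies a replacement that is itself unused. The dictionary is modelled here as a plain `dict` keyed by Code, and `import_concept` and `CodeConflict` are hypothetical names for illustration, not an existing OCL or OpenMRS API.

```python
from typing import Dict, Optional


class CodeConflict(Exception):
    """Raised when an imported concept's Code is already used in the dictionary."""


def import_concept(
    dictionary: Dict[str, dict],
    code: str,
    concept: dict,
    new_code: Optional[str] = None,
) -> str:
    """Add a concept under a unique Code and return the Code that was used."""
    final_code = code
    if code in dictionary:
        # Conflict: the importer must supply a new Code before import proceeds.
        if new_code is None:
            raise CodeConflict(f"Code {code!r} already exists; provide a new Code")
        if new_code in dictionary:
            raise CodeConflict(f"replacement Code {new_code!r} is also in use")
        final_code = new_code
    dictionary[final_code] = concept
    return final_code
```

The import fails loudly rather than silently renumbering, so the person importing decides what the new Code should be.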
What should the UUID be when a concept is cloned? For example, if PIH is using a CIEL concept with a few non-breaking changes, can they still refer to it via the same UUID or should it get a new UUID?
Technically speaking, all concepts should have universally unique UUIDs. In practice, when implementations use a CIEL concept, they have copied the concept into their system and may continue to refer to it by CIEL’s UUID even if they make some non-breaking adjustments to the concept locally.
Proposal: All concepts should have a unique UUID. If an implementation is using a CIEL concept, they can use its UUID; however, if they are going to make any change to the concept, then that altered concept (even if only non-breaking changes) should have its own UUID. Any references that want to refer to either the CIEL concept or a locally modified version of that concept should use a mapping (not UUID).
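One way to picture this proposal is a clone operation that always mints a fresh UUID for an altered copy and records a mapping back to the original, so references can be resolved by source + Code rather than by UUID. This is only a sketch: the `Concept` class and both functions are hypothetical, and using SAME-AS as the map type for a locally modified copy is an assumption made for the example.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple
from uuid import uuid4


@dataclass(frozen=True)
class Concept:
    uuid: str
    source: str   # dictionary that owns this concept, e.g. "CIEL"
    code: str
    name: str
    mappings: Tuple[Tuple[str, str, str], ...] = ()  # (map_type, source, code)


def clone_with_changes(original: Concept, own_source: str, own_code: str, **changes) -> Concept:
    """Copy a concept into our dictionary with local edits and a new UUID."""
    back_reference = ("SAME-AS", original.source, original.code)
    return replace(
        original,
        uuid=str(uuid4()),  # an altered copy never reuses the original UUID
        source=own_source,
        code=own_code,
        mappings=original.mappings + (back_reference,),
        **changes,          # other fields only, e.g. name="..."
    )


def find_by_mapping(concepts: List[Concept], source: str, code: str) -> List[Concept]:
    """Return the original concept and any local versions mapped to it."""
    return [
        c
        for c in concepts
        if (c.source, c.code) == (source, code)
        or any(m[1:] == (source, code) for m in c.mappings)
    ]
```

A form or report that looks up CIEL 790 through `find_by_mapping` would find both the unmodified CIEL concept and the locally altered version, even though their UUIDs differ.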
Do codes need to be numeric? Concept IDs are integers, but OCL doesn’t require a code to be a number and many terminologies (e.g., LOINC, ICD, etc.) use codes that aren’t just whole numbers.
In general, Code does not need to be numeric. Both OCL IDs and OpenMRS mapping codes allow for non-numeric values. The only constraint is in cases where Concept ID is being used as the Code (which is the default practice for implementations with a single server).
Proposal: Add `concept.code` to our data model and refactor any code checking mappings to treat `concept.code` as an implicit “gold” mapping (equivalent to having a SAME-AS mapping to the implementation’s dictionary). In the meantime, use gold mappings (i.e., only allow one SAME-AS mapping to the implementation’s dictionary and treat this as the “official” Code for the concept).