At least one thing that occurs to me is to ensure that these rules ignore (or mostly ignore) retired concepts.
The FSN uniqueness is clear. FSN would rarely be a synonym for something else as the whole point of an FSN is that it uniquely describes the concept.
These rules make sense. I’ve gone through all the validation errors from importing the PIH dictionary into OCL and documented the problems and the fixes. The concept numbers refer to the OCL for OpenMRS staging site.
Of the problems, CIEL has the same conflict and should be updated for these:
- Borderline leprosy (CIEL:147056) vs Borderline tuberculoid leprosy (CIEL:155319). It appears this was cleaned in OCL, but the source should be cleaned too (if it has not been).
- Nitrofurazone (CIEL:80701) vs Burn (CIEL:116543)
- Dyspnea (CIEL:122496) vs Difficulty breathing (CIEL:142373)
I’m curious about “burn” which might be a valid synonym for that med.
Not sure what the issue with the first pair is. Please clarify. Burn was retired for Nitrofurazone. Difficulty breathing concept was retired. Will be in next release.
I don’t see specific conflicts, but the “borderline leprosy” concept does contain synonyms for “borderline tuberculoid leprosy” (which is the second concept).
There are a number of CIEL concepts with fully specified names that are general terms. In some cases, these terms might be used as a synonym in another context, like “Large” or “First”… I think there were some specific examples in the list Ellen worked through.
@akanter If it’s not clear, “Lèpre borderline tuberculoide” (fr) is a synonym on Borderline Leprosy.
Sounds like you already found/cleaned up CIEL and these conflicts. Another strong reason for using OCL instead of forking concept dictionaries – like PIH has done in the past.
While working on importing the PIH dictionary into OCL, we ran into an issue with our validation constraints:
For a dictionary…
- Fully specified names must be unique across all names (except short names and index terms) in a locale
The PIH import contains a retired concept 3779 with the name “Dextrose 5%”. It turns out this is also a name for concept 1866, which prevented 3779 from getting imported. Which brings us to this question:
Should the uniqueness constraint of fully specified names include retired concepts?
Imagine your implementation has been using its own concept for “Reason vaccination was not done” and then you discover that CIEL has its own “Reason vaccination was not done”, but your concept is different enough that you don’t feel you can just map it to CIEL’s concept. So, you decide to retire your concept and import CIEL’s concept. With the currently applied validation rules, you would be prevented from importing the CIEL concept until/unless you renamed your retired concept to something like “Reason vaccination was not done (retired)”.
My initial reaction is that we should only consider non-retired concepts when applying the “unique fully-specified names” constraint. This way, you could import the CIEL concept and would only be forced to rename your concept if you decided to un-retire it later on.
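To make the proposal concrete, here is a minimal Python sketch of the “unique fully-specified names” check that skips retired concepts by default. It is not OCL's or OpenMRS's actual implementation: the `Name` and `Concept` classes, the name-type labels, and the case-insensitive comparison are all assumptions made for illustration, and the second name on concept 1866 is invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

# Name types exempt from the uniqueness rule (labels are assumed, not OCL's).
EXEMPT_TYPES = {"SHORT", "INDEX_TERM"}


@dataclass
class Name:
    text: str
    locale: str
    name_type: str  # "FULLY_SPECIFIED", "SYNONYM", "SHORT" or "INDEX_TERM"


@dataclass
class Concept:
    concept_id: str
    names: list
    retired: bool = False


def find_fsn_conflicts(concepts, include_retired=False):
    """Return (concept_id, other_id, locale, text) for each fully specified
    name that matches a non-exempt name on another concept in that locale."""
    checked = [c for c in concepts if include_retired or not c.retired]

    # Index every non-exempt name by (locale, case-folded text).
    # Case-insensitive matching is an assumption of this sketch.
    owners = defaultdict(set)
    for concept in checked:
        for name in concept.names:
            if name.name_type not in EXEMPT_TYPES:
                owners[(name.locale, name.text.casefold())].add(concept.concept_id)

    conflicts = []
    for concept in checked:
        for name in concept.names:
            if name.name_type != "FULLY_SPECIFIED":
                continue
            key = (name.locale, name.text.casefold())
            for other_id in sorted(owners[key] - {concept.concept_id}):
                conflicts.append((concept.concept_id, other_id, name.locale, name.text))
    return conflicts


# The PIH case from this thread; "Dextrose 5% in water" is a made-up FSN.
pih = [
    Concept("3779", [Name("Dextrose 5%", "en", "FULLY_SPECIFIED")], retired=True),
    Concept("1866", [
        Name("Dextrose 5% in water", "en", "FULLY_SPECIFIED"),
        Name("Dextrose 5%", "en", "SYNONYM"),
    ]),
]
```

With `include_retired=False` the retired concept 3779 is simply left out of the check, so the import would go through; passing `include_retired=True` reproduces the behaviour that blocked it.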
I believe this is the validation logic we actually enforce in OMRS, i.e., we don’t consider either retired names or names of retired concepts in reporting duplicate name exceptions.
I don’t think this is the best practice. Retired concepts should not be included in the dupe name checks.
I agree with Andy. “Retired concepts should not be included in the dupe name checks.”
Ok. So it sounds like we should fix this both in OCL and, per Ian’s comment above, within OpenMRS to ignore retired concept/names when validating uniqueness.
While doing this, we should ensure that the validation is properly checked/imposed when unretiring a name or concept and help the user at that point (e.g., when unretiring a concept/name that introduces a duplicate, let the user edit the violating name(s) during the unretire operation).
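One way the unretire step could work is sketched below in Python. This is only an illustration under stated assumptions: concepts are plain dicts, each name is a `(locale, text, is_fsn)` tuple with short names and index terms already filtered out, and the `renames` argument stands in for whatever the user would type when asked to edit a violating name. None of these names come from OCL or OpenMRS.

```python
def unretire_conflicts(candidate, active_concepts):
    """List (other_id, locale, text) for each candidate name that would break
    fully-specified-name uniqueness against the non-retired concepts."""
    conflicts = []
    for other in active_concepts:
        for loc_a, text_a, fsn_a in candidate["names"]:
            for loc_b, text_b, fsn_b in other["names"]:
                same = loc_a == loc_b and text_a.casefold() == text_b.casefold()
                # Only a clash involving at least one FSN violates the rule.
                if same and (fsn_a or fsn_b):
                    conflicts.append((other["id"], loc_a, text_a))
    return conflicts


def unretire(candidate, active_concepts, renames=None):
    """Unretire `candidate`, applying optional user-supplied renames first.

    `renames` maps (locale, old_text) to new_text. Raises ValueError listing
    the remaining conflicts so the caller can prompt the user to edit them.
    """
    renames = renames or {}
    renamed = dict(candidate)
    renamed["names"] = [
        (loc, renames.get((loc, text), text), is_fsn)
        for loc, text, is_fsn in candidate["names"]
    ]
    conflicts = unretire_conflicts(renamed, active_concepts)
    if conflicts:
        raise ValueError(f"rename required before unretiring: {conflicts}")
    renamed["retired"] = False
    return renamed


# Hypothetical data: a local retired concept and the imported CIEL concept.
local = {"id": "local-1", "retired": True,
         "names": [("en", "Reason vaccination was not done", True)]}
ciel = {"id": "ciel-1", "retired": False,
        "names": [("en", "Reason vaccination was not done", True)]}
```

Calling `unretire(local, [ciel])` raises, while supplying a rename such as `{("en", "Reason vaccination was not done"): "Reason vaccination was not done (local)"}` lets the operation complete in one step.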
From the reactions I probably phrased my comment incorrectly. Retired concepts are not included in duplicate name checks in OMRS.
Some additional validation requirements have come up recently. Specifically:
- All concepts should have a UUID (currently OCL’s external ID)
- Concept UUID should not exceed 38 characters in length
- Concept UUID must be unique (ideally universally… but certainly within any source/collection)
Sound good? Have I missed any other validation rules?
Ideally, we’d also apply these UUID rules to the external_id fields of:
- Concept names
- Concept descriptions
- Concept mappings
As we also leverage those UUIDs.
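The three UUID rules could be checked with something like the following Python sketch. It assumes the records are available as `(resource_id, external_id)` pairs for a single source or collection; the function name and the problem labels are made up for illustration, and the 38-character limit is the one quoted above.

```python
from collections import Counter

MAX_UUID_LENGTH = 38  # limit quoted in this thread


def validate_external_ids(records):
    """Check (resource_id, external_id) pairs from one source/collection.

    Returns a list of (resource_id, problem) tuples, where problem is
    "missing", "too long" or "duplicate".
    """
    records = list(records)
    counts = Counter(ext for _, ext in records if ext)

    problems = []
    for resource_id, ext in records:
        if not ext:
            problems.append((resource_id, "missing"))
            continue
        if len(ext) > MAX_UUID_LENGTH:
            problems.append((resource_id, "too long"))
        if counts[ext] > 1:
            problems.append((resource_id, "duplicate"))
    return problems


# Hypothetical records: one valid, one missing, two sharing a UUID, one too long.
sample = [
    ("c1", "0b7f9a1e-3c2d-4e5f-8a9b-1c2d3e4f5a6b"),
    ("c2", None),
    ("c3", "11111111-1111-1111-1111-111111111111"),
    ("c4", "11111111-1111-1111-1111-111111111111"),
    ("c5", "x" * 39),
]
```

The same function could be run separately over concept names, descriptions and mappings, since the rules listed are identical for each.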
I’m happy with enforcing that where UUIDs exist they should be unique, but I don’t know that we need to enforce a rule that they must have UUIDs. We can always generate (consistent) UUIDs for them on import.
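As a rough sketch of what “consistent” generation might look like, Python's standard `uuid.uuid5` derives the same UUID from the same inputs every time. The namespace URL and the `source/type/id` key format below are hypothetical choices, not anything OCL defines.

```python
import uuid

# Hypothetical fixed namespace; every import must reuse the same value
# for the generated UUIDs to stay stable across runs.
IMPORT_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/hypothetical-ocl-import")


def external_id_for(source, resource_type, resource_id, existing=None):
    """Keep an existing external_id; otherwise derive a deterministic UUID
    from the resource's source, type and id."""
    if existing:
        return existing
    key = f"{source}/{resource_type}/{resource_id}"
    return str(uuid.uuid5(IMPORT_NAMESPACE, key))
```

Because the result depends only on the key, re-importing the same dictionary would produce the same UUIDs rather than a fresh random set each time.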
I’m working through applying the OpenMRS validation rules to CIEL and want to clarify one of them: “Multiple concepts cannot share the same preferred name in the same locale”.
Currently, OCL is enforcing this such that preferred names are unique across all names including (non-preferred) synonyms. I believe we want to allow for concepts to share the same synonym as long as multiple concepts don’t have the same synonym marked as preferred within the same locale.
The problem comes up for concepts in CIEL. For example:
- “Eggs” (#716) as in ova
- “Eggs” (#162171) as in from a chicken (marked as preferred name)
While fully-specified names must be unique across all names, we want to allow duplicates across non-preferred synonyms, correct? Assuming so, I’ll ask the OCL team to refactor the validation rule to only enforce uniqueness of synonyms amongst those marked as preferred.
/cc @ball
I believe that we have a separate rule regarding “Fully specified” names. This is the default concept name in the locale which must be unique. This question relates to the “preferred in locale” flag. When a synonym is selected in the OpenMRS interface, the fully specified name is shown, correct? Eggs => Ova
This should be allowable I think. As long as the database has a uniquely defined FSN for the concept, synonyms do not need to be unique and so preferred synonyms probably don’t need to be unique.
The current OCL search does not seem to be searching synonyms correctly so it is hard for me to know what it would look like in the OCL browser.
Yes. I’ve kept the original post in this thread updated with the list of validation rules. #1 for a dictionary is FSN uniqueness.
I believe so. I don’t know if this is implemented universally, but it certainly could be.
To be clear, synonyms do not need to be unique; however, what I’m proposing is any synonym marked as “preferred in locale” must be unique amongst all preferred names within that locale. For example, two different concepts can have the synonym “Eggs” and either could mark this name as “Preferred in locale”; but, they couldn’t both mark “Eggs” as “Preferred in locale”.
Currently, OCL is forcing synonyms marked as “Preferred in locale” to be unique across all names (like we do for fully-specified names). I’ll ask them to relax this to only enforce uniqueness across names marked as “Preferred in locale” in the same locale. This means both “Eggs (edible)” (#162171) and “Ova” (#716) can legally share the synonym “Eggs” in the same locale, since only the former has it marked as a preferred name.
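The relaxed rule can be sketched in a few lines of Python. This is an illustration of the proposal rather than OCL's code: names are assumed to arrive as `(concept_id, locale, text, locale_preferred)` tuples, and comparing text case-insensitively is an assumption.

```python
from collections import defaultdict


def find_preferred_name_conflicts(names):
    """Return {(locale, folded_text): [concept_ids]} for every name that more
    than one concept marks as preferred in the same locale.

    Non-preferred synonyms are ignored, so they may be shared freely.
    """
    preferred = defaultdict(set)
    for concept_id, locale, text, locale_preferred in names:
        if locale_preferred:
            preferred[(locale, text.casefold())].add(concept_id)
    return {key: sorted(ids) for key, ids in preferred.items() if len(ids) > 1}


# The "Eggs" case from this thread: only #162171 marks it as preferred.
allowed = [
    ("716", "en", "Eggs", False),
    ("162171", "en", "Eggs", True),
]

# A hypothetical violation: both concepts mark "Eggs" as preferred.
rejected = [
    ("716", "en", "Eggs", True),
    ("162171", "en", "Eggs", True),
]
```

Under this check `allowed` yields no conflicts, while `rejected` reports both concept IDs against the `("en", "eggs")` key.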
Hi Burke, has this validation rule already been implemented?
@suruchi, from what I can see, OCL currently only validates that external_id is 36 characters in length or less. I created OCL issue #1340 to improve external_id validation.
FYI – I created a series of SQL queries for @akanter to check OpenMRS concept validation rules on the CIEL dictionary. These may be useful for others who are contemplating migrating their dictionary into OCL.