Defining our concept validation rules for OCL

In our effort to realize the potential of Open Concept Lab (OCL) for the OpenMRS community, the OCL team has asked us to confirm the validation rules that should be used for concepts. OCL provides a “custom validation schema” feature that allows the OCL API to perform extra validation checks on concepts. This discussion is to review and confirm the rules defined in OCL’s OpenMRS Concept Validator:

For any concept…

  1. Must not have more than one preferred name per locale
  2. All names (except short names) must be unique within the concept
  3. Must not have more than one short name per locale
  4. Short name must not be marked as locale preferred
  5. Only one fully specified name per locale
  6. At least one fully specified name (across all locales)
  7. Valid values for class, data type, name type, and locale
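
To make the concept-level rules concrete, here is a rough Python sketch of rules 1 through 6. It is not OCL's actual validator: the `Name` class, the name-type strings, and the choice to compare names by exact text within a locale are all assumptions made for illustration. Rule 7 is left out because it depends on the lookup values configured for a given source.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical name-type labels; the real OCL/OpenMRS values may differ.
FULLY_SPECIFIED = "FULLY_SPECIFIED"
SHORT = "SHORT"


@dataclass
class Name:
    text: str
    locale: str
    name_type: str = "SYNONYM"
    locale_preferred: bool = False


def _over_one(keys):
    """Return the keys that occur more than once, sorted."""
    return sorted(k for k, n in Counter(keys).items() if n > 1)


def validate_concept(names):
    """Return a list of violations of concept-level rules 1-6."""
    errors = []
    for loc in _over_one(n.locale for n in names if n.locale_preferred):
        errors.append(f"rule 1: more than one preferred name in '{loc}'")
    # Assumption: uniqueness is checked per locale on the exact text.
    for loc, text in _over_one((n.locale, n.text) for n in names if n.name_type != SHORT):
        errors.append(f"rule 2: duplicate name '{text}' in '{loc}'")
    for loc in _over_one(n.locale for n in names if n.name_type == SHORT):
        errors.append(f"rule 3: more than one short name in '{loc}'")
    if any(n.name_type == SHORT and n.locale_preferred for n in names):
        errors.append("rule 4: short name marked as locale preferred")
    for loc in _over_one(n.locale for n in names if n.name_type == FULLY_SPECIFIED):
        errors.append(f"rule 5: more than one fully specified name in '{loc}'")
    if not any(n.name_type == FULLY_SPECIFIED for n in names):
        errors.append("rule 6: no fully specified name")
    return errors
```

Calling `validate_concept` with the list of names for one concept returns an empty list when all six rules pass.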

For a dictionary…

  1. Fully specified names must be unique across all fully specified names and preferred names (synonyms marked as preferred name) within a locale*
  2. Multiple concepts cannot share the same preferred name in the same locale
  3. All concepts should have a unique external_id (OpenMRS UUID) with length of 36 chars (per UUID specification)
  4. All concept names & descriptions should have a unique external_id (OpenMRS UUID) with length of 36 chars (per UUID specification)*

*Not yet implemented in OCL validation
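
As a rough illustration of dictionary rule 1, the sketch below flags any fully specified name that collides with a fully specified or preferred name of another concept in the same locale. The data layout (a dict of concept id to name dicts with `text`, `locale`, `type`, and `preferred` keys) and the exact-text comparison are assumptions for the example, not how OCL stores or compares names.

```python
from collections import defaultdict


def fsn_conflicts(concepts):
    """Return (locale, text, concept_ids) for each FSN shared with another
    concept's FSN or preferred name in the same locale.

    concepts: dict of concept id -> list of name dicts (hypothetical layout).
    """
    owners = defaultdict(set)  # (locale, text) -> concepts holding it as FSN or preferred
    fsn_keys = set()
    for concept_id, names in concepts.items():
        for name in names:
            key = (name["locale"], name["text"])
            is_fsn = name["type"] == "FULLY_SPECIFIED"
            if is_fsn or name["preferred"]:
                owners[key].add(concept_id)
            if is_fsn:
                fsn_keys.add(key)
    return sorted(
        (locale, text, sorted(owners[(locale, text)]))
        for locale, text in fsn_keys
        if len(owners[(locale, text)]) > 1
    )
```

Non-preferred synonyms are deliberately ignored here, so two concepts can still share an ordinary synonym without tripping this check.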

@ball and any other concept dictionary owners, can you please review the rules above and confirm they meet your expectations for rules about OpenMRS concepts?

I’ve reviewed these closely and they seem to be valid from my perspective. I’d be happy to clarify any of the above rules if folks have questions.

@ball, we should discuss this rule with @akanter. It makes sense that a truly fully specified name would not be appropriate as a synonym for another concept; however, there are some fully specified names (even in CIEL) that may be useful synonyms in another context – e.g., Absent.

At least one thing that occurs to me is to ensure that these rules ignore (or mostly ignore) retired concepts.

The FSN uniqueness is clear. FSN would rarely be a synonym for something else as the whole point of an FSN is that it uniquely describes the concept.

These rules make sense. I’ve gone through all the validation errors from importing the PIH dictionary into OCL and documented the problems and the fixes. The concept numbers refer to this OCL for OpenMRS staging site.

Of the problems, CIEL has the same conflict and should be updated for these:

I’m curious about “burn” which might be a valid synonym for that med.

FYI @akanter @burke

Not sure what the issue with the first pair is. Please clarify. Burn was retired for Nitrofurazone. Difficulty breathing concept was retired. Will be in next release.

I don’t see specific conflicts, but the “borderline leprosy” concept does contain synonyms for “borderline tuberculoid leprosy” (which is the second concept).

There are a number of CIEL concepts with fully specified names that are general terms. In some cases, these terms might be used as a synonym in another context, like “Large” or “First”… I think there were some specific examples in the list Ellen worked through.

@akanter If it’s not clear, “Lèpre borderline tuberculoide” (fr) is a synonym on Borderline Leprosy.

Sounds like you already found/cleaned up CIEL and these conflicts. Another strong reason for using OCL instead of forking concept dictionaries – like PIH has done in the past.

While working on importing the PIH dictionary into OCL, we ran into an issue with our validation constraints:

For a dictionary…

  1. Fully specified names must be unique across all names (except short names and index terms) in a locale

The PIH import contains a retired concept 3779 with the name “Dextrose 5%”. It turns out this is also a name for concept 1866, which prevented 3779 from getting imported. Which brings us to this question:

Should the uniqueness constraint of fully specified names include retired concepts?

Imagine your implementation has been using its own concept for “Reason vaccination was not done” and then you discovered CIEL has its own Reason vaccination was not done, but your concept is different enough that you don’t feel you can just map it to CIEL’s concept. So, you decide to retire your concept and import CIEL’s concept. With the currently applied validation rules, you would be prevented from importing the CIEL concept until/unless you renamed your retired concept to something like “Reason for vaccination was not done (retired)”.

My initial reaction is that we should only consider non-retired concepts when applying the “unique fully-specified names” constraint. This way, you could import the CIEL concept and would only be forced to rename your concept if you decided to un-retire it later on.

@akanter @ball thoughts?

I believe this is the validation logic we actually enforce in OMRS, i.e., we don’t consider either retired names or names of retired concepts in reporting duplicate name exceptions.

I don’t think this is the best practice. Retired concepts should not be included in the dupe name checks.

I agree with Andy. “Retired concepts should not be included in the dupe name checks.”

Ok. So it sounds like we should fix this both in OCL and, per Ian’s comment above, within OpenMRS to ignore retired concept/names when validating uniqueness.

While doing this, we should ensure that the validation is properly checked/imposed when unretiring a name or concept and help the user at that point (e.g., when unretiring a concept/name that introduces a duplicate, let the user edit the violating name(s) during the unretire operation).
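
A minimal sketch of what "ignore retired concepts" could look like, using the Dextrose 5% case from above. The dict layout and the `include_retired` flag are hypothetical, and 1866 is assumed to be active. The same function run with `include_retired=True` shows what a check at unretire time would need to catch.

```python
def duplicate_names(concepts, include_retired=False):
    """Return (name, first_concept_id, other_concept_id) for each name
    used by more than one concept, skipping retired concepts by default."""
    seen = {}  # name -> id of the first concept that uses it
    dupes = []
    for concept in concepts:
        if concept["retired"] and not include_retired:
            continue
        for name in concept["names"]:
            first = seen.setdefault(name, concept["id"])
            if first != concept["id"]:
                dupes.append((name, first, concept["id"]))
    return dupes


# Illustrative data only; 1866 is assumed to be an active concept.
concepts = [
    {"id": 1866, "retired": False, "names": ["Dextrose 5%"]},
    {"id": 3779, "retired": True, "names": ["Dextrose 5%"]},
]

print(duplicate_names(concepts))                        # retired concept ignored
print(duplicate_names(concepts, include_retired=True))  # what unretiring 3779 would hit
```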

From the reactions I probably phrased my comment incorrectly. Retired concepts are not included in duplicate name checks in OMRS.

Some additional validation requirements have come up recently. Specifically:

  • All concepts should have a UUID (currently OCL’s external ID)
  • Concept UUID should not exceed 38 characters in length
  • Concept UUID must be unique (ideally universally… but certainly within any source/collection)

Sound good? Have I missed any other validation rules?

Ideally, we’d also apply these UUID rules to the external_id fields of:

  • Concept names
  • Concept descriptions
  • Concept mappings

As we also leverage those UUIDs.
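
For discussion, the UUID rules above could be checked along these lines. This is only a sketch: the `(label, external_id)` tuple layout and the `require` toggle are made up for the example, and the 38-character limit is simply the number quoted in the list above. The same function would apply to names, descriptions, and mappings if their external_id values were passed in.

```python
from collections import Counter

MAX_UUID_LENGTH = 38  # limit quoted in this thread


def check_external_ids(items, require=True):
    """Return a list of problems for (label, external_id) pairs.

    items: list of (label, external_id or None); hypothetical layout.
    require: if False, a missing external_id is not reported.
    """
    problems = []
    counts = Counter(eid for _, eid in items if eid)
    for label, eid in items:
        if not eid:
            if require:
                problems.append(f"{label}: missing external_id")
        elif len(eid) > MAX_UUID_LENGTH:
            problems.append(f"{label}: external_id longer than {MAX_UUID_LENGTH} characters")
        elif counts[eid] > 1:
            problems.append(f"{label}: external_id {eid} is not unique")
    return problems
```

Uniqueness here is only checked within the list passed in, which matches the "within any source/collection" reading rather than true universal uniqueness.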

I’m happy with enforcing that where UUIDs exist they should be unique, but I don’t know that we need to enforce a rule that they must have UUIDs. We can always generate (consistent) UUIDs for them on import.

@akanter,

I’m working through applying the OpenMRS validation rules to CIEL and want to clarify one of them. A question came up on applying the validation rule: "Multiple concepts cannot share the same preferred name in the same locale".

Currently, OCL is enforcing this such that preferred names are unique across all names including (non-preferred) synonyms. I believe we want to allow for concepts to share the same synonym as long as multiple concepts don’t have the same synonym marked as preferred within the same locale.

The problem comes up for concepts in CIEL. For example:

While fully-specified names must be unique across all names, we want to allow duplicates across non-preferred synonyms, correct? Assuming so, I’ll ask the OCL team to refactor the validation rule to only enforce uniqueness of synonyms amongst those marked as preferred.

/cc @ball

I believe that we have a separate rule regarding “Fully specified” names. This is the default concept name in the locale which must be unique. This question relates to the “preferred in locale” flag. When a synonym is selected in the OpenMRS interface, the fully specified name is shown, correct? Eggs => Ova

This should be allowable I think. As long as the database has a uniquely defined FSN for the concept, synonyms do not need to be unique and so preferred synonyms probably don’t need to be unique.

The current OCL search does not seem to be searching synonyms correctly so it is hard for me to know what it would look like in the OCL browser.

Yes. I’ve kept the original post in this thread updated with the list of validation rules. #1 for a dictionary is FSN uniqueness.

I believe so. I don’t know if this is implemented universally, but it certainly could be.

To be clear, synonyms do not need to be unique; however, what I’m proposing is any synonym marked as “preferred in locale” must be unique amongst all preferred names within that locale. For example, two different concepts can have the synonym “Eggs” and either could mark this name as “Preferred in locale”; but, they couldn’t both mark “Eggs” as “Preferred in locale”.

Currently, OCL is forcing synonyms marked as “Preferred in locale” to be unique across all names (like we do for fully-specified names). I’ll ask them to relax this to only enforce uniqueness across names marked as “Preferred in locale” in the same locale. This means both “Eggs (edible)” (#162171) and “Ova” (#716) can legally share the same synonym of “Eggs” in the same locale, since only the former has it marked as a preferred name.
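
Roughly, the relaxed rule could be expressed like this. It is a sketch of the proposal, not OCL's code: the tuple layout is hypothetical and the "en" locale is assumed for the example. Only names flagged as preferred are compared, so the shared “Eggs” synonym passes.

```python
from collections import defaultdict


def preferred_name_conflicts(names):
    """Return {(locale, text): [concept ids]} for names marked preferred
    by more than one concept in the same locale.

    names: iterable of (concept_id, locale, text, preferred); hypothetical layout.
    """
    owners = defaultdict(set)
    for concept_id, locale, text, preferred in names:
        if preferred:  # non-preferred synonyms are never compared
            owners[(locale, text)].add(concept_id)
    return {key: sorted(ids) for key, ids in owners.items() if len(ids) > 1}


# "Eggs (edible)" (#162171) marks "Eggs" as preferred; "Ova" (#716) does not.
names = [
    (162171, "en", "Eggs", True),
    (716, "en", "Eggs", False),
]
print(preferred_name_conflicts(names))  # no conflict under the relaxed rule
```

If #716 also marked “Eggs” as preferred in the same locale, the function would report both concept ids for that name.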

Hi Burke, has this validation rule already been implemented?