Duplicate Concept names in CIEL

We are in the process of testing the import of the CIEL dictionary from OCL into OpenMRS and we see a lot of validation errors like:

Failed with 'Cannot save concept with UUID 28AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA' caused by: org.openmrs.api.DuplicateConceptNameException: 'hepatitis viral tipo C' is a duplicate name in locale 'es'
	at org.openmrs.validator.ConceptValidator.validate(ConceptValidator.java:190)
	at org.openmrs.api.impl.ConceptServiceImpl.saveConcept(ConceptServiceImpl.java:296)
	at sun.reflect.GeneratedMethodAccessor1046.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

More than 1000 concepts fail validation this way.

Concept 28 has a Spanish synonym set to ‘hepatitis viral tipo C’. Another concept, 117593, has ‘hepatitis viral tipo C’ as its fully specified name in Spanish. Our ConceptValidator does not allow two identical names in the same locale anywhere in the dictionary, except for short names.
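
To make the rule concrete, here is a minimal sketch of the check as described in this post. It is a simplified model, not the actual ConceptValidator code: `DuplicateNameCheck`, `NameEntry` and `NameType` are hypothetical names invented for the illustration, and the case-insensitive comparison is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Simplified model of the rule described above; NOT the real ConceptValidator.
final class DuplicateNameCheck {

    // Hypothetical name types used only in this sketch.
    enum NameType { FULLY_SPECIFIED, SYNONYM, SHORT }

    // Hypothetical flattened view of a single concept name.
    static final class NameEntry {
        final int conceptId;
        final String locale;
        final String name;
        final NameType type;

        NameEntry(int conceptId, String locale, String name, NameType type) {
            this.conceptId = conceptId;
            this.locale = locale;
            this.name = name;
            this.type = type;
        }
    }

    // Returns one message per name that is already used by a different
    // concept in the same locale. Short names are exempt from the check.
    static List<String> findDuplicates(List<NameEntry> entries) {
        Map<String, Integer> firstOwner = new HashMap<>();
        List<String> problems = new ArrayList<>();
        for (NameEntry e : entries) {
            if (e.type == NameType.SHORT) {
                continue; // short names may repeat across concepts
            }
            // Assumption: names are compared without regard to case.
            String key = e.locale + "|" + e.name.toLowerCase(Locale.ROOT);
            Integer owner = firstOwner.putIfAbsent(key, e.conceptId);
            if (owner != null && owner != e.conceptId) {
                problems.add("'" + e.name + "' is a duplicate name in locale '"
                        + e.locale + "' (concepts " + owner + " and " + e.conceptId + ")");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<NameEntry> entries = Arrays.asList(
                new NameEntry(117593, "es", "hepatitis viral tipo C", NameType.FULLY_SPECIFIED),
                new NameEntry(28, "es", "hepatitis viral tipo C", NameType.SYNONYM));
        for (String problem : findDuplicates(entries)) {
            System.out.println(problem);
        }
    }
}
```

With the two names from the example, the synonym on concept 28 is reported; if it were typed as a short name instead, the same input would pass.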

@akanter thinks that the validation is too strict. Concept 28 does not have a preferred name in Spanish, so the synonym is being used, and in his view the validation should only apply to stored preferred names within the same locale.

However, the proposed change in the validator wouldn’t work either: if a concept has just one name in a locale, that name automatically becomes the preferred name for that locale, unless it is a short name or an index term.

Since correcting more than 1000 concepts to pass the current validation rules seems like a long process… I would initially either:

  1. Change synonyms to short names for similar cases in CIEL, so that they pass validation.
  2. Make our validator less strict.

This has me a bit confused, because many implementations are already running CIEL on Platform 1.9+, which would presumably have this same validation issue. Is this something that you expect is already an ongoing problem for those implementations, or is this somehow a new problem?

This is an issue for implementations already running CIEL on Platform 1.9+. However, it does not show up until you try to edit one of the invalid CIEL concepts, which is not something you would normally do. You would also see the issue if you tried to export/import concepts using the Metadata Sharing module.

CIEL imported using the SQL script is not validated, which is why we haven’t caught the issue earlier.

It’s not causing any issues when using concepts in forms.

I think it would be great if we could get some Spanish translation help on these concepts so that we could provide a unique preferred name and avoid this problem. We can provide those fixes over time, although 1000 of them might take a few releases to get through. I don’t think it would be a good idea to change them all to short descriptions, but I would be interested in knowing how the fix, once made, would propagate to updated versions.

@akanter, what if we changed them to be short names temporarily until someone can provide a preferred name in Spanish that is unique? It is important to come up with a quick fix so that we can import the full CIEL dictionary into OpenMRS.

I could come up with a SQL script that does the change in the CIEL dictionary for you to run it.

Alternatively, I could make the OCL module convert synonyms to short names on the fly when validation fails with a duplicate concept name exception, or simply skip importing those synonyms altogether.
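
A rough sketch of that on-the-fly fallback, under stated assumptions: `Dictionary`, `Name`, `DuplicateNameException` and `saveWithFallback` are hypothetical stand-ins written for this illustration, not OCL module or OpenMRS API classes, and the toy dictionary only enforces the duplicate-name rule discussed in this thread.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Sketch of "downgrade the synonym and retry"; not the OCL module's code.
final class SynonymFallbackImport {

    enum NameType { FULLY_SPECIFIED, SYNONYM, SHORT }

    // Hypothetical concept name; the type is mutable so it can be downgraded.
    static final class Name {
        final String locale;
        final String text;
        NameType type;

        Name(String locale, String text, NameType type) {
            this.locale = locale;
            this.text = text;
            this.type = type;
        }
    }

    // Hypothetical stand-in for a duplicate-name validation failure.
    static final class DuplicateNameException extends RuntimeException {
        final Name offending;

        DuplicateNameException(Name offending) {
            super("'" + offending.text + "' is a duplicate name in locale '" + offending.locale + "'");
            this.offending = offending;
        }
    }

    // Toy dictionary that rejects non-short names already stored in a locale.
    static final class Dictionary {
        private final Set<String> taken = new HashSet<>();

        private static String key(Name n) {
            return n.locale + "|" + n.text.toLowerCase(Locale.ROOT);
        }

        void save(List<Name> names) {
            // Validate everything first so a failed save stores nothing.
            for (Name n : names) {
                if (n.type != NameType.SHORT && taken.contains(key(n))) {
                    throw new DuplicateNameException(n);
                }
            }
            for (Name n : names) {
                if (n.type != NameType.SHORT) {
                    taken.add(key(n));
                }
            }
        }
    }

    // Saves the names; each time a synonym is rejected as a duplicate it is
    // converted to a short name and the save is retried. Returns the number
    // of converted synonyms. Duplicates of any other type are rethrown.
    static int saveWithFallback(Dictionary dictionary, List<Name> names) {
        int converted = 0;
        while (true) {
            try {
                dictionary.save(names);
                return converted;
            } catch (DuplicateNameException e) {
                if (e.offending.type != NameType.SYNONYM) {
                    throw e; // only synonyms are downgraded
                }
                e.offending.type = NameType.SHORT;
                converted++;
            }
        }
    }

    public static void main(String[] args) {
        Dictionary dictionary = new Dictionary();
        List<Name> concept117593 = new ArrayList<>();
        concept117593.add(new Name("es", "hepatitis viral tipo C", NameType.FULLY_SPECIFIED));
        saveWithFallback(dictionary, concept117593);

        List<Name> concept28 = new ArrayList<>();
        concept28.add(new Name("es", "hepatitis viral tipo C", NameType.SYNONYM));
        System.out.println("Converted synonyms: " + saveWithFallback(dictionary, concept28));
    }
}
```

Each retry converts exactly one synonym, so the loop ends after at most as many retries as there are synonyms. Recording which names were converted would make it easier to flip them back once unique preferred names exist.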

Making the validator less strict does seem like a less desirable solution, even in the short term. Also, we could not roll out that change quickly, as it is a change in OpenMRS core.

I would prefer if the module flipped the flag for now. We would need to make sure we know what happens when we flip them back.

Okay, I’ll do that! Thanks Andy!

I am running into this while exporting a bunch of HTML forms: http://pastebin.com/SV4GkySh

@judy Some of these do appear to be new duplicate issues. Do you want to review for the next release?

Will do that

@arbaughj, I think iSantePlus needs to review these duplicates and contribute the fixes to the next CIEL dictionary release.

Thanks @k_joseph. I plan to work on this with @judy and the CIEL team at the beginning of next week.

Thanks! We are already working on resolving these.

The following corrections can be made to resolve duplicate concept names in HT, FR and EN…

Concept Number - Lang - New Name
119270 - HT - “maladi kadyo-vaskilè”
5016 - HT - “myokadyopati”
139071 - HT - “maladi kè”
159 - HT - “pasyan mouri” (Consider merging CIEL:159 and CIEL:160034)
163746 - HT - “Rezilta pou egzamen rektal” (Merge CIEL:163746 and CIEL:163582)
163532 - HT - “Manman pasyan antre nan pwogram prevansyon transmisyon manman-timoun (PMTCT)” (Merge CIEL:163532 and CIEL:163776)
141584 - FR - “symptômes maux d’oreilles”
73193 - FR - “vaccin contre la varicelle”
73193 - HT - “vaksen varisèl”
123074 - FR - “vision anormale”
159906 - FR - “nombre de la fratrie testés pour le VIH”
159906 - HT - “kantite frè ak sè ki te fè tès VIH”
1724 - FR - “Trimestre de la première visite prénatale”

Merge CIEL:111061 and CIEL:123468 - they both have the same description.
Merge CIEL:5490 and CIEL:163320 - they both have the same name and no description.
Merge CIEL:136938 and CIEL:163886 - they both have the same name and no description.

Note: Color related concepts don’t appear to have duplicates anymore. They must have been corrected already.

Note: Someone else will need to help with locale ‘vi’.

I made the following changes, with AK> comments where I didn’t.