Found concepts with duplicate names in latest CIEL release

Hi,

I tried to import some concepts from the latest release of CIEL via the metadatasharing module, and it reported that the concepts with the UUIDs below have duplicate names in Vietnamese:

116506AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

121555AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

@akanter, we’re working on a case reporting module that requires one of these concepts. We need to create an MDS package to distribute with the module, and we have a deadline to have everything ready in a couple of weeks. Any chance you can have a new version of CIEL out with this sorted out?


Also, @raff, is there any work we can do on the tooling that will prevent this from continuing to occasionally happen?

We will have a release at the end of this month.

I guess we are back to the same problem. We did not make any changes to the Vietnamese locale, and these names have not been throwing any errors. With the increasing number of locales, I wonder if we have to test for duplicate names in each locale?

I can’t think why this error is coming up now when it didn’t in the past.

PS: We will most likely void both terms, as we have no way of determining which is the correct term.

Are you having any other errors?

Judy

Thanks Judy! That would be nice if you guys can have a new release out soon with the fix.

@darius, it’s still on my plate. It didn’t end up being that straightforward to set up. Updating the db automatically was easy, but starting up validation and notifying that it failed was still a manual process.

We’ll be doing a few tweaks to make it possible. We’ll be setting up a CI plan that loads a new version of CIEL and runs a test, which validates all concepts and terms from the release. If it breaks, we’ll be notified by CI.


I actually just wanted to have a set of SQL queries which I could run against the 1.6.6 database to check for validation errors that would occur when it is migrated. I suppose I could just run the first migration to 1.11+ and run the validation against that. I don’t want to have to do an entire build to figure out that it is broken. I also do not want to have to start up OpenMRS to do the validation. I should be able to run them against the database directly from any version…
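A rough sketch of what such a direct database check could look like. It uses Python's built-in `sqlite3` with an in-memory table as a stand-in for the real database. The table here is a simplified, assumed version of `concept_name` (only `concept_id`, `name`, `locale` and `voided`), the sample rows are made up, and the helper names are hypothetical. The query only looks for exact name matches within a locale, which is much narrower than whatever the actual OpenMRS validation does.

```python
import sqlite3

# Simplified stand-in for the concept_name table (assumed columns only).
SCHEMA = """
CREATE TABLE concept_name (
    concept_id INTEGER NOT NULL,
    name       TEXT    NOT NULL,
    locale     TEXT    NOT NULL,
    voided     INTEGER NOT NULL DEFAULT 0
)
"""

# Names shared by more than one non-voided concept within the same locale.
DUPLICATE_NAMES_SQL = """
SELECT locale, name, COUNT(DISTINCT concept_id) AS concepts
FROM concept_name
WHERE voided = 0
GROUP BY locale, name
HAVING COUNT(DISTINCT concept_id) > 1
ORDER BY locale, name
"""


def find_duplicate_names(conn):
    """Return (locale, name, concept_count) rows for duplicated names."""
    return conn.execute(DUPLICATE_NAMES_SQL).fetchall()


def demo_connection():
    """Build an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO concept_name VALUES (?, ?, ?, ?)",
        [
            (1, "Fever", "en", 0),
            (2, "Fever", "en", 0),  # same name, same locale: reported
            (3, "Fever", "fr", 0),  # different locale: not a duplicate
            (4, "Cough", "en", 0),
            (5, "Cough", "en", 1),  # voided: ignored
        ],
    )
    return conn


if __name__ == "__main__":
    for row in find_duplicate_names(demo_connection()):
        print(row)
```

On MySQL the `GROUP BY` comparison would follow the column's collation rather than SQLite's byte-wise default, so the same query could report different rows there.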

Also, these are NOT duplicates. The neuropathy concept is "Bệnh thận" and the Anthrax concept is "Bệnh than".

The validation HAS to account for characters that differ only in their diacritics or encoding. Is it possible that @wyclif imported into a non-UTF-8 database and these differences were flattened?
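To illustrate how two distinct names could end up looking identical, here is a minimal Python sketch. It compares the two Vietnamese names exactly, and then again after stripping combining marks, which roughly mimics an accent-insensitive comparison. Whether this is what actually happened during the import is an assumption; `strip_marks` is a hypothetical helper written for this example, not part of OpenMRS or the metadatasharing module.

```python
import unicodedata


def strip_marks(text):
    """Drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


name_a = "Bệnh thận"
name_b = "Bệnh than"

# Exact comparison keeps the two names distinct.
exact_equal = name_a == name_b

# Accent-insensitive comparison (assumed cause) makes them collide.
flattened_equal = strip_marks(name_a) == strip_marks(name_b)

if __name__ == "__main__":
    print(exact_equal, flattened_equal)
```

A duplicate check that compares the stored strings exactly would keep these two names apart, while one that flattens diacritics first would report them as duplicates.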

Thanks @akanter, yes I was using a UTF-8 database and I guess the name got flattened, so the validation needs to take care of that. Out of curiosity, assuming there is anybody here who understands Vietnamese, it would be nice to know whether one of them wasn’t just misspelt. I guess what I am asking is whether those words really mean two completely different things?

They certainly do mean different things and it is quite possible in tonal languages to have something spelled similarly to mean completely different things… but I was struck by the Anthrax and Neuropathy being so different as well.
