I believe that behavior in the OpenMRS 2.x Reference Application follows some pretty simple weighting that PIH wrote in the Mirebalais era. IIRC it was tested/tuned by devs, and not really informed by end-user testing, but this hasn’t been a big complaint, so maybe it’s decent? But this has not been tested/tuned against full CIEL, so in that sense it’s not good enough for OCL.
Taking a step back, these are the things that seem important to me (but I am not a real user, and I haven’t managed concepts in years…):
- if they type a number, make sure that matching ids or mapping codes appear high up
- if there’s an exact match on a name or synonym, this needs to be the top result
- name matches should be weighted much higher than description matches or full-text matches
- all else being equal, show results with a shorter preferred name before ones with a longer name (because CIEL is full of concepts that are too specific for the typical OpenMRS use case, but in the current sorting these concepts often show up first)
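To make the list above concrete, here is a rough Python sketch of how those rules could combine into a single ranking. This is not how OCL or the Reference Application actually scores results: the `Concept` shape, the weight constants, and the `score`/`rank` functions are all made up for illustration, and the weights are untuned numbers chosen only to keep the tiers clearly separated.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    # Hypothetical, simplified concept shape (not the real OCL or OpenMRS model).
    concept_id: str
    preferred_name: str
    synonyms: list[str] = field(default_factory=list)
    mapping_codes: list[str] = field(default_factory=list)
    description: str = ""


# Illustrative, untuned weights. Only their relative order matters here.
W_ID = 1000.0
W_EXACT_PREFERRED = 500.0
W_EXACT_SYNONYM = 400.0
W_PARTIAL_PREFERRED = 100.0
W_PARTIAL_SYNONYM = 50.0
W_DESCRIPTION = 5.0


def score(concept: Concept, query: str) -> float:
    """Return a relevance score; 0 means the concept does not match at all."""
    q = query.strip().lower()
    if not q:
        return 0.0
    preferred = concept.preferred_name.lower()
    synonyms = [s.lower() for s in concept.synonyms]
    total = 0.0

    # A numeric query that equals the id or a mapping code goes to the top.
    if q.isdigit() and (q == concept.concept_id or q in concept.mapping_codes):
        total += W_ID

    # Name tiers: an exact synonym match beats a partial preferred-name match.
    if q == preferred:
        total += W_EXACT_PREFERRED
    elif q in synonyms:
        total += W_EXACT_SYNONYM
    elif q in preferred:
        total += W_PARTIAL_PREFERRED
    elif any(q in s for s in synonyms):
        total += W_PARTIAL_SYNONYM

    # Description matches count, but far less than any name match.
    if q in concept.description.lower():
        total += W_DESCRIPTION
    return total


def rank(concepts: list[Concept], query: str) -> list[Concept]:
    """Sort matches by score, breaking ties in favor of shorter preferred names."""
    scored = [(score(c, query), c) for c in concepts]
    hits = [(s, c) for s, c in scored if s > 0]
    hits.sort(key=lambda sc: (-sc[0], len(sc[1].preferred_name)))
    return [c for _, c in hits]
```

With something like this, a concept whose synonym is exactly the query outranks one that merely contains the query in its preferred name, and among equally scored partial matches the shorter preferred name comes first. A real search engine would express the same idea with per-field boosts rather than hand-rolled substring checks.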
Here’s an example that’s particularly annoying to get right, given what the CIEL data looks like. Say I’m trying to find 70116 - Acetaminophen
https://openconceptlab.org/search/?q=type+2+diabetes => concept I’m looking for is result #17, even though it’s an exact match on a synonym. => exact match on a synonym should be above partial match on locale-preferred name
https://openconceptlab.org/search/?q=pulm+edem => zero results, but should find pulmonary edema => need to support partial/wildcard/fuzzy matches on all words.
https://openconceptlab.org/search/?q=pulmonary+ede => result I’m looking for is #6. As a heuristic, I think shorter names should appear before longer names, e.g. “pulmonary edema” should normally appear before “Postoperative Pulmonary Edema” or “pulmonary edema due to …”.
https://openconceptlab.org/search/?conceptClass=Drug&q=aspirin => lots of acetaminophen results show up because they have “aspirin free” in their synonyms => therefore, prefer matches on preferred name over matches on synonym. (This example may be impossible to get quite right, because we can’t programmatically know whether a synonym has a “not” meaning without adding a lot more business logic, and we usually do want to show synonym matches.)
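For the “pulm edem” case above, one simple way to support partial matches on all words is to require every query word to be a prefix of some word in the name. The sketch below is only an illustration of that idea in Python, not what OCL does; `tokenize`, `matches_all_prefixes`, and `search` are hypothetical helpers, and the sample names in the usage note are made up. It handles prefixes only, so it would not catch typos the way true fuzzy matching would.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_all_prefixes(query: str, name: str) -> bool:
    """True if every query word is a prefix of at least one word in the name."""
    query_tokens = tokenize(query)
    name_tokens = tokenize(name)
    return bool(query_tokens) and all(
        any(nt.startswith(qt) for nt in name_tokens) for qt in query_tokens
    )


def search(query: str, names: list[str]) -> list[str]:
    """Return matching names, shortest first (the 'shorter name wins' heuristic)."""
    hits = [n for n in names if matches_all_prefixes(query, n)]
    return sorted(hits, key=lambda n: (len(n), n.lower()))
```

For example, `search("pulm edem", ["Postoperative Pulmonary Edema", "Pulmonary Edema", "Pulmonary Embolism"])` keeps the two edema names, lists “Pulmonary Edema” first, and drops “Pulmonary Embolism” because no word in it starts with “edem”.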