Idea: Tags in OCL (vs Attributes?)

grace · October 26, 2021, 3:25am

@michaelbontyes recently raised on Slack the idea of “tagging” concepts - and the more I think about it, the more I want this feature too! This is important and relevant for MSF, but increasingly relevant even for me as I try to build example content sets for sample clinical packages.

Scenario: Imagine you are an Implementer managing dictionaries and code sets for many different kinds of implementations around the world. You have different sets of code recommendations that vary by things like program area, type of site / site resources, by country, etc. There’s a lot of overlap between these different ways of categorizing your codes. So just having the linear Hierarchy of Organization, Source, and Collection doesn’t adequately help you slice-and-dice all the content you have to manage. Mappings don’t address this need because your goal here is not to link things together - it’s more that you want to categorize them, i.e. label related things.

Some practical examples:

Different teams may use these tags in different ways, but here are some ideas:

Right now, every time MSF has a new program, new site, or new country implementation, they have to create a new collection. They would rather just go through all the concepts in the MSFOCP source and tag these, rather than having to re-create new dictionaries and sources for each new use case. Then the tags could be used through a search to rapidly set up the new dictionaries needed (e.g. “show me everything that’s tagged with maternity”).
“Forms warning”: MSF would like to tag all concepts used in Production forms with a “forms” tag, so that it’s plainly obvious which concepts are currently being used live. This could also help a content manager be aware to be especially careful changing these particular concepts. (@michaelbontyes do I have this quite right?)
“moh731”: Imagine tagging all the concepts needed for a specific Ministry of health report, such as the Kenya MoH 731 reports. These may be spread across different collections.
Tags for breaking down work you’ve done in a Dictionary: Right now I’m building sample Demo Dictionaries for HIV and NCD demo packages. I want to put all the codes into a single collection for each of these; however, I wish I could filter the list by one category at a time, so that I can compare against my Iniz .csv files to make sure I have all the concepts that I need for things like locations, programs, etc., which don’t automatically have a filter or concept type (nor should they) in the same way that Tests or Diagnoses do.

@michaelbontyes or @ball or @suruchi or @ibacher any other use cases on your mind we can add?

grace · October 26, 2021, 3:35am

3 Key Epics

Breaking this need down:

Per-concept metadata we could call a tag (minor lift? Would be nice to have some UI improvements, eg. colorful tags somewhere a bit prettier than the raw Attribute label)
Filtering based on that tag (big lift?)
(Nice to have) creating a collection of concepts based on tags (quite a big lift?) (so I can assemble a collection automatically based on a query, without having to click through the process manually)

(@michaelbontyes is this a fair example of the query you might have in mind to assemble such a collection of tagged concepts: “forms + bangladesh + maternity”?)

Existing Support for “Custom Attributes”

OCL already supports something called “Custom Attributes”, as shown in the screenshot below (Documentation here: Custom Attribute Filters — ocl-docs 0.0.1 documentation). However, while it’s fairly easy to add a Custom Attribute to a concept in OCL Online, there’s no UI to search based on Custom Attributes/tags at the moment.

grace · October 26, 2021, 3:39am

Next Steps:

We reckoned that this may need some support from the OCL backend team. @paynejd and @burke and Sunny what do you think of this feature idea?
@ibacher what would we need to do next in order to be able to add “Tags” filters in the concept sidebar? (Assuming we can use the existing OCL Attributes for the “Tags” purpose)

ssmusoke · October 26, 2021, 3:23pm

@grace I would suggest going the custom attributes way as tags are too free form - you can use custom attributes for more complex associations, plus they are supported across the entire OpenMRS model.

The custom attribute may be program_name for MSF, but you may find a need for some commonality across the programs that you may need to use too

ibacher · October 26, 2021, 3:32pm

I forgot about the ability to search by custom attributes! So, actually, what we have in place is probably enough to support tagging in the sense of one tag per concept. Some questions I don’t have answers to:

Can attributes in the extras fields store JSON Arrays? If so, will they be searchable in ES?

Ideally, we’d want to store a list of tags as a single attribute and be able to search against that attribute. As a fallback, we could use a range of “tag” attributes, e.g. “tag_1”, “tag_2”, etc. though we’d probably have to have some sort of limit around the number of tags per concept.

So, ideally we’d use a structure like this:

{
  "extras": {
    "tags": ["forms", "bangladesh", "mch"]
  }
}

With this being queried like:

/orgs/MSF/sources/MSFOCP/concepts/?extras.tags=forms&extras.tags=bangladesh&extras.tags=mch
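A minimal sketch of how a client might build that query string, assuming the API accepts a repeated extras.tags parameter (which is exactly the open question above). The helper name build_tag_query is hypothetical; only the path and the parameter name come from the example.

```python
from urllib.parse import urlencode


def build_tag_query(concepts_path, tags):
    """Return concepts_path with one extras.tags parameter per tag."""
    # doseq=True expands the list into repeated key=value pairs.
    query = urlencode({"extras.tags": tags}, doseq=True)
    return f"{concepts_path}?{query}"


url = build_tag_query(
    "/orgs/MSF/sources/MSFOCP/concepts/",
    ["forms", "bangladesh", "mch"],
)
print(url)
```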

But if necessary we could use something like:

{
  "extras": {
    "tag_1": "forms",
    "tag_2": "bangladesh",
    "tag_3": "mch"
  }
}

Which would have to be queried like:

/orgs/MSF/sources/MSFOCP/concepts/?extras.tag_1=forms+bangladesh+mch&extras.tag_2=forms+bangladesh+mch&extras.tag_3=forms+bangladesh+mch

(Note that this is definitely not equivalent to the first query… this will essentially include any concept which includes any of the tags “forms”, “bangladesh”, or “mch”; hence why this is less preferred).
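To make the difference concrete, here is a small local sketch, not OCL code. It assumes the first query is meant to match concepts carrying all of the tags, while the fallback behaves like "any of the tags". The sample concepts and the helper names are made up for illustration.

```python
# Hypothetical sample data; ids and tags are invented for the example.
concepts = [
    {"id": "A", "tags": ["forms", "bangladesh", "mch"]},
    {"id": "B", "tags": ["forms"]},
    {"id": "C", "tags": ["malawi", "mch"]},
    {"id": "D", "tags": ["burn"]},
]
wanted = {"forms", "bangladesh", "mch"}


def match_all(concept, wanted):
    """True only if the concept carries every wanted tag (assumed list-style semantics)."""
    return wanted.issubset(concept["tags"])


def match_any(concept, wanted):
    """True if the concept carries at least one wanted tag (fallback semantics)."""
    return not wanted.isdisjoint(concept["tags"])


all_ids = [c["id"] for c in concepts if match_all(c, wanted)]
any_ids = [c["id"] for c in concepts if match_any(c, wanted)]
print(all_ids, any_ids)
```

With this data the "all" filter keeps only concept A, while the "any" filter also pulls in B and C, which is the over-matching described in the note.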

akanter · October 26, 2021, 3:36pm

Just an FYI. There clearly needs to be the capability to have more than one attribute/tag per concept. Tags as free-form text quickly get unwieldy. People are moving more towards an ontological perspective: links between concepts sharing a common attribute, and hierarchies and relationships between “tags”. There may be a maturity model here where we first enable custom attributes/tags to see what is being used, but then we need to be thinking of doing more to manage them in the future. It could get really messy quickly. Maintenance of the tags is also an issue. How do I go back and remove the tag when the concept is no longer used in the form? What if the groupings change? Better to use concept-concept relationships (relationship table for SNOMED, etc.) to try to group as much as possible rather than apply a static tag.

grace · October 26, 2021, 4:05pm

@akanter I can actually already add as many custom attribute tags as I want in the OCL Online UI, either to orgs or custom concepts:

I don’t think we need this in order to solve the problem, but I definitely see your point re. how the tag “maternity” or “HIV” could become confusing if orgs use them in different ways. However, then you could just filter your search by organization etc.

I can already delete or edit attributes in the OCL Online UI.

burke · October 26, 2021, 4:15pm

I think we’re confusing two different uses of the term “Custom Attributes”.

In OpenMRS, Custom Attributes are a specific design pattern designed to allow implementation-specific or module-specific extensions of the data model. If any core functionality of OpenMRS is going to use the attributes (e.g., we expect reporting, forms, concept searches, etc. to use concept tags), then they should be part of the core data model.
In OCL, they refer to their “extra” attribute (effectively JSON extensions of the OCL schema) as “Custom Attributes”. In many cases, we use OCL extras (aka “OCL Custom Attributes”) to store core functionality specific to OpenMRS (e.g., reference ranges, allow precise for numerics, sort weights, etc.).

Searchability will be important. I would guess they would be searchable or could be made searchable.

This is a frequently used anti-pattern that usually crops up as “easy to implement” and then causes decades of complexity, headaches, and technical debt. If I add “forms”, “bangladesh”, and “mch” as “tag_1”, “tag_2”, and “tag_3”, respectively, then remove the “bangladesh” tag, does that move “mch” to “tag_2” or do I then have 3 tags with one null? How do I search for a tag without caring which one it is? What if I have 4 tags? And the resulting SQL… the horror!

The only occasion where I’ve felt comfortable with the _1, _2, _3 pattern was for addresses, where we specifically wanted to allow for finite number of additional fields and the order matters (i.e., they are basically 15 re-nameable address fields to accommodate the worldwide diversity in address handling). And I still try not to think about it.

So, please, let’s never do tag_1, tag_2, tag_3, etc.
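A rough sketch of the bookkeeping this implies, using plain dictionaries rather than any real OCL or OpenMRS API. The function names are hypothetical, and the renumber-on-delete policy is just one of the possible answers to the "move or leave a null" question.

```python
import re

SLOT = re.compile(r"^tag_(\d+)$")


def slot_tags(extras):
    """Collect tags from tag_1, tag_2, ... keys in slot order, skipping empty slots."""
    slots = []
    for key, value in extras.items():
        m = SLOT.match(key)
        if m and value is not None:
            slots.append((int(m.group(1)), value))
    return [value for _, value in sorted(slots)]


def remove_slot_tag(extras, tag):
    """Remove a tag, then renumber the remaining slots so no gap is left."""
    kept = [t for t in slot_tags(extras) if t != tag]
    result = {k: v for k, v in extras.items() if not SLOT.match(k)}
    result.update({f"tag_{i}": t for i, t in enumerate(kept, start=1)})
    return result


def remove_list_tag(extras, tag):
    """The same operation when tags are stored as a single list."""
    return {**extras, "tags": [t for t in extras.get("tags", []) if t != tag]}


slotted = {"tag_1": "forms", "tag_2": "bangladesh", "tag_3": "mch"}
listed = {"tags": ["forms", "bangladesh", "mch"]}
print(remove_slot_tag(slotted, "bangladesh"))
print(remove_list_tag(listed, "bangladesh"))
```

The slot version needs a key pattern, a sort, and a renumbering policy just to delete one tag; the list version is a single filter.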

Spoken like a true ontologist. But very important points.

I was expecting, given the use cases @grace describes, that we would add concept tags to OpenMRS and store these as extras within OCL. But, as @akanter points out, we need to be clear about our use case and the downstream effects. I could see a folksonomy of tags being useful for someone like @michaelbontyes that, when/if it becomes unwieldy or he needs to track changes, gets replaced with relationships (mappings).

We should be careful here. It’s true you can arbitrarily change custom attributes in OCL’s extras; however, this is free-form and unguided editing of the same extension mechanism we use for features like normal ranges, numeric precision, sort weight, etc. For example, if you create an allow_precise attribute and assign it a value of true on a numeric concept, you will be making that numeric concept expect precise (decimal) values within OpenMRS. If you assigned a value of muwahahahahaha to allow_precise, then you could break your import into OpenMRS.
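One way to picture a guard against this is a check that runs before extras are saved or exported. This is only a sketch: the RESERVED_EXTRAS registry and validate_extras function are hypothetical, and the only reserved key listed is the allow_precise example above.

```python
# Hypothetical registry: reserved extras keys and the Python type each must hold.
RESERVED_EXTRAS = {"allow_precise": bool}


def validate_extras(extras):
    """Return a list of problems; an empty list means no reserved key is misused."""
    problems = []
    for key, expected in RESERVED_EXTRAS.items():
        if key in extras and not isinstance(extras[key], expected):
            problems.append(
                f"{key} must be {expected.__name__}, got {extras[key]!r}"
            )
    return problems


print(validate_extras({"allow_precise": True, "tags": ["forms"]}))
print(validate_extras({"allow_precise": "muwahahahahaha"}))
```

Free-form keys such as a tags list pass through untouched; only keys that drive OpenMRS behaviour are type-checked.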

ibacher · October 26, 2021, 4:31pm

Well, this has the sole virtue of being completely implementable with no API changes, but yes, I agree that it is not what I’d want to implement. I think my first choice would actually be to have a first-class “tags” object in the OCL datamodel and avoid using the extras for this at all.

akanter · October 27, 2021, 12:26pm

I was referring less to the functionality of being able to create, assign and remove attributes, and more to the business processes which would be required to maintain these. It might be easy to create and assign them, but expecting that they are of high quality, accurate and timely is another story. I don’t think we want to make it too easy for people to throw on tags and then not be able to keep them current. That is why leveraging an ontology as Burke suggested makes sense, since these relationships can be updated automatically.

burke · November 24, 2021, 2:59pm

It would help to have specific use cases from implementations on how they would like to label concepts and for what purposes.

@michaelbontyes & @ball, could you give some examples of what you need for labelling or ad hoc organization of concepts? And are there things that you would just use in OCL and other things that you would want both in OCL and in OpenMRS?

ball · November 24, 2021, 3:01pm

FYI @mseaton @mogoodrich

michaelbontyes · November 24, 2021, 3:18pm

Hi @burke, here is what we need for MSF at the moment:

| Data | Values | Current location | Need to search/filter in OCL | Needed in OMRS |
| --- | --- | --- | --- | --- |
| Implementation | MW, BD, etc. | attr:implementation | Yes | |
| Form | Vital signs, coloscopy, etc. | attr:form | Yes | |
| Theme | Cancer, burn, trauma, NCD, etc. | Not implemented yet | Yes | |
| PII | Boolean | Not implemented yet | Yes | Yes |

Here is an example of how we use the attributes right now in the MSFOCP source: OCL

Thank you

@paynejd @grace

mseaton · November 24, 2021, 9:08pm

Right now, at PIH, the closest thing we do to tagging concepts is creating convenience sets and adding concepts to that set. And when that gets unwieldy, we use sets of sets.

Would it make sense to use “Concept” for this, with a specific class = “Tag”? Then, tagging Concepts is simply a matter of adding a Mapping just like with Set members / Concept Answers. There could be a new map type (eg. “TAGGED-AS”) to go along with “CONCEPT-SET”, “Q-AND-A”, etc.

The UI could be made to allow efficient assignment of tags to a Concept via autocomplete on all concepts of class = tag in a streamlined way.
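A small in-memory sketch of that idea, assuming the “Tag” class and “TAGGED-AS” map type proposed above. The dataclasses, sample concepts, and helper names are hypothetical and do not reflect the actual OCL or OpenMRS data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    id: str
    concept_class: str


@dataclass(frozen=True)
class Mapping:
    from_concept: str
    map_type: str
    to_concept: str


# Invented sample content: two tag concepts and two ordinary concepts.
concepts = [
    Concept("maternity", "Tag"),
    Concept("forms", "Tag"),
    Concept("weight", "Finding"),
    Concept("gravida", "Finding"),
]
mappings = [
    Mapping("weight", "TAGGED-AS", "forms"),
    Mapping("gravida", "TAGGED-AS", "forms"),
    Mapping("gravida", "TAGGED-AS", "maternity"),
]


def tag_candidates(prefix):
    """Autocomplete: ids of Tag-class concepts that start with prefix."""
    return sorted(
        c.id for c in concepts
        if c.concept_class == "Tag" and c.id.startswith(prefix)
    )


def tagged_with(tag_id):
    """Ids of concepts that have a TAGGED-AS mapping to the given tag concept."""
    return sorted(
        m.from_concept for m in mappings
        if m.map_type == "TAGGED-AS" and m.to_concept == tag_id
    )


print(tag_candidates("ma"))
print(tagged_with("forms"))
```

Because tags are themselves concepts here, renaming or retiring a tag is an edit to one concept rather than to every tagged concept.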

ibacher · November 29, 2021, 2:06pm

This actually seems like a good way to go to me.

@michaelbontyes Marking concepts as PII is something we’d need some specs on. E.g., how is this stored in the OpenMRS concept dictionary?

michaelbontyes · November 29, 2021, 3:41pm

Hi @ibacher, in the context of compliance with GDPR, HIPAA, and similar regulations, the idea behind “PII” is to be able to identify concepts potentially containing PII or sensitive data within implementations, to facilitate security safeguards like data anonymization for backups, analytics, UATs, etc. It could be stored as a Concept Attribute Type in OpenMRS:

akanter · November 29, 2021, 4:39pm

@angshuonline just added some needs to group on the OMRS Implementers meeting unconference session. Perhaps he can clarify here.

angshuonline · November 30, 2021, 10:34am

I look at attributes as enhancing the metamodel - and they are not to be ad hoc, but decided with good reasoning and documentation. It’s like a verified extension to a FHIR resource, which has been documented with use cases somewhere. In Bahmni, we create an attribute “sellable” to indicate to the ERP that a concept is a sellable service/product. Or, for a practitioner, an attribute “available for appointment”.

Whereas I look at tags as a less formal label - like Location tags - e.g. Admission Location. Removal of the tag does not have any adverse impact - although it is used to show what the admission locations in a hospital are. The same location can also be an “Appointment Location”.

My submission for the unconf discussion was more to think about standardizing how we query and organize concepts and metadata. In many cases, we follow a convention - set of sets, members of a conv set. Can we FHIR-ify these by establishing valuesets that can be defined by exclusion and inclusion in a given context, just like in V3 or in FHIR? For example, how do I validate that a CHW can only prescribe medications from a given set? How do I show, for a pathology lab and for blood samples, the set of tests/panels allowed? Similarly, how can the OMRS dictionary allow definition of concept maps? I think this is a relatively easier problem to solve if a concept has mappings with 2 different naming systems.
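As a rough illustration of the inclusion/exclusion idea, here is a much-simplified, hypothetical value set check. It is not the FHIR ValueSet resource itself, and the medication codes are invented.

```python
from dataclasses import dataclass, field


@dataclass
class ValueSet:
    """Simplified value set: a code is allowed if included and not excluded."""
    include: set = field(default_factory=set)
    exclude: set = field(default_factory=set)

    def allows(self, code):
        return code in self.include and code not in self.exclude


# Invented example: CHWs may prescribe essential medicines except controlled ones.
essential = {"paracetamol", "ors", "amoxicillin", "morphine"}
controlled = {"morphine"}
chw_medications = ValueSet(include=essential, exclude=controlled)

for code in ["ors", "morphine", "warfarin"]:
    print(code, chw_medications.allows(code))
```

The same check could back both validation (rejecting a prescription) and display (listing only the allowed tests or panels for a context).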

burke · November 30, 2021, 5:39pm

To @angshuonline’s point, along with understanding the problem(s) to be solved, having clarity on our tools, how they’re meant to be used, their strengths, and their weaknesses might help us choose the right tool(s). Using a single tool (like mappings) to solve everything gets me nervous… I’m sure the Windows Registry seemed like a good idea at one point in time.

| Tool | Purpose | Strength | Weakness |
| --- | --- | --- | --- |
| Concept Set | Semantic grouping of concepts. | Many-to-many relationship. | Cumbersome when large (e.g., 50+ members). |
| Concept Class | Categorization of concepts. | Can be “hard-coded” to control core behavior. | Only one per concept. |
| Mapping | Relationship between concepts. Lookup code(s) for concept. | Finding a concept | |
| Attributes | Extending the data model for custom needs (of implementation or module). | Strongly typed extension to data model. | Not appropriate for core needs (should just extend the data model). Must be defined before they can be used. |
| Tags | Ad-hoc “folksonomy” labeling of resources. | Easy to use ad-hoc labels for multiple needs (don’t have to be pre-defined). | Simple strings (not typed). Specific tags should not be hard-coded into core code. |

Some of the implementation needs we’ve seen might be better served using concept attributes and others by tags or sets.

One option would be – for lack of concept tags in core – to create a concept attribute for tags (for now) and take the approach:

Small number of concepts → sets
Non-typed labeling → tags
Typed properties → concept attributes (either namespaced by implementers or we namespace our standard ones with “omrs-” or an underscore prefix)

angshuonline · December 1, 2021, 1:19pm

One comment, we have been forever thinking of supporting multiple classes for a concept.