Metadata Mapping Module development progress

raff · December 21, 2015, 10:43am

@kosmik, will you be able to join us? Please see https://wiki.openmrs.org/display/RES/Design+Forum for connection details.

kosmik · December 21, 2015, 11:55am

Yes @raff, I will be joining. Seems like I just forgot to tell anyone.

raff · December 22, 2015, 10:05am

Thanks for joining the call. The key takeaways are:

Make MetadataTerm.metadataClass and MetadataTerm.metadataUuid optional, so that a mapping can be created that does not yet point to any metadata in the system and can be configured at some later point. MAP-13
MetadataSet will not be included in the first version of the module, as there is no agreement on the approach to take.
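
As an illustration of the first takeaway, here is a minimal Java sketch of a term whose metadata reference is optional. The class name `TermSketch` and its methods are hypothetical and are not the module's actual code; only the field names `metadataClass` and `metadataUuid` come from the summary above.

```java
// Hypothetical stand-in for MetadataTerm; not the module's actual class.
final class TermSketch {
    private final String source;
    private final String code;
    private String metadataClass; // null until the term is mapped
    private String metadataUuid;  // null until the term is mapped

    // A term can be created from source + code alone, without any metadata.
    TermSketch(String source, String code) {
        this.source = source;
        this.code = code;
    }

    // Both reference fields are set together so a term is never half-mapped.
    void mapTo(String metadataClass, String metadataUuid) {
        if (metadataClass == null || metadataUuid == null) {
            throw new IllegalArgumentException("class and uuid must be given together");
        }
        this.metadataClass = metadataClass;
        this.metadataUuid = metadataUuid;
    }

    boolean isMapped() {
        return metadataUuid != null;
    }

    String getSource() { return source; }
    String getCode() { return code; }
    String getMetadataClass() { return metadataClass; } // may be null
    String getMetadataUuid() { return metadataUuid; }   // may be null
}
```

With this shape, a term starts out unmapped and an administrator (or the module) can call `mapTo` at some later point.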

kosmik · December 23, 2015, 3:04pm

Thanks for the summary, @raff.

I will be going away on my midwinter holiday now and will get back to work on this module after a week or two. I will then concentrate on finishing the few minor issues that we still have (not including MetadataSet).

Any ideas on how we will proceed with working on the MetadataSet feature set (or something else of that nature if MetadataSet is deemed invalid)? Personally, I have noticed that I have problems grasping the functional requirements side of things here as I don’t have the required experience with OpenMRS. Thus I would be very happy to concentrate on the technical side/programming and leave the “why” to others.

darius · December 23, 2015, 4:24pm

I’m not sure this is 100% finalized (so maybe it’s convenient that @kosmik will be off for a week or two).

There is a key difference between MetadataSource/MetadataTerm in this module vs. ConceptSource/ConceptReferenceTerm in core. For concepts in core the term and the mapping are separate things (term = source + code; mapping = concept + term), whereas in this module, the term and the mapping are combined together (term = source + code + ref-to-metadata).

The implication of this is that a MetadataTerm can only point to a single metadata (versus multiple concepts can be mapped to the same reference term). I’m fine with this model, and it generally follows from Mike’s use case, that a module creates a term to represent something it needs to be able to look up (e.g. “which encounter type should represent ‘admission’”).
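
To make the difference concrete, here is a rough Java sketch of the two shapes. All class names are hypothetical simplifications, not the actual core or module classes; they only mirror "term = source + code; mapping = concept + term" versus "term = source + code + ref-to-metadata".

```java
import java.util.List;

// Core-style model: the term and the mapping are separate objects.
final class ReferenceTermSketch {
    final String source;
    final String code;

    ReferenceTermSketch(String source, String code) {
        this.source = source;
        this.code = code;
    }
}

final class ConceptMappingSketch {
    final String conceptUuid;
    final ReferenceTermSketch term;

    ConceptMappingSketch(String conceptUuid, ReferenceTermSketch term) {
        this.conceptUuid = conceptUuid;
        this.term = term;
    }
}

// Module-style model: the reference lives on the term itself,
// so a term has room for at most one piece of metadata.
final class CombinedTermSketch {
    final String source;
    final String code;
    String metadataUuid; // single slot; assigning again replaces the old value

    CombinedTermSketch(String source, String code) {
        this.source = source;
        this.code = code;
    }
}

final class ModelComparisonSketch {
    // Counts how many concepts point at the given term in the core-style model.
    static int conceptsMappedTo(ReferenceTermSketch term, List<ConceptMappingSketch> mappings) {
        int count = 0;
        for (ConceptMappingSketch mapping : mappings) {
            if (mapping.term == term) {
                count++;
            }
        }
        return count;
    }
}
```

In the first shape several mappings can share one term; in the second, the only way to point a term at different metadata is to overwrite its single reference.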

But I understood Burke (and Wyclif) to be uncomfortable with using this module to do this kind of configuration, because it’s supposed to do mapping. Was I understanding that right? Or are we all on the same page about using this for the “single metadata” use case, and the disagreement is just about sets (e.g. “which encounter types (plural) should close a visit”)?

jasonvena · December 27, 2015, 7:00am

Hi all,

New to OpenMRS - I just recently touched base with @raff and @kosmik and began reviewing the project history and notes (including the June and Dec 2015 design forums).

Based on Mikko’s thoughts above regarding the requirements (same here) and some of my own regarding the logical design (below), I think it may be worthwhile to capture the business requirements a bit more formally first (not too formal, just enough to clarify and aid understanding) and then iterate on the design before going too much further in implementation. I’m no expert in this regard but would like to help out with this if possible.

To @darius’s point above, my impression of the 1:1 mapping between metadata_term and openmrs_metadata is that if it is fully-qualified (namespace qualified), the relationship actually becomes many-to-one at the table/object level. I think that this started coming out nicely in the draft logical model, but hit a small snag:

So I started to look at this a bit more, and here’s where I’m at so far:

Granted, ORM (using Hibernate) will obviate some of the relational constructs, but my goal is to try to understand the model better. That said, I think I’m at a point that I’d like to go back to the requirements before continuing as well.

To summarize: new guy wants to help, thinks getting slightly more detailed requirements and a logical model “on paper” might be helpful.

raff · December 28, 2015, 2:37pm

@jasonvena, Let me refer to some of your comments from the diagram:

Metadata_term uniqueness is based on source and code thus a composite unique constraint should be defined for 2 columns metadata_source_id and code.
What is the term_surrogate_id for? metadata_term_id should be the primary key
Metadata_reference_uuid is not a foreign key. We will not use a database key as we will be referencing different tables based on metadata_class, which will point to the correct table (location, provider_attribute_type, etc.) and metadata_reference_uuid, which will point to the exact row.
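
The first and third points can be sketched in plain Java with in-memory maps standing in for the tables. This is only an illustration under assumptions: `TermRegistrySketch` and its methods are hypothetical, and the real module would rely on Hibernate and database constraints rather than maps.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TermRegistrySketch {
    // (metadata_source_id, code) -> {metadata_class, metadata_reference_uuid}.
    // The composite key plays the role of the unique constraint.
    private final Map<List<Object>, String[]> terms = new HashMap<>();

    // metadata_class -> (uuid -> row name); stands in for the separate
    // tables such as location or provider_attribute_type.
    private final Map<String, Map<String, String>> tables = new HashMap<>();

    void addRow(String metadataClass, String uuid, String name) {
        tables.computeIfAbsent(metadataClass, k -> new HashMap<>()).put(uuid, name);
    }

    void addTerm(int sourceId, String code, String metadataClass, String uuid) {
        List<Object> key = Arrays.<Object>asList(sourceId, code);
        if (terms.containsKey(key)) {
            throw new IllegalStateException("duplicate (source, code): " + key);
        }
        terms.put(key, new String[] {metadataClass, uuid});
    }

    // No foreign key: metadata_class selects the table, the uuid selects the row.
    String resolve(int sourceId, String code) {
        String[] ref = terms.get(Arrays.<Object>asList(sourceId, code));
        if (ref == null || ref[0] == null) {
            return null; // unknown term, or term without a reference
        }
        Map<String, String> table = tables.get(ref[0]);
        return table == null ? null : table.get(ref[1]);
    }
}
```

The same code may appear under two different sources, but a second term with the same source and code is rejected.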

The rest seems right to me. We’ll skip metadata_set for now.

jasonvena · December 29, 2015, 7:07am

Thanks for reviewing, @raff, much appreciated. Review by item:

OK - a good clarification of which attributes are involved in the composite unique constraint, thanks.

Since metadata_term_id no longer participates in the composite uniqueness constraint, there is no 2NF violation if used as primary key. No need for another surrogate.

[quote=“raff, post:27, topic:3315”] Metadata_reference_uuid is not a foreign key. We will not use a database key as we will be referencing different tables based on metadata_class, which will point to the correct table (location, provider_attribute_type, etc.) and metadata_reference_uuid, which will point to the exact row. [/quote] The design page seemed to indicate that this is an addition to, and not a replacement for, the FK relationship (“instead of just”). Thanks for clarifying.

Agreed - metadata_set’s relationship with metadata_term does not appear to be completely decided. At present it’s both a many-to-many and an inheritance relationship, so design clarification may help to simplify things a bit.

kosmik · January 6, 2016, 2:57pm

Actually I chose to rename MetadataTerm as MetadataTermMapping earlier as it defines both the term and the mapping (at least in the original design). We might choose to tweak the terminology if the model/logic changes.

This is still unclear to me: How will the metadata mappings be defined in practice when looking at it from the perspective of an OpenMRS installation?

Does a module include liquibase updates (or whatever) that populate the database with terms and mappings so that the administrator does not need to do anything?
Or does the module only define the terms so that the administrator must take care of defining the mappings? (This implies that mapping is optional in the data model)
Something completely different?

kosmik · January 6, 2016, 3:38pm

I’m not sure but maybe there is a misunderstanding here? I admit the diagram might not be intuitive as I have used one-to-one relationships in the meaning “a extends b”. I added a note on the design page: “Note that in this diagram, the one-to-one mapping between metadata_source and openmrs_metadata, for example, indicates an inheritance in object design terms: MetadataSource inherits the fields of OpenmrsMetadata. In the database schema, this relationship will be implemented so that the relation metadata_source, for example, includes all the columns of openmrs_metadata.”
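
In object terms, the note above amounts to something like the following Java sketch. The class and field names are illustrative assumptions, not the actual OpenMRS or module classes.

```java
// Hypothetical base class holding the shared metadata fields.
abstract class MetadataBaseSketch {
    String uuid;
    String name;
    String description;
    boolean retired;
}

// "metadata_source extends openmrs_metadata": the subclass inherits every
// base field, just as its table would carry all the shared columns.
final class MetadataSourceSketch extends MetadataBaseSketch {
    // No source-specific fields are needed for this sketch.
}
```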

jasonvena · January 7, 2016, 5:24am

Hi Mikko, I do see what you’re saying but the diagram notation used (Information Engineering or IE) has standardized meanings and unfortunately doesn’t indicate the desired extension relationship.

Something I’ve come to appreciate from my draft relational data model above (and in thinking about Rafal’s feedback) is that we might want to start diagramming at a higher level of abstraction in the system using a more generalized notation like UML. We really can’t express ORM well using an IE diagram, and since the relationship between metadata_term and openmrs_metadata is not determined by relational db constraints at all (see Rafal’s 3rd bullet above), there’s not much of a story to tell unless we frame it in a bigger context.

As there are still design decisions outstanding, I’ll continue to think about possible ways to communicate diagrammatically in a standards-based way, which hopefully would make life easier for everybody!

kosmik · January 7, 2016, 7:42am

I agree. A UML class diagram would support communication much better, as a relational diagram is in my opinion more useful when discussing db implementation details.

There might be a lesson to be learned here. When this project landed on my desk I made the naive assumption that the (undocumented) functional requirements had already been so well thought out that we only needed to focus on the implementation details. Now that we have hit a couple of uncertainties/disagreements, it would be very helpful if someone had documented the original requirements from which the initial implementation design evolved.

raff · January 7, 2016, 11:31am

I see 2 possibilities:

Some modules may install their own metadata, which is typically done in the module’s activator on startup. In such a case, a module will install metadata only if a mapping does not yet exist in the system. After installing the metadata in its activator, the module can create both terms and mappings using API calls to MetadataMappingService. If the metadata cannot be installed for any reason, e.g. a duplicate-name validation error, then the module may decide to create just the terms and let an administrator set mappings to metadata that already exists in the system.
Some modules will only come with terms (added on startup from the module’s activator), expecting an administrator to set mappings to metadata that exists in the system.
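
A rough Java sketch of both possibilities follows. It uses a hypothetical in-memory store, `MappingStoreSketch`, in place of MetadataMappingService, whose real method signatures are not shown in this thread; the `installer` callback is likewise an assumption standing in for whatever installs the module's metadata.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical in-memory store; NOT the MetadataMappingService API.
final class MappingStoreSketch {
    // "source:code" -> metadata uuid; a null value means "term without mapping".
    private final Map<String, String> uuidByTerm = new HashMap<>();

    private static String key(String source, String code) {
        return source + ":" + code;
    }

    boolean hasTerm(String source, String code) {
        return uuidByTerm.containsKey(key(source, code));
    }

    boolean hasMapping(String source, String code) {
        return uuidByTerm.get(key(source, code)) != null;
    }

    void createTerm(String source, String code) {
        if (!hasTerm(source, code)) {
            uuidByTerm.put(key(source, code), null);
        }
    }

    void setMapping(String source, String code, String uuid) {
        uuidByTerm.put(key(source, code), uuid);
    }

    String getMapping(String source, String code) {
        return uuidByTerm.get(key(source, code));
    }
}

final class ActivatorSketch {
    // Possibility 1: install metadata only if no mapping exists yet.
    // The installer returns the new metadata's uuid, or null if installation failed.
    static void startedWithOwnMetadata(MappingStoreSketch store, String source,
                                       String code, Supplier<String> installer) {
        if (store.hasMapping(source, code)) {
            return; // already mapped, e.g. by an administrator or an earlier startup
        }
        store.createTerm(source, code);
        String uuid = installer.get();
        if (uuid != null) {
            store.setMapping(source, code, uuid);
        }
        // On failure the term stays unmapped for an administrator to configure.
    }

    // Possibility 2: the module only ships terms; mappings are set by an administrator.
    static void startedWithTermsOnly(MappingStoreSketch store, String source, String code) {
        store.createTerm(source, code);
    }
}
```

Note that both paths depend on the mapping being optional: a term must be able to exist before anything is mapped to it.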

Are there any other outstanding design issues that I am missing?

kosmik · January 7, 2016, 8:15pm

Thanks, this helps a lot. Actually I went so far as to create a wiki page for gathering these kinds of use cases or workflows: Metadata Mapping - Ideas for Use Cases - Documentation - OpenMRS Wiki

Feel free to write your ideas and comments there. I know this comes a bit late in the process but I thought we would still benefit from any kind of written descriptions of what this module is about and what is out of scope. So why not give it a try? Am I making any sense to any of you out there?

Excluding sets, I can’t think of anything urgent except for @darius’s post which leaves some things hanging in the air as no one has commented on them:

raff · January 13, 2016, 12:13pm

@wyclif, @burke, would you like to comment on the above?

wyclif · January 13, 2016, 9:41pm

I recall we talked about this on some design call. As Darius says, these mappings seem to differ from concept mappings, which gets confusing at some point because it becomes hard to distinguish them from tagging. We also never seemed to come to an agreement when it comes to the sets.

kosmik · January 23, 2016, 9:57am

I started another thread for access control related design. @raff, care to comment?

jasonvena · January 25, 2016, 6:36am

Good idea. I just added a use case diagram there to try to provide some perspective on what the module’s requirements look like.

Hopefully at some point we can generalize that page into something like a “design playground” for the module, a place to exchange design/purpose/scope ideas informally and prior to any official design documentation. To Wyclif’s point, I agree that the relationships between the existing metadata-associated initiatives are confusing - having a place to “think out loud” about them could help!

kosmik · February 1, 2016, 5:48pm

Great stuff!

kosmik · February 1, 2016, 5:49pm

Ok, let’s try to pull a release together.

The only remaining confusion seems to be whether a MetadataTermMapping’s reference to an OpenmrsMetadata should be optional or not. I say we make it optional, publish an alpha release and see how it goes. If everyone is happy we do a proper release, otherwise we try something else.

We have 3 tickets in design/progress for version 1.1.0: https://issues.openmrs.org/issues/?jql=fixVersion%20%3D%201.1.0%20AND%20project%20%3D%20MAP%20ORDER%20BY%20key%20ASC%2C%20status%20DESC

One of those tickets is still unassigned (nudge nudge).