Initializer to be extended to handle data (not just metadata)

mksd · January 12, 2019, 6:51pm

Well, typically Iniz would support any unambiguous reference to an object, and would always support UUIDs at a minimum. As a convenience the name, should it be unique, would be supported as well.
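A minimal sketch of that lookup order, assuming a hypothetical in-memory registry rather than the real OpenMRS services: the reference is tried as a UUID first, then as a name, and a name only resolves when exactly one object carries it. `SketchMetadata`, `SketchRegistry`, and `resolve` are illustrative names, not Iniz API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a piece of metadata such as an identifier type.
class SketchMetadata {
    final String uuid;
    final String name;

    SketchMetadata(String uuid, String name) {
        this.uuid = uuid;
        this.name = name;
    }
}

// Hypothetical registry; a real implementation would query OpenMRS services instead.
class SketchRegistry {
    private final List<SketchMetadata> items = new ArrayList<>();

    void add(SketchMetadata item) {
        items.add(item);
    }

    // Resolves a reference by UUID first, then by name if that name is unique.
    Optional<SketchMetadata> resolve(String ref) {
        for (SketchMetadata item : items) {
            if (item.uuid.equals(ref)) {
                return Optional.of(item);
            }
        }
        SketchMetadata match = null;
        for (SketchMetadata item : items) {
            if (item.name.equals(ref)) {
                if (match != null) {
                    return Optional.empty(); // ambiguous name: refuse to guess
                }
                match = item;
            }
        }
        return Optional.ofNullable(match);
    }
}
```

With this shape, a CSV cell can hold either form, and an ambiguous name fails the same way as an unknown one instead of silently picking an object.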

bistenes · January 14, 2019, 4:50pm

I’m working on a patient importer today. Ultimately I think Iniz should itself be configurable, since Identifiers and Attributes and a number of other things will vary by implementation. But for now I’ll implement per Dimitri’s recommendation.

mksd · January 14, 2019, 6:21pm

Cool, great @bistenes, looking forward to your PR. I guess that’ll be the chance to move it to /openmrs soon as well on GitHub.

bistenes · January 16, 2019, 3:35pm

Here’s the CSV spec, lemme know if anything should be different: patientsSpec.csv (846 Bytes)

I want to write some tests. I’m planning on writing a DomainPatientInitializerServiceTest, since it seems like there’s one of those for all of the domains.

mogoodrich · January 16, 2019, 4:09pm

I have been meaning to weigh in as well, looks like I had the same comment re: referencing identifier types… agree with @mksd that it should always support uuids, but that in cases like this it makes sense to support names as well, to make the CSV files actually human-readable.

Ideally, we also would support Metadata Mappings, which was developed to overcome the fact that metadata names are (generally) used for display and we’d like them to be tweakable/editable while on the other hand uuids are unique but not human-readable:

https://wiki.openmrs.org/display/docs/Metadata+Mapping+Module

There might be a bit of a chicken-and-egg thing going on here though between Metadata Mapping and Metadata Deploy… looks like I added support to Metadata Mapping to allow deploying Mappings via Metadata Deploy (see my commits on Metadata Mapping from Jan 2017), but I’d also like Metadata Deploy to be able to reference metadata by mapping… (We may have had this discussion back in 2017, but I forget the details…)

Take care, Mark

mogoodrich · January 16, 2019, 4:18pm

fyi @mksd @bistenes ^^ just in case the Talk alert went out when I mistakenly posted only the first line of the above post…

bistenes · January 16, 2019, 6:39pm

@mksd Also, how do you feel about JodaTime?

mksd · January 16, 2019, 7:57pm

Ok first of all I saw that you included only a subset of Person's members to be covered (which is for the most part what a Patient is as you know), was that because you only wanted to bring attention to a couple of more complicated ones: address, person attribute types and patient identifier types? (it’s person attributes btw).

I think the way you’ve dealt with those is good. However for the labels you could just go for a user friendly natural reading in Iniz. I wouldn’t do name.given but instead Given name, … etc.

We could just go ahead with a ‘patient’ domain, but what about a ‘person’ domain? If we wanted to be smart and ensure that most of the parsing for one can be used with the other, then this requires some thinking. Of course if we go and shoot straight to nailing the ‘patient’ domain then that’s easier. I would be ok with this for now, it’s surely sufficient.

I think Joda-Time is fine, I thought it was a Core dependency actually, but it isn’t. Anyway if you need to include it just do so.

bistenes · January 16, 2019, 8:13pm

Right now I’m implementing the fields that I need. To implement everything in Person & Patient would be a pretty big project.

But I suppose I should design these field specs so that when someone wants to make the CSV parser more powerful, to take advantage of more OpenMRS fields, they don’t have to wrestle with backwards compatibility. I think the only place where that’s an issue here is the spec for name, which I suppose will have to be more like

names
nameField:value,nameField:value;...
e.g.  "givenName:Brandon,familyName:Istenes;givenName:Brandleberry,familyNameSuffix:Jr"

Should I implement in that way?
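For illustration, a rough sketch of how a cell in that format could be parsed, assuming `;` separates names, `,` separates fields, and `:` separates a field from its value. `NamesCellParser` is a hypothetical helper, not existing Iniz code, and the sketch makes no attempt to escape separators inside values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical parser for the proposed "names" cell format.
class NamesCellParser {

    // Returns one field-to-value map per name, in the order the names appear in the cell.
    static List<Map<String, String>> parse(String cell) {
        List<Map<String, String>> names = new ArrayList<>();
        if (cell == null || cell.trim().isEmpty()) {
            return names;
        }
        for (String nameSpec : cell.split(";")) {
            if (nameSpec.trim().isEmpty()) {
                continue; // tolerate a stray separator
            }
            Map<String, String> fields = new LinkedHashMap<>();
            for (String pair : nameSpec.split(",")) {
                int colon = pair.indexOf(':');
                if (colon < 0) {
                    throw new IllegalArgumentException(
                            "Expected nameField:value but got '" + pair + "'");
                }
                fields.put(pair.substring(0, colon).trim(),
                        pair.substring(colon + 1).trim());
            }
            names.add(fields);
        }
        return names;
    }
}
```

Because each field is keyed by name, a later version could accept familyNameSuffix or degree without changing how existing cells are read.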

Good thought about Person/Patient parsers. I think it should be easy and good to create a Person domain and have PatientLineProcessor.fill call PersonLineProcessor.fill.
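A sketch of that delegation, using simplified stand-in classes instead of the real OpenMRS Person and Patient and a plain map instead of a CSV line. Every class name, the `Identifier` column header, and the method signatures are assumptions made for the example.

```java
import java.util.Map;

// Simplified stand-ins for the OpenMRS Person and Patient classes.
class SketchPerson {
    String givenName;
    String familyName;
}

class SketchPatient extends SketchPerson {
    String identifier;
}

// Fills the members that every person has.
class SketchPersonLineProcessor {
    void fill(SketchPerson person, Map<String, String> line) {
        person.givenName = line.get("Given name");
        person.familyName = line.get("Family name");
    }
}

// Reuses the person logic, then adds what is specific to patients.
class SketchPatientLineProcessor {
    private final SketchPersonLineProcessor personProcessor = new SketchPersonLineProcessor();

    void fill(SketchPatient patient, Map<String, String> line) {
        personProcessor.fill(patient, line);       // shared person columns
        patient.identifier = line.get("Identifier"); // patient-only column (hypothetical header)
    }
}
```

The patient processor never parses name columns itself, so a change to the person parsing would apply to both domains.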

Oh yeah, and I realized personAttributes won’t be included in this first version either, since we don’t use them at all in Mexico… everything is a registration encounter concept for us. But we will need that for Haiti down the line.

mksd · January 16, 2019, 8:25pm

“Good thought about Person/Patient parsers. I think it should be easy and good to create a Person domain and have PatientLineProcessor.fill call PersonLineProcessor.fill.”

+1, totally.

Mmm, it looks like PersonName is just always those three fields, so I would tend to think that it’d be more usable and readable to have three columns to represent it. It’s not challenging like an address where the address structure could change from person to person.

bistenes · January 16, 2019, 8:32pm

We also have familyNamePrefix, familyNameSuffix, familyName2, and degree. And also a Person can have arbitrarily many names. If we want to exclude the possibility of these other types of name fields, I suppose we could do first:middle:last;.... But that would make backwards compatibility difficult when someone does eventually want familyNameSuffix or whatever.

mksd · January 16, 2019, 8:33pm

For some reason though it looks like a person can hold multiple names… see here. I wonder why that is…

Anyway each name section could be a list:

…	Given name	Middle name	Family name	…
…	John, Jack	, The	Doe, Ripper	…
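A rough sketch of how such list-valued columns could be read, assuming commas separate the entries and that position ties the three columns together. `ParallelNameColumns` is a hypothetical helper. The example row above would yield John Doe (no middle name) and Jack The Ripper.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that zips the three list-valued name columns into whole names.
class ParallelNameColumns {

    // Splits a cell on commas; the -1 limit keeps trailing empty entries such as "John,".
    static String[] entries(String column) {
        String[] parts = column.split(",", -1);
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();
        }
        return parts;
    }

    // Returns one {given, middle, family} triple per name; empty strings mark absent parts.
    static List<String[]> zip(String given, String middle, String family) {
        String[] g = entries(given);
        String[] m = entries(middle);
        String[] f = entries(family);
        if (g.length != m.length || g.length != f.length) {
            throw new IllegalArgumentException(
                    "Name columns must list the same number of entries");
        }
        List<String[]> names = new ArrayList<>();
        for (int i = 0; i < g.length; i++) {
            names.add(new String[] { g[i], m[i], f[i] });
        }
        return names;
    }
}
```

The length check is what keeps the columns honest: a missing middle name still needs its empty slot, as in the `, The` cell.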

bistenes · January 16, 2019, 8:36pm

My feeling is that parallel arrays would be unpleasant both to write code for and to use.

mksd · January 16, 2019, 8:36pm

I would start with a “base” person line processor that only cares about first, middle and family names. It will always be possible later to make it process more members of PersonName. At least I would try to do something along those lines, which should be possible since as many line processors as wanted can be combined (based on a version if I remember correctly).

mksd · January 16, 2019, 8:38pm

True, but I’m betting that the multiple names person case should be uncommon to say the least.

bistenes · January 16, 2019, 8:40pm

That’s fair. And good point about being able to add other fields later.

As an aside, is this how nicknames are stored?

mksd · January 16, 2019, 8:52pm

In which distribution?

Without thinking too much I would store a nickname as a person attribute. But for sure if given name and family name are not mandatory together, using a second person name is also an option, although, how would you know which one is the nickname? Java sets don’t provide any order guarantee (cfr here).
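To illustrate the ordering point: since a set gives no stable notion of “the second name”, telling the nickname apart would need an explicit marker on the name itself. A minimal sketch, assuming a hypothetical `SketchName` class with a `nickname` flag (neither is part of the OpenMRS data model) and at most one flagged name per person:

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical name holder; the 'nickname' flag is an assumption made for this sketch.
class SketchName {
    final String value;
    final boolean nickname;

    SketchName(String value, boolean nickname) {
        this.value = value;
        this.nickname = nickname;
    }
}

class NicknameFinder {

    // Looks for the flagged entry rather than relying on a position in the set.
    static Optional<String> find(Set<SketchName> names) {
        for (SketchName name : names) {
            if (name.nickname) {
                return Optional.of(name.value);
            }
        }
        return Optional.empty();
    }
}
```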

mogoodrich · January 17, 2019, 12:07am

In the PIH EMR our Haiti sites wanted a nickname but didn’t need a middle name, so we just used the middle name field… it’s really just the display name of the middle name field that we set, otherwise it’s just a middle name iirc:

github.com/PIH/openmrs-module-pihcore/blob/master/api/src/main/java/org/openmrs/module/pihcore/setup/NameTemplateSetup.java#L21

public class NameTemplateSetup {

    public static void configureNameTemplate(NameSupport nameSupport) {

        NameTemplate nameTemplate = new NameTemplate();
        nameTemplate.setCodeName("short");  // we are redefining the short name template for use in our context

        Map<String,String> nameMappings = new HashMap<String, String>();
        nameMappings.put("givenName", "zl.givenName");
        nameMappings.put("familyName", "zl.familyName");
        nameMappings.put("middleName", "zl.nickname");
        nameTemplate.setNameMappings(nameMappings);

        Map<String,String> sizeMappings = new HashMap<String, String>();
        sizeMappings.put("givenName", "50");
        sizeMappings.put("familyName", "50");
        sizeMappings.put("middleName", "50");
        nameTemplate.setSizeMappings(sizeMappings);

        List<String> lineByLineFormat = new ArrayList<String>();
        lineByLineFormat.add("familyName,");

I agree with just supporting a single name with the basic fields for now.

For reference, multiple names aren’t super uncommon… looking at our Mirebalais DB I see on the order of thousands of patients with multiple names (assuming I did my query right). The two common use cases are:

Merging two patients together
Updating a patient name (assuming we save the old name when we do this… I’d need to check to confirm)

Again, though, I don’t think we need to support this, because I think in most cases when importing data we’d have a single name per person… or would want to take the opportunity to go through a data cleaning exercise anyway to get down to a single name.

Take care, Mark

mksd · January 17, 2019, 8:08am

Thanks @mogoodrich, I realise that this is the most likely reason why there is a set of PersonName for each Person in the data model.

bistenes · January 17, 2019, 4:04pm

@mksd I opened a WIP PR, could you let me know if everything looks like it’s on the right track? https://github.com/mekomsolutions/openmrs-module-initializer/pull/1