PUTing instead of POSTing when uploading FHIR data to a FHIR store

Regarding the way we upload resources to a FHIR store: we currently do a POST, i.e. POST : fhir_base_url/Patient
Note that POSTing has the following downsides:

  • The HAPI FHIR server will generate a unique internal ID for the new resource; this can be sequential or a UUID, depending on the server_id_mode.
  • This can lead to resource duplication, because even when the same resource is POSTed several times, the server will simply assign it a new internal ID each time.
  • The final resource ID in the HAPI FHIR server will be different from the original client (OpenMRS) ID, which is a UUID.

I would propose we do a PUT instead:
PUT : fhir_base_url/Patient/patient_uuid
i.e.
client.update().resource(resource).withId(openmrsUuid).execute()
Note that PUTing has the following benefits:

  • The HAPI FHIR server will persist the client ID, so we keep the same resource IDs in HAPI FHIR as in OpenMRS.
  • This guards against duplicating data in HAPI FHIR: a new resource is created only if the resource ID doesn’t exist yet; otherwise the server simply updates the existing resource, or does nothing if the resource has not been modified.
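
At the HTTP level, the two calls differ only in the verb and the target URL. Here is a minimal sketch using the JDK's `java.net.http` request builder rather than the HAPI client; the base URL, the example UUID, and the class and method names are all assumptions made up for illustration. The requests are only built, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class FhirUploadRequests {
    // Hypothetical base URL, for illustration only.
    static final String FHIR_BASE_URL = "http://localhost:8080/fhir";

    /** POST to the type endpoint: the server picks the resource id. */
    static HttpRequest createRequest(String resourceType, String json) {
        return HttpRequest.newBuilder(URI.create(FHIR_BASE_URL + "/" + resourceType))
                .header("Content-Type", "application/fhir+json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    /** PUT to the instance endpoint: the client-supplied id is part of the URL. */
    static HttpRequest updateRequest(String resourceType, String id, String json) {
        return HttpRequest.newBuilder(URI.create(FHIR_BASE_URL + "/" + resourceType + "/" + id))
                .header("Content-Type", "application/fhir+json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical OpenMRS UUID; for a PUT the body id must match the URL id.
        String uuid = "3f2504e0-4f89-11d3-9a0c-0305e82c3301";
        String json = "{\"resourceType\":\"Patient\",\"id\":\"" + uuid + "\"}";

        HttpRequest post = createRequest("Patient", json);
        HttpRequest put = updateRequest("Patient", uuid, json);
        System.out.println(post.method() + " " + post.uri());
        System.out.println(put.method() + " " + put.uri());
    }
}
```

Whether the server accepts a PUT to an id it has never seen depends on the server and its configuration, so this only shows the shape of the requests.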

cc @bashir @ibacher @akimaina @k.joseph

I think a PUT makes sense, @mozzy. Have you tested the two requests against HAPI FHIR and confirmed this behavior? Are there any cases where you would consider a POST?

Yes, sure.

Probably when the client doesn’t want to trouble itself with generating the resource ID.

It’s worth noting that this will work, but in somewhat limited circumstances (i.e., I’m not sure most FHIR servers support client-generated resource ids). I don’t see any problem with this approach for the PLIR work, but we should flag this as an issue for talking to a generic FHIR end-point.

The more generic way of handling things would be to take advantage of FHIR’s conditional update functionality to perform something closer to an upsert operation; the down-side is that this may require per-resource specific handling and some sensible business identifier for every resource (the advantage is that we could use something like OpenMRS UUID as that business identifier).
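
For reference, a conditional update is a PUT against the resource type with search criteria in place of an id. In FHIR R4 the server creates the resource if nothing matches, updates it if exactly one resource matches, and rejects the request if several match. Here is a rough sketch of building such a request, assuming the OpenMRS UUID is stored as an `identifier` under a made-up system URI; the base URL, system URI, and names below are hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

class ConditionalUpdateSketch {
    // Both constants are hypothetical values chosen for illustration.
    static final String FHIR_BASE_URL = "http://localhost:8080/fhir";
    static final String UUID_SYSTEM = "urn:example:openmrs-uuid";

    /** Builds PUT [base]/[type]?identifier=[system]|[value] (an upsert keyed on a business identifier). */
    static HttpRequest conditionalUpdate(String resourceType, String openmrsUuid, String json) {
        // The token "system|value" must be URL-encoded ('|' becomes %7C).
        String token = URLEncoder.encode(UUID_SYSTEM + "|" + openmrsUuid, StandardCharsets.UTF_8);
        URI uri = URI.create(FHIR_BASE_URL + "/" + resourceType + "?identifier=" + token);
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/fhir+json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        String uuid = "3f2504e0-4f89-11d3-9a0c-0305e82c3301"; // hypothetical OpenMRS UUID
        String json = "{\"resourceType\":\"Patient\",\"identifier\":[{\"system\":\""
                + UUID_SYSTEM + "\",\"value\":\"" + uuid + "\"}]}";
        HttpRequest request = conditionalUpdate("Patient", uuid, json);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

The per-resource handling mentioned above shows up here as the choice of search parameter: `identifier` works for Patient, but each resource type would need its own sensible criterion.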

As others have suggested and in line with FHIR’s RESTful API docs:

  • PUT should be used to update a resource
  • POST should be used to create a resource unless “the client wishes to have control over the id of a newly submitted resource”

Client-generated IDs (i.e., UUIDs) are useful when independent clients are generating universally unique resources independently and have the ability to create proper Version 4 UUIDs. Resources that are shared across clients (i.e., most resources like Patients, Encounters) are probably better created centrally (i.e., using a POST) to ensure uniqueness. Any clients generating novel data locally might want to create the ID for resources, but should only do so with a proper uuid module (i.e., not using a bespoke uuid generator depending on Math.random).[ref]
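
As a small illustration of the "proper uuid module" point, on the JVM `java.util.UUID.randomUUID()` produces version 4 UUIDs from a cryptographically strong random source. The class and method names in this sketch are made up for the example.

```java
import java.util.UUID;

class ClientIdSketch {
    /** Returns a new random (version 4) UUID as a string, suitable as a client-assigned id. */
    static String newResourceId() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        String id = newResourceId();
        // Parse it back to confirm the version field.
        System.out.println(id + " (version " + UUID.fromString(id).version() + ")");
    }
}
```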

I also think that using PUT is fine to preserve resource IDs and prevent duplicate resources. As a matter of fact, in the first versions of the Analytics pipelines, we were configuring the GCP FHIR store with enableUpdateCreate, and in FhirStoreUtil we were using PUT for similar reasons. This was changed when we migrated to using the HAPI FHIR client here.

Adding @pmanko as an FYI to make sure changing this behavior is fine for your use-case.

In addition, when we reference other resources, say in the Observation resource,
e.g.

{
    "resourceType": "Observation",
    ......
    "subject": {
        "reference": "Patient/client_id",
        "type": "Patient",
        "display": "obs_patient2 obs_patient2 obs_patient2 (OpenMRS ID: 10003P)"
    }
}

we reference the resource ID that was originally created by the client.

When HAPI doesn’t find a resource with the referenced ID, it creates a new dummy resource to act as the reference target,

e.g.

{
    "resourceType": "Patient",
    "id": "client_id",
    "meta": {
        "versionId": "1",
        "lastUpdated": "2021-01-27T09:04:30.491+00:00"
    },
    "text": {
        "status": "generated",
        "div": "<div xmlns=\"http://www.w3.org/1999/xhtml\"><table class=\"hapiPropertyTable\"><tbody/></table></div>"
    }
}

So we end up with unnecessary duplication.

To solve this, we just have to implement PUT in both streaming and batch modes, that is, when uploading a single resource or a bundle.

When we upload resources that reference other resources:

  • In the POST case,
    since it is HAPI that generates the resource ID,
    it will automatically generate a new dummy resource for the referenced resource.

    Even when the actual referenced resource is POSTed afterwards, HAPI will still create it as a new resource with a new resource ID.

    Note that when the _include param is added to a FHIR search to include the referenced resources, e.g. _include=Observation:patient,
    the result will include only the dummy resources that were created, not the actual resources.

    See the real example I pasted here.

  • In the PUT case,
    since HAPI persists the client resource ID, if HAPI doesn’t find the referenced resource ID, it will generate a dummy resource. But later, when the actual referenced resource is PUT, HAPI will simply update the dummy resource it created, since they have the same ID.

    This way we avoid a lot of duplication in the FHIR store.
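
The difference between the two cases can be seen in a toy in-memory model. To be clear, this is not HAPI code and does not call any FHIR server: it is a made-up `ToyFhirStore` class that only mimics the behaviour described above (server-assigned ids on POST, create-or-replace on PUT, and a stub created for a missing reference target).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

class ToyFhirStore {
    // Key is "ResourceType/id", value is the stored body.
    private final Map<String, String> resources = new LinkedHashMap<>();

    /** POST-like: always stores under a new server-assigned id. */
    String create(String type, String body) {
        String id = UUID.randomUUID().toString();
        resources.put(type + "/" + id, body);
        return id;
    }

    /** PUT-like: creates or replaces whatever is stored under the client id. */
    void update(String type, String id, String body) {
        resources.put(type + "/" + id, body);
    }

    /** Mimics the dummy resource: stores an empty stub if the target is missing. */
    void ensureReferenceTarget(String reference) {
        resources.putIfAbsent(reference, "{}");
    }

    int size() { return resources.size(); }

    String get(String reference) { return resources.get(reference); }

    public static void main(String[] args) {
        String patient = "{\"resourceType\":\"Patient\"}";
        String obs = "{\"resourceType\":\"Observation\",\"subject\":{\"reference\":\"Patient/p1\"}}";

        // POST flow: the stub Patient/p1 and the real patient end up as two entries.
        ToyFhirStore posted = new ToyFhirStore();
        posted.ensureReferenceTarget("Patient/p1");
        posted.create("Observation", obs);
        posted.create("Patient", patient);
        System.out.println("POST flow entries: " + posted.size());

        // PUT flow: the real patient overwrites the stub because the ids match.
        ToyFhirStore put = new ToyFhirStore();
        put.ensureReferenceTarget("Patient/p1");
        put.update("Observation", "o1", obs);
        put.update("Patient", "p1", patient);
        System.out.println("PUT flow entries: " + put.size());
    }
}
```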

NOTE: I am referencing HAPI because it is at least a complete implementation of HL7 FHIR.

Thanks @mozzy for further investigation and yes in the first prototype, we were also disabling Referential Integrity check on GCP FHIR Store (here). We wanted to keep the original IDs to be able to join different resources, e.g., Patient and Observation.

So to do this POST to PUT change, we just need to change the create() here to update(); is that right?

Seems like a very long conversation to just change one function call! :)

Sure, and also updating the batch mode to PUT the individual resources instead of POSTing them.
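
For the batch side, each entry in a FHIR Bundle carries its own `request.method` and `request.url`, so PUTing individual resources inside a bundle comes down to what is written in those two fields. Below is a minimal sketch that assembles such a bundle as plain JSON strings, with no FHIR library involved; the class name, method names, and ids are hypothetical, and real code would more likely build the bundle through the FHIR client's model classes.

```java
import java.util.ArrayList;
import java.util.List;

class PutBundleSketch {
    /** One bundle entry asking the server to PUT the resource at type/id. */
    static String putEntry(String type, String id, String resourceJson) {
        return "{\"resource\":" + resourceJson
                + ",\"request\":{\"method\":\"PUT\",\"url\":\"" + type + "/" + id + "\"}}";
    }

    /** Wraps the entries in a Bundle; "batch" would be the other possible bundle type. */
    static String transactionBundle(List<String> entries) {
        return "{\"resourceType\":\"Bundle\",\"type\":\"transaction\",\"entry\":["
                + String.join(",", entries) + "]}";
    }

    public static void main(String[] args) {
        List<String> entries = new ArrayList<>();
        // Hypothetical ids; each resource body carries the same id as its request URL.
        entries.add(putEntry("Patient", "p-1",
                "{\"resourceType\":\"Patient\",\"id\":\"p-1\"}"));
        entries.add(putEntry("Observation", "o-1",
                "{\"resourceType\":\"Observation\",\"id\":\"o-1\","
                + "\"status\":\"final\",\"code\":{\"text\":\"example\"},"
                + "\"subject\":{\"reference\":\"Patient/p-1\"}}"));
        System.out.println(transactionBundle(entries));
    }
}
```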

Created a new issue to handle this.