🚨 Investigating Timeout Bottlenecks in OpenMRS Endpoints

jayg · June 30, 2025, 7:56pm

Hey everyone!

As part of my @GSOC project with @OpenMRS, I’ve been working on improving the performance of the OpenMRS 3.x backend. In this thread, I’ll walk through specific API endpoints that are facing timeout issues, backed by metrics from our Gatling testing reports.

Report: https://o3-performance.openmrs.org/index.html

URL 1:

/openmrs/ws/rest/v1/patient – Get Patients

Used to fetch patient data with detailed fields.

Request URL:

/openmrs/ws/rest/v1/patient?q=SEARCH_QUERY&v=custom:(patientId,uuid,identifiers,display,patientIdentifier:(uuid,identifier),person:(gender,age,birthdate,birthdateEstimated,personName,addresses,display,dead,deathDate),attributes:(value,attributeType:(uuid,display)))&includeDead=false&limit=50&totalCount=true

Timeout %: 28.64%

Median Response Time: ~51s

URL 2:

/openmrs/ws/rest/v1/emrapi/inpatient/request – Get Inpatient Request

Used to fetch inpatient data related to admission/transfer requests.

Request URL:

/openmrs/ws/rest/v1/emrapi/inpatient/request?dispositionType=ADMIT,TRANSFER&dispositionLocation=LOCATION_UUID&v=custom:(dispositionLocation,dispositionType,disposition,dispositionEncounter:full,patient:(uuid,identifiers,voided,person:(uuid,display,gender,age,birthdate,birthtime,preferredName,preferredAddress,dead,deathDate)),dispositionObsGroup,visit)

Timeout %: 4.05%

95th Percentile: ~49s

URL 3:

/openmrs/ws/fhir2/R4/Observation – Get Lab Results of Patient

Fetches lab observations via FHIR for a given patient.

Request URL:

/openmrs/ws/fhir2/R4/Observation?category=laboratory&patient=PATIENT_UUID&_count=100&_summary=data

Timeout %: 2.95%

Max Response Time: ~60s

URL 4:

/openmrs/ws/fhir2/R4/Observation – Get Patient Observations

Returns filtered clinical observations for the patient using concept codes.

Request URL:

/openmrs/ws/fhir2/R4/Observation?subject:Patient=PATIENT_UUID&code=OBSERVATION_CODES&_summary=data&_sort=-date&_count=100

Timeout %: 2.93%

99th Percentile: ~60s
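For readers comparing these figures, here is a minimal sketch of how such metrics can be derived from raw response times. It assumes a request counts as timed out at 60s and that timed-out requests are recorded with a duration at or above that cutoff. `summarize` and `percentile` are hypothetical helpers, and the nearest-rank percentile used here may not match Gatling's own calculation exactly.

```python
import math
import statistics

TIMEOUT_MS = 60_000  # assumption: requests are cut off at 60s


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize(response_times_ms):
    """Summarize one request's response times (milliseconds)."""
    values = sorted(response_times_ms)
    timed_out = sum(1 for v in values if v >= TIMEOUT_MS)
    return {
        "timeout_pct": round(100 * timed_out / len(values), 2),
        "median_ms": statistics.median(values),
        "p95_ms": percentile(values, 95),
        "p99_ms": percentile(values, 99),
        "max_ms": values[-1],
    }
```

A high timeout percentage pulls the upper percentiles toward the cutoff, which is why the 99th percentile and the max both sit near 60s for the FHIR endpoints.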

A Key Insight

In our current setup, only 250 patients are repeatedly used across simulations. Over time, this leads to data buildup per patient — including clinical forms, observations, and historical records.

For some endpoints (like observations and lab results), this accumulated load per patient could be a major contributor to timeouts.
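As a rough illustration of the buildup, the sketch below estimates how many records each patient carries after repeated runs. Only the pool size of 250 comes from our setup. The run count, the records created per run, and the larger pool size are hypothetical values chosen to show the shape of the effect, and the even spread across patients is also an assumption.

```python
def records_per_patient(runs, records_created_per_run, pool_size):
    """Average records accumulated per patient after `runs` runs,
    assuming new records are spread evenly over the patient pool."""
    return runs * records_created_per_run / pool_size


# Hypothetical workload: 100 runs, each creating 5,000 records.
small_pool = records_per_patient(runs=100, records_created_per_run=5_000, pool_size=250)
large_pool = records_per_patient(runs=100, records_created_per_run=5_000, pool_size=10_000)
print(small_pool, large_pool)  # 2000.0 50.0
```

The same workload produces far more rows per patient with the small pool, so per-patient queries such as observations and lab results have more data to scan with every run.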

But it’s also important to note: not all slowdowns are due to this.

@dkayiwa @ibacher @jayasanka @bawanthathilan

dkayiwa · June 30, 2025, 11:02pm

Does the timeout percentage depend on the search query parameter?

jayg · July 1, 2025, 4:37am

We use “jay” as a search query parameter and it stays static.

dkayiwa · July 1, 2025, 5:49pm

Are we able to look at backend logs for timeouts?

raff · July 2, 2025, 9:22am

A timeout happens when a request takes over 60s to respond. It’s very likely that the pull request “O3-4765: Improved FHIR Get Lab Results endpoint performance” (by rkorytkowski, Pull Request #569, openmrs/openmrs-module-fhir2 on GitHub) will improve performance for FHIR endpoints that use Concepts, since it introduced a cache for the Concept resource, so I would re-test once it is merged.
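To illustrate why caching a Concept lookup can help, here is a minimal sketch. It is not the code from the pull request (the fhir2 module is Java). `load_concept_from_db`, `get_concept` and `translate_observations` are hypothetical names, and the slow lookup is only simulated with a counter.

```python
from functools import lru_cache

lookups = {"count": 0}  # counts how often the expensive path runs


def load_concept_from_db(concept_uuid):
    """Hypothetical stand-in for an expensive Concept lookup."""
    lookups["count"] += 1
    return {"uuid": concept_uuid, "display": f"Concept {concept_uuid}"}


@lru_cache(maxsize=1024)
def get_concept(concept_uuid):
    """Cached wrapper: each distinct UUID hits the slow path only once."""
    return load_concept_from_db(concept_uuid)


def translate_observations(observations):
    """Resolve the concept of every observation through the cache."""
    return [
        {"id": obs["id"], "code": get_concept(obs["concept_uuid"])["display"]}
        for obs in observations
    ]
```

When a patient has many observations that share a handful of concepts, the number of expensive lookups drops from one per observation to one per distinct concept.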

jayg · July 2, 2025, 2:43pm

The server logs are available for the respective runs in the links below.

The runs named Run Performance Tests are the ones to open; once a run is open, the logs are in its artifacts (server-logs).

jayg · July 31, 2025, 3:28pm

FHIR Requests Used in Performance Tests

Hi team,

As part of our ongoing performance evaluation, below is a compiled list of all FHIR requests currently used in our performance testing suite:

Location-related Requests

GET /Location?_tag=Transfer+Location&partof:below={BED_ASSIGNMENT_UUID}&_count=15&_getpagesoffset=0 – Get Transferable Locations
GET /Location?_count=1&_summary=data – Get Locations Search Set
GET /Location?_summary=data&_count=50&_tag=Login+Location – Get Locations

Patient-related Requests

GET /Patient/{patientUuid}?_summary=data – Get Patient Summary Data

Medication-related Requests

GET /MedicationRequest?encounter={encounterUuid}&_revinclude=MedicationDispense:prescription&_include=MedicationRequest:encounter&_summary=data – Get Specific Medication Requests
GET /MedicationRequest/{medicationRequestUuid}?_summary=data – Get Medication Request by UUID
GET /Medication/{medicationUuid}?_summary=data – Get Medication by UUID
GET /Medication?code={code}&_summary=data – Search Medication by Code

Observation and Condition Requests

GET /Observation?subject:Patient={patientUuid}&code={codesParam}&_summary=data&_sort=-date&_count=100 – Get Observations for Patient
GET /Observation?category=laboratory&patient={patientUuid}&_count=100&_summary=data – Get Lab Results of Patient
GET /Condition?patient={patientUuid}&_count=100&_summary=data – Get Patient Conditions

Allergy & Immunization Requests

GET /AllergyIntolerance?patient={patientUuid}&_summary=data – Get Allergies of Patient (duplicated)
GET /Immunization?patient={patientUuid}&_summary=data – Get Immunizations of Patient

Encounter-related Requests

GET /Encounter?_query=encountersWithMedicationRequests&date=ge{encoded}&_getpagesoffset=0&_count=10&status=active&_summary=data – Get Medication Request Encounters
GET /Encounter?patient={patientUuid}&_sort=-date&_count=1&type={VISIT_NOTE_ENCOUNTER_TYPE_UUID}&_summary=data – Get Latest FHIR Encounter

ValueSet

GET /ValueSet/{valueSetUuid}?_summary=data – Get ValueSet by UUID
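For anyone reproducing these requests outside the test suite, here is a minimal sketch of how the two Observation queries listed above could be assembled. The base path is the one used by the FHIR endpoints in this thread. The function names are hypothetical, and joining the codes with commas is an assumption about what `{codesParam}` contains.

```python
from urllib.parse import urlencode

FHIR_BASE = "/openmrs/ws/fhir2/R4"


def observations_for_patient_url(patient_uuid, codes, count=100):
    """Build the 'Get Observations for Patient' request path."""
    params = [
        ("subject:Patient", patient_uuid),
        ("code", ",".join(codes)),  # assumption: comma-separated codes
        ("_summary", "data"),
        ("_sort", "-date"),
        ("_count", count),
    ]
    # Keep ':' and ',' unescaped so the path matches the listed request.
    return f"{FHIR_BASE}/Observation?{urlencode(params, safe=':,')}"


def lab_results_url(patient_uuid, count=100):
    """Build the 'Get Lab Results of Patient' request path."""
    params = [
        ("category", "laboratory"),
        ("patient", patient_uuid),
        ("_count", count),
        ("_summary", "data"),
    ]
    return f"{FHIR_BASE}/Observation?{urlencode(params)}"
```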

Could you please help identify which of these endpoints are likely to be impacted by the recent changes?

Based on that, I can share:

Previous performance stats for those specific requests
Any other metrics or details you’d like to see for deeper analysis

grace · August 1, 2025, 4:56am

@raff @dkayiwa what do you think?

raff · August 1, 2025, 8:19am

@jayg thanks. Get Lab Results of Patient was specifically addressed in Jira

I’d also check all except for Location-related Requests. Basically anything related to observations and concepts might have been improved by this change.

Ideally we would use a tool like nuxeo/gatling-report on GitHub (parses Gatling simulation.log files to output CSV stats or build HTML reports with Plotly charts) and/or DennisRippinger/gatling-reporter on GitHub (works with Gatling reports) to compare results between runs.

dkayiwa · August 1, 2025, 12:30pm

@jayg would it be too much work to compare all? The reason I am suggesting this is that sometimes we make changes that affect other, unexpected areas. So it feels safer if we can confirm that those other places did not end up with performance degradations.
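A minimal sketch of what comparing all requests could look like, assuming per-request stats from two runs are available as dictionaries keyed by request name. The `p95_ms` field, the `find_regressions` helper and the 10% threshold are all hypothetical choices, not something the test suite provides today.

```python
def find_regressions(before, after, threshold_pct=10.0):
    """Return (request name, % change) for requests whose p95 grew
    by more than threshold_pct between two runs, worst first."""
    regressions = []
    for name, old in before.items():
        new = after.get(name)
        # Skip requests missing from the newer run or without a usable baseline.
        if new is None or old["p95_ms"] <= 0:
            continue
        change = 100 * (new["p95_ms"] - old["p95_ms"]) / old["p95_ms"]
        if change > threshold_pct:
            regressions.append((name, round(change, 1)))
    return sorted(regressions, key=lambda item: item[1], reverse=True)
```

Running this over every request, rather than only the ones we expect to change, is what would surface a slowdown in an unrelated endpoint.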

jayg · August 1, 2025, 2:05pm

It won’t be a lot of work. I will try to integrate the reports over time using the tools mentioned by @raff, which should make the comparison easy. I will try to have it done by Monday.

jayg · August 1, 2025, 3:57pm

Hi @raff, I have thoroughly looked into these tools, and they are no longer in working condition.

Starting with Gatling 3.7, the simulation.log format was changed from a text-based log (TSV) to a high-performance binary format. This makes the parser tools not viable for use in current versions.

I will look for alternatives, but in the meantime, should I send the differences in performance manually?

raff · August 1, 2025, 6:05pm

I see! Yes, let’s have the manual comparison.

I’d try asking ChatGPT or Claude to compare the results.

jayg · August 2, 2025, 4:42am

I have figured out how to parse the results, and I will be generating trend graphs soon.

jayg · August 2, 2025, 9:28pm

Hi @raff, @dkayiwa

I have a preliminary version of the trend representation working. To backfill the historical data, could you please provide the deployment date for the relevant changes? My implementation will begin consuming data from today onwards.

(Note: This is a very basic version and might be sensitive to future Gatling updates).

cc: @jayasanka @bawanthathilan
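To show why the deployment date matters for the trend, here is a minimal sketch that splits one request's history at that date and compares the medians on each side. It is not the actual implementation. The `(run_date, p95_ms)` tuple layout and both function names are hypothetical.

```python
from statistics import median


def split_by_deployment(history, deployed_on):
    """history: list of (run_date, p95_ms) tuples for one request."""
    before = [p95 for run_date, p95 in history if run_date < deployed_on]
    after = [p95 for run_date, p95 in history if run_date >= deployed_on]
    return before, after


def median_shift_pct(history, deployed_on):
    """Percent change of the median p95 after the deployment date,
    or None when either side has no runs to compare."""
    before, after = split_by_deployment(history, deployed_on)
    if not before or not after:
        return None
    base = median(before)
    if base == 0:
        return None
    return round(100 * (median(after) - base) / base, 1)
```

A negative result means the request got faster after the deployment. Without runs from before the date, no comparison is possible, which is why the backfill is needed.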

raff · August 13, 2025, 10:54am

Thanks @jayg! We would be interested in comparing current results with the deployment from the commit “Add CACHE_BUST work-around for backend” (openmrs/openmrs-distro-referenceapplication@0ef8c6e on GitHub) or before.