AI Integrations and OpenMRS

Hello, as part of the @EMR4All team, I am excited to share new developments and milestones regarding AI integration with OpenMRS. What started as a simple Python agent has now evolved into a nearly ready-to-use AI application. Here is a walk-through of a few lessons, discoveries, challenges, and solutions so far.

We decided to first solve the challenges we faced with the very first agent (built with Python and its ecosystem of tools/libraries), so that we could then apply the solutions, wherever applicable, to the new Java OpenMRS module we are working on, which will perhaps offer more features than just text-to-SQL (plain-English questions returning responses along with the corresponding SQL).

The biggest hurdle yet has been hallucination; we realized that with repeated testing/validation of the generated SQL, and by being very explicit in our queries/prompts, we could reduce hallucination by a good measure!
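
To illustrate the idea, here is a minimal sketch of such a generate-validate-retry loop; generateSql and validateAgainstSchema are hypothetical stand-ins for our actual functions, and the error-feedback wording is just an example:

	// Minimal sketch of a generate-validate-retry loop (hypothetical helpers).
	String generateValidatedSql(String question, int maxAttempts) {
		String feedback = "";
		for (int attempt = 0; attempt < maxAttempts; attempt++) {
			// generateSql() wraps the LLM call behind our explicit prompt
			String sql = generateSql(question + feedback);
			// every table/column in the SQL is checked against the data model
			java.util.List<String> errors = validateAgainstSchema(sql);
			if (errors.isEmpty()) {
				return sql;
			}
			// feed the concrete validation errors back into the next attempt
			feedback = "\nYour previous attempt had these errors: " + String.join("; ", errors);
		}
		throw new IllegalStateException("No valid SQL after " + maxAttempts + " attempts");
	}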

Also, the more you tell the AI about the subject (i.e. the more context provided), the better the responses: we went from the models using our exact words as table/column names when generating queries to much more sensible queries, and thus better responses.

We also provided the data model to the LLM in YAML format, detailing the tables, columns, and relationships that we require the model to use in the prompt(s). We also made certain that the system automatically standardizes queries. For instance, the person table is used for demographic data and the patient table for clinical data; while it may seem obvious that person_id = patient_id and that these two tables have no direct relationship (foreign keys), to the model it was NOT! So we had to clearly state that, and how to join the two tables when required. We therefore have functions that flag the generated queries as correct or incorrect based on the facts described in the data model, with other functions making minor adjustments for better results.
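
For illustration, a simplified version of the flagging idea could look like the sketch below; the YAML layout (a plain table-to-columns map), the class name QueryFlagger, and the use of SnakeYAML are assumptions for this example, and real queries with table aliases need more resolution work than this regex does:

	import java.io.InputStream;
	import java.util.List;
	import java.util.Map;
	import java.util.regex.Matcher;
	import java.util.regex.Pattern;
	import org.yaml.snakeyaml.Yaml;

	// Sketch: every table.column reference in the generated SQL must exist
	// in the YAML data model, otherwise the query is flagged as incorrect.
	public class QueryFlagger {

		private static final Pattern COLUMN_REF = Pattern.compile("(\\w+)\\.(\\w+)");

		private final Map<String, List<String>> tables; // table name -> column names

		@SuppressWarnings("unchecked")
		public QueryFlagger(InputStream dataModelYml) {
			// assumes a simple layout like:  person: [person_id, gender, birthdate]
			this.tables = (Map<String, List<String>>) new Yaml().load(dataModelYml);
		}

		public boolean isValid(String sql) {
			Matcher m = COLUMN_REF.matcher(sql);
			while (m.find()) {
				List<String> columns = tables.get(m.group(1).toLowerCase());
				// unknown table, or a column that does not exist on it
				// (this is exactly how something like person.patient_id gets caught)
				if (columns == null || !columns.contains(m.group(2).toLowerCase())) {
					return false;
				}
			}
			return true;
		}
	}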

The other major issue was the resources needed to run Ollama, OpenMRS, and the AI application(s)/agents. To solve that, we moved the AI agents/applications out of the Ollama compose file, making each of the stacks independently accessible and off-loadable to different servers. We also provide a REST API that other applications, such as the ESMs, can query instead of connecting to Ollama directly.
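
As a rough sketch of that facade (the path, class, and service names here are hypothetical, not our final API), a Spring controller could look like this:

	import org.springframework.web.bind.annotation.PostMapping;
	import org.springframework.web.bind.annotation.RequestBody;
	import org.springframework.web.bind.annotation.RestController;

	// Sketch: a thin REST facade so clients such as the ESMs never talk to
	// Ollama directly; the agent stack behind it can move to another server.
	@RestController
	public class AiQueryController {

		private final SqlAgentService agent; // hypothetical service wrapping the LLM calls

		public AiQueryController(SqlAgentService agent) {
			this.agent = agent;
		}

		@PostMapping("/ws/rest/v1/ai/query")
		public String ask(@RequestBody String question) {
			// generates, validates and runs the SQL, then returns the answer
			return agent.answer(question);
		}
	}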

These, among many other things we have done, have enabled us to guarantee more accurate responses with minimal resources, and we know it will only get better with more refinement. Here is an example: the full prompt template we use for text-to-SQL.

You are an expert SQL query generator for OpenMRS (Open Medical Record System). Your task is to translate natural language questions into accurate SQL queries.

**CRITICAL CONSTRAINT - READ CAREFULLY:**
You MUST use ONLY the exact table names and column names listed in the OpenMRS data model below. NO EXCEPTIONS!

**OPENMRS DATA MODEL (USE ONLY THESE TABLES AND COLUMNS):**
{table_schemas_description}

**CRITICAL TABLE USAGE RULES:**
1. **PERSON TABLE**: Use ONLY for demographic data (gender, birthdate, age, address, etc.)
2. **PATIENT TABLE**: Use ONLY for clinical data and patient identification
3. **ENCOUNTER TABLE**: Use for visit/encounter data
4. **OBS TABLE**: Use for clinical observations and measurements
5. **IMPORTANT**: patient_id in patient table = person_id in person table (they contain the same values)

**CRITICAL COLUMN RULES:**
1. NEVER use person.patient_id (this column doesn't exist)
2. NEVER use patient.person_id (this column doesn't exist)
3. Use person.person_id for demographic data
4. Use patient.patient_id for clinical data
5. Use encounter.patient_id to link encounters to patients
6. Use obs.person_id to link observations to persons

**QUERY PATTERNS:**
- For demographic data: SELECT FROM person
- For clinical data: SELECT FROM patient JOIN encounter ON patient.patient_id = encounter.patient_id
- For observations: SELECT FROM obs JOIN person ON obs.person_id = person.person_id
- For encounters with observations: SELECT FROM encounter JOIN obs ON encounter.encounter_id = obs.encounter_id

**DATE FIELDS IN THE DATA MODEL:**
{format_date_fields_for_prompt()}

**DATE FIELD MAPPINGS:**
- For patient registration/enrollment: use patient.date_created
- For encounter visits: use encounter.encounter_datetime or encounter.date_created
- For observations: use obs.obs_datetime or obs.date_created
- For any "when" or "time" related questions: use the appropriate date_created field for the main table

**COMMON DATE FIELD PATTERNS:**
- Most tables have date_created, date_changed, and date_voided fields
- encounter table has encounter_datetime
- obs table has obs_datetime
- Use these fields for any date/time filtering

**EXAMPLE DATE FILTERS:**
- "this month": WHERE MONTH(date_created) = MONTH(CURRENT_DATE()) AND YEAR(date_created) = YEAR(CURRENT_DATE())
- "last week": WHERE date_created >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
- "today": WHERE DATE(date_created) = CURRENT_DATE()

**STRICT VALIDATION RULES:**
1. Every table name in your query MUST exactly match a table name from the data model above
2. Every column name must exist in the corresponding table from the data model above
3. DO NOT invent or guess table names or column names
4. If you cannot find the required tables/columns in the data model, state that clearly
5. Always use table aliases for JOINs to make queries readable
6. Verify that foreign key relationships exist before using them in JOINs
7. For date/time filtering, ONLY use the date fields listed in the "DATE FIELDS" section above
8. DO NOT hallucinate date fields like "date_enrolled", "date_registered", "visit_date", etc.
9. If a question refers to a concept like "enrollment" or "registration", use the appropriate date_created field

**QUERY CAPABILITIES:**
- SELECT with JOINs, GROUP BY, HAVING, ORDER BY
- Subqueries and CTEs (Common Table Expressions)
- Aggregate functions (COUNT, SUM, AVG, MAX, MIN)
- Date/time functions and calculations using ONLY the date fields listed above
- String operations and pattern matching

**Question:** {question}

**Thought Process:**
1. Identify required data from the question (demographic vs clinical)
2. Use person table for demographic data
3. Use patient table for clinical data
4. Use appropriate JOINs to link tables based on person_id = patient_id
5. Write the complete SQL query using only data model tables/columns

7 Likes

Thanks @miirochristopher

1 Like

Are these serving as evals?

This is great and interesting stuff. I am curious to know more about what your validation process looks like. Do you have a test suite of queries with known correct results? When dealing with counts, do you cross-check them against similar queries that are line-list in nature?

1 Like

Join us tonight at AI - Continued Discussions

1 Like

We do have functions, @dkayiwa, that are the equivalent of the Evaluator API/interface provided by Spring AI; they evaluate the correctness of the final answers. For this particular case, however (“we have functions to flag the generated queries correct or incorrect based on the facts as described in the data model”), the functions very specifically look up wrong column/table names or combinations in the queries and adjust the output query to the specs of the data model before it is run against the database. The actual evals validate the overall responses. We do look to standardize things for the Java implementation, though.
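
As a toy illustration of the adjustment step (the real functions are driven by the data model rather than hard-coded pairs like these), one such fix-up could be:

	// Toy sketch of an adjustment function: rewrite column/table combinations
	// that the data model says cannot exist, before the query is executed.
	public String adjustQuery(String sql) {
		return sql
				// person.patient_id does not exist; person_id holds the same values
				.replace("person.patient_id", "person.person_id")
				// patient.person_id does not exist either
				.replace("patient.person_id", "patient.patient_id");
	}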

1 Like

@veronica for a more generalized response, we do have tests with expected results, so we would know when the LLMs hallucinate; for instance:

	@Test
	public void chat_should_generate_valid_response() {

		// ask a question with a known expected answer so hallucinations are obvious
		UserMessage userMessage = UserMessage.from("What is the name of the process by which the body breaks down food?");
		ChatResponse response = model.chat(userMessage);

		// the answer must mention digestion and must not request any tool executions
		AiMessage aiMessage = response.aiMessage();
		assertThat(aiMessage.text())
				.contains("digestion");
		assertThat(aiMessage.toolExecutionRequests()).isEmpty();

		// the response must come from the model we configured
		ChatResponseMetadata metadata = response.metadata();
		assertThat(metadata.modelName())
				.isEqualTo(MODEL_NAME);

		// token accounting must be present and internally consistent
		TokenUsage tokenUsage = metadata.tokenUsage();
		assertThat(tokenUsage.inputTokenCount()).isPositive();
		assertThat(tokenUsage.outputTokenCount()).isPositive();
		assertThat(tokenUsage.totalTokenCount())
				.isEqualTo(tokenUsage.inputTokenCount() + tokenUsage.outputTokenCount());
		assertThat(metadata.finishReason())
				.isEqualTo(FinishReason.STOP);
	}

The other thing we have done is run a database with a dataset that we are very familiar with; we always run the generated SQL against that database manually and verify that we get the exact same response(s).

We’ve also provided an interface to run custom queries or even the generated SQL (obviously, we RESTRICT Data Manipulation Language commands/statements: INSERT, UPDATE, and DELETE).
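
A minimal sketch of that restriction could look like the following; note that it also blocks a few DDL keywords beyond the three statements mentioned above, which is an assumption on my part:

	import java.util.regex.Pattern;

	// Sketch: keep the query interface read-only by rejecting anything
	// that is not a plain read before it ever reaches the database.
	public final class ReadOnlyGuard {

		private static final Pattern FORBIDDEN = Pattern.compile(
				"\\b(INSERT|UPDATE|DELETE|DROP|ALTER|TRUNCATE|CREATE)\\b",
				Pattern.CASE_INSENSITIVE);

		public static void check(String sql) {
			if (FORBIDDEN.matcher(sql).find()) {
				throw new IllegalArgumentException("Only read-only queries are allowed");
			}
		}
	}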

2 Likes

Emr4all-ai-arch.pdf (59.6 KB)

Here is the proposed architecture; kindly feel free to suggest changes and/or ideas! The goal is to have multiple agents handling different tasks and communicating with each other via ACP, while at the same time having our MCP (Model Context Protocol) servers set up to provide OpenMRS context.

1 Like

What do you think about this entry in the Thoughtworks Technology Radar, which has just come out today? https://www.thoughtworks.com/radar/techniques/text-to-sql

1 Like