When working with OpenMRS patient program workflow states, we’ve encountered a fundamental data model limitation in determining which state represents the “current” or “final” state of a completed program enrollment.
The patient_state table structure:
+-------------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------------+--------------+------+-----+---------+----------------+
| patient_state_id | int | NO | PRI | NULL | auto_increment |
| patient_program_id | int | NO | MUL | 0 | |
| state | int | NO | MUL | 0 | |
| start_date | date | YES | | NULL | |
| end_date | date | YES | | NULL | |
| creator | int | NO | MUL | 0 | |
| date_created | datetime | NO | | NULL | |
| changed_by | int | YES | MUL | NULL | |
| date_changed | datetime | YES | | NULL | |
+-------------------------+--------------+------+-----+---------+----------------+
Key Observation: start_date and end_date are DATE fields (not DATETIME).
For active program enrollments, determining the current state is straightforward:
- Find the state where
end_dateIS NULL - This works correctly as per OpenMRS design
However, for completed program enrollments where all states have end_date IS NOT NULL, determining the “most recent” or “final” state becomes impossible to determine accurately when multiple state transitions occur on the same calendar day.
Real-world example
A TB patient’s treatment program on June 15, 2024:
- 09:00 AM: Patient enters “Initial Treatment” state
- 11:30 AM: Clinician updates to “Intensive Phase”
- 02:45 PM: Patient moved to “Continuation Phase”
- Program completes December 31, 2024
Database records:
- State A: start_date = ‘2024-06-15’, end_date = ‘2024-06-15’ (Initial Treatment)
- State B: start_date = ‘2024-06-15’, end_date = ‘2024-06-15’ (Intensive Phase)
- State C: start_date = ‘2024-06-15’, end_date = ‘2024-12-31’ (Continuation Phase)
Now when querying for the “most recent state” on 2024-06-15, there is no way to determine the correct order because:
- All three states have start_date = ‘2024-06-15’
- States A and B both have end_date = ‘2024-06-15’
- The DATE type fields contain no time information
Sorting by end_date alone would show State C as most recent, but if a clinician corrected an error and updated State A or B later that same day, we have no field in the data model that captures the temporal order of these transitions within the same day.
A potential solution is to change start_date and end_date to DATETIME, modifying the schema to capture time precision.
We’d love to hear from the community on how best this can be handled and how we can help out in here.
cc: @angshuonline, @mohant, @ibacher, @grace, @raff, @burke, @dkayiwa, @mseaton