Hi OpenMRS Community,
My name is Ashar Ali, and I’m a Data Engineer exploring ways to build a modern ETL/analytics pipeline for OpenMRS.
From my research, I understand that while OpenMRS captures rich clinical data, there isn’t currently a fully featured, production-grade pipeline that can:
- Extract data from OpenMRS databases safely
- Transform/clean/normalize the data for analytics
- Load it into a warehouse or analytics-ready schema
- Support monitoring, logging, and scheduling for repeatable runs
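To make those stages a bit more concrete, here is a rough sketch of the shape I have in mind, not a working design. It uses in-memory SQLite databases as stand-ins for both the source and the warehouse, and the table and column names (`obs_demo`, `fact_obs`, `patient_id`, `concept`, `value`) are hypothetical placeholders, not the real OpenMRS schema. Scheduling is left out entirely.

```python
import logging
import sqlite3

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl_sketch")


def make_demo_source() -> sqlite3.Connection:
    """Build an in-memory stand-in for a source database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE obs_demo (patient_id INTEGER, concept TEXT, value TEXT)")
    conn.executemany(
        "INSERT INTO obs_demo VALUES (?, ?, ?)",
        [(1, " Weight ", "70.5"), (2, "weight", "n/a"), (3, "HEIGHT", "172")],
    )
    conn.commit()
    return conn


def extract(source: sqlite3.Connection) -> list:
    """Extract with a plain SELECT so the source data is never modified."""
    query = "SELECT patient_id, concept, value FROM obs_demo ORDER BY patient_id"
    return source.execute(query).fetchall()


def transform(rows: list) -> list:
    """Normalize concept names and skip rows whose value is not numeric."""
    clean = []
    for patient_id, concept, value in rows:
        try:
            clean.append((patient_id, concept.strip().lower(), float(value)))
        except ValueError:
            log.warning("skipping non-numeric value for patient %s", patient_id)
    return clean


def load(warehouse: sqlite3.Connection, rows: list) -> int:
    """Load cleaned rows into an analytics-ready table and return the row count."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_obs (patient_id INTEGER, concept TEXT, value REAL)"
    )
    warehouse.executemany("INSERT INTO fact_obs VALUES (?, ?, ?)", rows)
    warehouse.commit()
    return len(rows)


def run_pipeline(source: sqlite3.Connection, warehouse: sqlite3.Connection) -> int:
    """Run extract -> transform -> load once, logging counts at each stage."""
    rows = extract(source)
    log.info("extracted %d rows", len(rows))
    clean = transform(rows)
    loaded = load(warehouse, clean)
    log.info("loaded %d rows", loaded)
    return loaded
```

Calling `run_pipeline(make_demo_source(), sqlite3.connect(":memory:"))` runs all stages once against the demo data. In a real deployment I would expect the extract step to read from a replica or a read-only account, and an external scheduler to trigger the runs.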
I also studied the existing MambaETL module and see it as a great reference, but it seems limited in orchestration, monitoring, and multi-hospital support.
I want to develop a pipeline that is modular, secure, and usable by hospitals, including those with local deployments. Before starting, I’d love to get feedback from the community:
- Are these the main challenges hospitals face regarding analytics and reporting?
- Are there specific analytics patterns, KPIs, or reports that would be most valuable?
- Would hospitals be open to testing such a pipeline with demo or synthetic data first?
Any insights, suggestions, or guidance from implementers, developers, or hospital IT staff would be highly appreciated. I want to make sure the solution addresses real-world needs.
Thank you for your time and guidance!