This topic will serve as the central place for planning and tracking the Metadata Extraction Tool project throughout my OpenMRS Fellowship.
The goal of this project is to develop a Java-based utility that reads metadata from a live OpenMRS instance and exports it into Initializer-compatible CSV files, making it easier to replicate existing OpenMRS configurations across different deployments.
Objectives
Define the metadata domains to be supported.
Design a modular and extensible extraction architecture.
Implement metadata exporters domain by domain.
Generate Initializer-compatible CSV output.
Ensure correctness through unit and integration testing.
Document the architecture and usage for future contributors.
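To make the "modular and extensible" objective a bit more concrete, here is a minimal sketch of one possible shape: one exporter per domain behind a shared interface, plus a small CSV writer. Everything named here (DomainExporter, ExporterSketch, demoExporter, the sample row) is hypothetical and only for illustration. The column names are placeholders, not the confirmed Initializer headers, and a real exporter would read its rows from the OpenMRS service layer rather than hard-coding them.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical sketch: none of these names come from OpenMRS or Initializer.
class ExporterSketch {

    /** One implementation per metadata domain (locations, concepts, ...). */
    interface DomainExporter {
        String domainName();
        List<String> headers();
        List<List<String>> rows();
    }

    /** Quotes a field if it contains a comma, a double quote, or a line break. */
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        String escaped = field.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    /** Renders the header row followed by one line per data row. */
    static String toCsv(DomainExporter exporter) {
        StringBuilder out = new StringBuilder();
        out.append(joinRow(exporter.headers())).append("\n");
        for (List<String> row : exporter.rows()) {
            out.append(joinRow(row)).append("\n");
        }
        return out.toString();
    }

    private static String joinRow(List<String> fields) {
        StringJoiner joiner = new StringJoiner(",");
        for (String field : fields) {
            joiner.add(escape(field));
        }
        return joiner.toString();
    }

    /** Stand-in exporter with hard-coded data (assumption: placeholder columns). */
    static DomainExporter demoExporter() {
        return new DomainExporter() {
            @Override public String domainName() { return "locations"; }
            @Override public List<String> headers() {
                return List.of("Uuid", "Name", "Description");
            }
            @Override public List<List<String>> rows() {
                return List.of(List.of("example-uuid-1", "Clinic A", "Main site, north wing"));
            }
        };
    }

    public static void main(String[] args) {
        System.out.print(toCsv(demoExporter()));
    }
}
```

Running main prints the header line followed by `example-uuid-1,Clinic A,"Main site, north wing"`. With this shape, adding a new domain would mean adding one more DomainExporter implementation without touching the CSV writer, which is the kind of domain-by-domain work the plan describes.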
Development Plan
The project will be completed incrementally:
Finalize the extraction scope with mentors.
Create Jira epics and implementation tasks.
Implement each metadata domain individually.
Add tests for each exporter.
Improve documentation and gather community feedback.
Project Tracking
I’ll be sharing the Jira board, implementation tasks, and design updates here as they become available, so the community can easily follow progress and provide feedback.
Repository:
I welcome suggestions, questions, and feedback throughout the project. Thanks in advance for your support!
@bawanthathilan - great initiative. We’ve started a few things like this in the past at PIH but never really taken them through to completion for various reasons.
Here is something I threw together somewhat recently as the start of an effort to move one of our implementations to Initializer, where my main initial goal was to just get their concepts and concept-related dependencies into code, but where I was planning for this to be something we could extend to other domains as necessary:
For Concepts specifically, one might question why I bothered with the Concepts domain when OCL is generally the preferred route. The reason was that the starting dictionary I was working with had evolved since the earliest days of OpenMRS, and many of the concepts as-is would fail validation if re-saved or imported into a new system. Exporting Concepts as-is into the concepts domain allowed us to quickly iterate on identifying issues and cleaning them up prior to the more expensive and permanent process of putting them into OCL. This is also why I added some other tools to this repo to help along the way (e.g., identifying whether concepts were in use by analyzing foreign key references and other usages).
Keep in mind when building a tool like this that many of the domains are quite customizable. For example, with Locations, some might choose to manage the tags associated with each location as additional columns in the location domain, while others might prefer to use the locationtagmaps domain. Concepts is especially like this, and another complication is that many dictionaries may have inherited additional concept names or mappings that they don’t need or want to preserve. So it is not always a one-size-fits-all situation.
Thanks Mike, I really appreciate you sharing this and the context behind the ConceptExporter approach.
That point about domain customization makes a lot of sense too, especially with Locations and Concepts having multiple valid ways to model the same thing. @wikumc and I will keep that in mind so the tool doesn’t assume a one-size-fits-all approach, maybe by allowing some configurability in how each domain gets exported.
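As a rough illustration of what that configurability could look like for Locations, here is a small sketch where a per-domain setting decides whether tags are written as extra columns in the locations file or into a separate tag map file. The enum, class, and method names are hypothetical. The header formats (a "Tag|" prefix for inline columns, a "Location" column for the tag map) are assumptions that would need to be checked against the Initializer documentation before being relied on.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and header formats are assumptions for illustration.
class LocationExportConfigSketch {

    /** How location tags should be exported for a given implementation. */
    enum TagExportMode { INLINE_COLUMNS, SEPARATE_TAG_MAPS }

    /** Headers for the locations CSV under the chosen mode. */
    static List<String> locationHeaders(TagExportMode mode, List<String> tagNames) {
        List<String> headers = new ArrayList<>(List.of("Uuid", "Name", "Description"));
        if (mode == TagExportMode.INLINE_COLUMNS) {
            for (String tag : tagNames) {
                headers.add("Tag|" + tag); // assumed inline tag column format
            }
        }
        return headers;
    }

    /** Headers for the tag map CSV; empty when tags are exported inline. */
    static List<String> tagMapHeaders(TagExportMode mode, List<String> tagNames) {
        if (mode != TagExportMode.SEPARATE_TAG_MAPS) {
            return List.of();
        }
        List<String> headers = new ArrayList<>();
        headers.add("Location"); // assumed name of the location reference column
        headers.addAll(tagNames);
        return headers;
    }

    public static void main(String[] args) {
        List<String> tags = List.of("Login Location", "Visit Location");
        for (TagExportMode mode : TagExportMode.values()) {
            System.out.println(mode + " locations: " + locationHeaders(mode, tags));
            System.out.println(mode + " tag maps:  " + tagMapHeaders(mode, tags));
        }
    }
}
```

The same idea could extend to Concepts, for example a setting that controls which concept names or mappings are kept in the export, but that is left out here to keep the sketch short.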
Hi everyone! I’m creating a draft wiki page for the Metadata Extraction Tool. It’s still a work in progress, but I’d appreciate any feedback or suggestions. @wikumc @jayasanka @ibacher @dkayiwa