Approach/tool for keeping track of many OpenMRS Implementations

Continuing the discussion from Letting a module require configuration at first-time setup (for Atlas):

@ssmusoke brings up a concept here that I’m pretty sure many people have worked on before. How does a central group monitor many OpenMRS installations?

It would be nice to consolidate different approaches and lessons in this Talk topic.

I believe that @ball and @mseaton at PIH use New Relic for this. I don’t think they did anything OpenMRS-specific, but just register the servers. Maybe they can comment more?

I think that @cine and @hamish have built monitoring tools for use in Rwanda that collect OpenMRS details and server stats. Is this code ready yet? Is it shareable? Any lessons learned?

I think that KenyaEMR has some approach for this, and perhaps @ningosi or @gitahi86 can tell us about it. (I vaguely remember some off-the-shelf system monitoring tool, paired with a custom stats page. Maybe this?)

@darius to make the topic even more interesting, the installations (in the multiple hundreds) do not have Internet connections, so there is no way to call home to send and receive information, or for remote support, maintenance and management.

For our Rwanda implementation, in lieu of building additional monitoring directly into the sync module, we decided to move this monitoring into a separate module, and then to expand it beyond sync to enable general monitoring of a network of OpenMRS installations. This is still in a beta stage, and there may still be some non-generic code remnants, but the code is available here (with a README describing the intended design) for anyone who is interested:

This enables each individual server to collect and store historical information about itself, and also to communicate using web services to a central “parent server” that can provide an overview of all of the servers in the network. The idea is for this to collect a standard set of system information (platform + openmrs + module versions), usage information, and to be extensible - enabling the set of metrics to be customized.
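As a rough illustration of the child/parent idea described above (not code from the module itself), here is a minimal Python sketch. The field names, the `build_report` helper, and the in-memory `ParentServer` class are all assumptions made for the example; a real deployment would send the JSON over a web service call instead of a direct method call.

```python
import json
import platform
from datetime import datetime, timezone


def build_report(server_id, openmrs_version, module_versions, metrics=None):
    """Assemble one status report as a plain dict (hypothetical field names)."""
    return {
        "serverId": server_id,
        "generatedAt": datetime.now(timezone.utc).isoformat(),
        "platform": {"os": platform.system(), "osVersion": platform.release()},
        "openmrsVersion": openmrs_version,
        "modules": dict(module_versions),
        # Extensible part: any extra site-specific metrics go here.
        "metrics": dict(metrics or {}),
    }


class ParentServer:
    """In-memory stand-in for the central server's report store."""

    def __init__(self):
        self.reports = {}  # serverId -> list of reports, oldest first

    def receive(self, payload_json):
        """Store one JSON report, keeping the full history per server."""
        report = json.loads(payload_json)
        self.reports.setdefault(report["serverId"], []).append(report)

    def overview(self):
        """Latest OpenMRS version reported by each server."""
        return {
            server_id: history[-1]["openmrsVersion"]
            for server_id, history in self.reports.items()
        }


# Example values below are made up for illustration.
parent = ParentServer()
parent.receive(json.dumps(build_report("clinic-a", "1.9.7", {"sync": "1.0"})))
parent.receive(json.dumps(build_report("clinic-b", "1.6.6", {}, {"encounters": 307})))
print(parent.overview())
```

Keeping the raw reports per server is what allows viewing one server over time as well as comparing servers with each other.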

I’ll let @rubailly and @cioan comment further on the current status of this module for PIH.

Mike


I think this is a discussion that will interest a lot of people. I hope you all think it’s okay if I submit a session idea for it at the summit.


@pascal Excellent

For Rwanda I was working on a very basic system indicator report. One key point was that some/most servers didn’t have connectivity, and we wanted a simple approach to routinely collect some data (via monthly paper printouts).

I’ve attached most of the data points below, but among the most important ones were uptime during clinic hours (separated by general system and OpenMRS uptime) as well as number of restarts and crashes.

This can help to get a sense of how stable the environment is; in our use case the servers were often shut down outside of clinic hours and power was not always reliable.

I think this project evolved beyond my knowledge, so I’ll let @hamish and maybe other PIH’lers jump in in case there was further development (maybe towards the EMR Monitor Module @mseaton mentioned earlier).

Primary Clinic Days: Mo,Tu,We,Th,Fr
Primary Clinic Hours: 800 - 1700
Start date: 04 Aug 2014
End date: 08 Aug 2014 (including)

Percentage of system uptime (1): 98.89 %
 This week: 92.31 % (11 Aug 2014 - 17 Aug 2014)
 Last week: 98.89 % (04 Aug 2014 - 10 Aug 2014)
 Last month: 94.32 % (01 Jul 2014 - 31 Jul 2014)

Number of system starts (2): 3
Number of system starts without preceding shutdown (aka crashes) (2): 1
Times of last system crashes (approximation) (2): <to be implemented>

Percentage of OpenMRS uptime (1): 98.89 %

Number of Encounters (3) - (4): 307 - 66253
Number of Obs (3) - (4): 1183 - 275079
Number of users (3) - (4): 0 - 48
Number of active patients (3) - (4): 2 - 628
Number of new patients (3) - (4): 5 - 3561
Number of visits (3) - (4): 308 - 63046

Last local OpenMRS backup (5): <not found>

---
(1) during clinic hours between start and end date
(2) between start and end date (incl. outside of clinic hours)
(3) new during period in OpenMRS database (not voided or retired)
(4) total ever in OpenMRS database (not voided or retired)
(5) in /var/backups/OpenMRS
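
For anyone wanting to reproduce indicators like (1) and (2), here is a minimal Python sketch of one way to compute them. It is not the code behind the report above: the function names, the input shapes (a list of "up" intervals and a list of start/shutdown events), and the treatment of the first start in a period are all assumptions.

```python
from datetime import date, datetime, time, timedelta


def clinic_uptime_percent(up_intervals, start_day, end_day,
                          clinic_days=(0, 1, 2, 3, 4),  # Monday=0 .. Friday=4
                          opens=time(8, 0), closes=time(17, 0)):
    """Percent of clinic time covered by (start, end) up intervals.

    Intervals are assumed not to overlap each other.
    Returns None when the period contains no clinic time at all.
    """
    clinic_seconds = 0.0
    up_seconds = 0.0
    day = start_day
    while day <= end_day:  # end date is inclusive, as in the report
        if day.weekday() in clinic_days:
            window_start = datetime.combine(day, opens)
            window_end = datetime.combine(day, closes)
            clinic_seconds += (window_end - window_start).total_seconds()
            for up_start, up_end in up_intervals:
                overlap = min(up_end, window_end) - max(up_start, window_start)
                if overlap > timedelta(0):
                    up_seconds += overlap.total_seconds()
        day += timedelta(days=1)
    if clinic_seconds == 0:
        return None
    return round(100.0 * up_seconds / clinic_seconds, 2)


def count_starts_and_crashes(events):
    """Count starts, and starts that were not preceded by a shutdown.

    events is a chronological list of "start" / "shutdown" strings.
    """
    starts = crashes = 0
    previous = "shutdown"  # assumption: the period begins after a clean shutdown
    for event in events:
        if event == "start":
            starts += 1
            if previous != "shutdown":
                crashes += 1
        previous = event
    return starts, crashes


# Illustrative input: one continuous run from Monday 07:00 to Friday 16:30.
up = [(datetime(2014, 8, 4, 7, 0), datetime(2014, 8, 8, 16, 30))]
print(clinic_uptime_percent(up, date(2014, 8, 4), date(2014, 8, 8)))
print(count_starts_and_crashes(["start", "shutdown", "start", "start", "shutdown"]))
```

Time outside clinic hours is ignored on purpose, so a server that is shut down every evening is not penalized for it.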

Thanks Christian for describing this project. It has indeed grown into some important new initiatives. One is with PIH, for monitoring OpenMRS installations at a variety of sites. The second is part of the large CDC-funded evaluation of the OpenMRS rollout in Rwanda to support HIV care, which is led by the MOH. In the latter case the software is being adapted to the specific OpenMRS setup used by the MOH. It will run on OpenMRS 1.6 as well as 1.9, and will collect key study variables that let us measure system performance in real time and upload the results to the national DHIS2 system.

My goal with this project over the last four years and more has been to develop generic tools that allow us to “instrument” OpenMRS installations and monitor their stability, performance, data quality, and usage. So we plan to have at least one version for general OpenMRS use, and we would be interested in input on requirements and potential uses.


We are developing a metrics plan for OpenMRS for CY 16. Part of the proposed metrics plan includes looking at these proposed measures of the application (as well as the network):

Availability of automated tools for system monitoring and performance (sounds like this exists as the EMR Monitor module?); if yes, then do the measures include:

  • down time (or uptime)
  • response time
  • latency
  • additional performance metrics

Thanks for any insights into this. @terry


Definitely, @cine’s (@PIH’s) first invention has evolved much further after landing in @Jembi’s hands than was probably envisioned. We have taken it up and split it into two projects: the EMT backend, and the EMT front-end OpenMRS module, which depends on the EMT backend; the front-end module itself is optional.

  • The backend has been modified and is now distributed as an Ubuntu package that installs with two commands and can be configured to monitor any OpenMRS instance it is pointed to on the same server.
  • The tool is now integrated with DHIS2, enabling it to report through DHIS2 goodies (graphs, reports, etc.).
  • It can now monitor multiple instances at various locations, such as a whole district or country, and report to one central DHIS2 instance.
  • The upcoming Rwanda CDC study, which will use it to monitor over 200 sites, will most likely lead to proposals for more indicators to report.
  • The tool now automatically reports data into DHIS2 on a daily basis. If the monitored instance has no Internet connection, it waits until it detects one and then sends the stored monitoring data for all missed days, which means that even a facility with one day of Internet connectivity per month can be monitored.
  • ETC
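
The store-and-forward behaviour in the last point can be sketched roughly as follows. This is a minimal Python illustration of the pattern, not the EMT code: the `ReportSpool` class, the one-file-per-day layout, and the `send` and `is_online` callbacks are hypothetical, and the 365-day default retention is an assumption for the example.

```python
import json
from datetime import date, timedelta
from pathlib import Path


class ReportSpool:
    """Keep one JSON file per day until it has been delivered."""

    def __init__(self, directory, max_age_days=365):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.max_age_days = max_age_days

    def store(self, day, report):
        """Write (or overwrite) the report for one day."""
        path = self.directory / f"{day.isoformat()}.json"
        path.write_text(json.dumps(report))

    def flush(self, send, is_online, today):
        """Send every stored report, oldest first; return the days sent."""
        sent = []
        cutoff = today - timedelta(days=self.max_age_days)
        # ISO dates sort chronologically when sorted as strings.
        for path in sorted(self.directory.glob("*.json")):
            day = date.fromisoformat(path.stem)
            if day < cutoff:
                path.unlink()  # older than the retention period
                continue
            if not is_online():
                break  # keep the remaining files for the next attempt
            send(day, json.loads(path.read_text()))
            path.unlink()  # delete only after send() returned without error
            sent.append(day)
        return sent
```

A daily job would call `store()` once and then `flush()`; on days without a connection the files simply accumulate and go out together the next time `is_online()` returns true.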

@k_joseph any chance of the module working on Windows? Can the statistics be collected and saved locally, then maybe exported via CSV or Excel without requiring Ubuntu?


We strictly require Ubuntu for now, since that is what the Rwanda folks wanted supported, but we have plans to make this tool cross-platform. Saved data is stored for up to one year until a connection is detected, and it’s in JSON format, which can be transformed into any format you prefer. Here is the documentation, which is all you need to install and set up the tool. EMT Backend Documentation - version1.1.2.pdf (15.6 KB)

@mseaton, wondering how different https://github.com/PIH/openmrs-module-emrmonitor is from what we forked and started with!

@k_joseph - good question. I think there is definitely some overlap. One of the big initial differences is that the emrmonitor module is designed to communicate monitoring data to a centralized server, where it is stored in raw (JSON) form and can be viewed over time and compared and aggregated between instances, whereas my understanding of the EMT is that it (at least previously) simply produced static PDF documents that could be shared. But I would welcome a more thorough comparison of both approaches, to see if there could be room to merge the two in some way.

As far as uptime, latency, usage, and similar performance stats go, I’d assume a module like this would produce a Nagios-friendly plain text page that could either be monitored by a Nagios server or some other commodity infrastructure monitoring tool.
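
To illustrate what “Nagios-friendly” could mean here, this is a minimal Python sketch that renders two stats in the conventional Nagios plugin layout (status text, then `|`, then `label=value[UOM]` performance data) together with the matching exit code. The function name, the `OPENMRS` prefix, and the threshold values are assumptions for the example, not something either module is known to produce.

```python
# Standard Nagios plugin exit codes.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def nagios_status_line(uptime_percent, response_seconds,
                       warn_uptime=95.0, crit_uptime=90.0):
    """Return (exit_code, line) for an uptime and response-time check.

    The thresholds are hypothetical defaults; lower uptime is worse.
    """
    if uptime_percent < crit_uptime:
        code = 2
    elif uptime_percent < warn_uptime:
        code = 1
    else:
        code = 0
    text = (f"OPENMRS {STATES[code]} - uptime {uptime_percent:.2f}%, "
            f"response {response_seconds:.3f}s")
    # Performance data after the pipe lets the monitoring server graph values.
    perfdata = (f"uptime={uptime_percent:.2f}% "
                f"response_time={response_seconds:.3f}s")
    return code, f"{text} | {perfdata}"


code, line = nagios_status_line(98.89, 0.25)
print(line)
```

A check script wrapping this would print the line and exit with `code`, which is how Nagios and compatible tools decide the service state.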

Hi Everyone,

We have a team member, Femi, who is planning on working on remote system monitoring for our iSantéPlus project in Haiti. Ultimately, it seems like a good fit to push this information into DHIS2. I see the recent work done by @k.joseph at Jembi and wonder if the openmrs-module-systemmonitor is the way to go. @k.joseph can you provide an update on this module compared to the EMT modules that you used to work on? Why did you choose to move away from them?

Thanks, Craig

Hi @craigappl, I’ll let @k.joseph comment on the status of systemmonitor, but if you want more information on the EMR Monitor module that PIH has started using, let me know and I’m happy to give you an overview. Mike

The openmrs-module-systemmonitor module originally supported the following indicators:

  • Server Id (HostName-MacAddressWithoutcolons)
  • Processor (e.g. Intel(R) Core™ i5-5257U CPU @ 2.70GHz)
  • Server Uptime (minutes)
  • Operating System
  • Operating System Arch
  • Operating System Version
  • Java Version
  • Java Vendor
  • JVM Version
  • JVM Vendor
  • System Language
  • System Timezone
  • Java Runtime Name
  • Java Runtime Version
  • System DateTime
  • File System Encoding
  • User Directory
  • Temporary Directory
  • User Name
  • OpenMRS App Name
  • OpenMRS Version
  • OpenMRS Uptime (minutes)
  • Installed Modules
  • Date for last backup
  • Server’s Real Location
  • Total Memory (MB)
  • Used Memory (MB)
  • Free Memory (MB)
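
For readers who want a feel for how cheap most of these indicators are to collect, here is a small Python analogue of a few of them. It is only a sketch: the module itself is an OpenMRS module and does not use this code, the dictionary keys are hypothetical, and the upper-case MAC formatting in `server_id` is an assumption about the "HostName-MacAddressWithoutcolons" format.

```python
import platform
import socket
import tempfile
import time
import uuid
from datetime import datetime


def server_id(hostname=None, mac_int=None):
    """Build an id shaped like HostName-MacAddressWithoutcolons."""
    if hostname is None:
        hostname = socket.gethostname()
    if mac_int is None:
        # uuid.getnode() returns the MAC as a 48-bit integer; it may fall
        # back to a random number when no hardware address can be read.
        mac_int = uuid.getnode()
    return f"{hostname}-{mac_int:012X}"


def collect_indicators():
    """Collect a handful of OS-level indicators (hypothetical key names)."""
    return {
        "serverId": server_id(),
        "operatingSystem": platform.system(),
        "operatingSystemArch": platform.machine(),
        "operatingSystemVersion": platform.release(),
        "systemTimezone": time.tzname[0],
        "temporaryDirectory": tempfile.gettempdir(),
        "systemDateTime": datetime.now().isoformat(timespec="seconds"),
    }


print(server_id("emr01", 0x001A2B3C4D5E))
```

Combining the host name with the MAC address keeps the id stable across restarts while still distinguishing machines that share a host name.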

In the latest development cycle we added the indicators listed below for a Rwandan SPH study, for which the module is currently running in production:

  • active patient - 8m
  • active patient - 8m - CD4 (EMR)
  • active patient - 8m - VL (EMR)
  • active patient - 20m - CD4 (Last Year)
  • active patient - 20m - VL (Last Year)
  • Initial viral load
  • Initial CD4 Count
  • Followup CD4 Count
  • Followup viral load
  • OpenMRS Uptime (%)
  • Number Of OpenMRS DownTimes
  • OpenMRS Downtime (%)
  • OpenMRS Downtime (minutes)
  • OpenMRS UpTime Intervals
  • OpenMRS DownTime Intervals
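
One common way to derive uptime and downtime indicators like the last six is to record a periodic heartbeat and treat long gaps between heartbeats as downtime. The Python sketch below shows that approach under stated assumptions: it is not necessarily how the module computes them, and the function names, result keys, and two-minute gap threshold are made up for the example.

```python
from datetime import datetime, timedelta


def downtime_intervals(heartbeats, max_gap=timedelta(minutes=2)):
    """Return (last_seen, next_seen) pairs for every gap longer than max_gap.

    heartbeats must be a chronologically sorted list of datetimes.
    """
    gaps = []
    for earlier, later in zip(heartbeats, heartbeats[1:]):
        if later - earlier > max_gap:
            gaps.append((earlier, later))
    return gaps


def summarize_downtime(heartbeats, max_gap=timedelta(minutes=2)):
    """Summarize downtime over the span covered by the heartbeats."""
    if len(heartbeats) < 2 or heartbeats[-1] <= heartbeats[0]:
        raise ValueError("need at least two heartbeats spanning some time")
    gaps = downtime_intervals(heartbeats, max_gap)
    observed = heartbeats[-1] - heartbeats[0]
    # The whole gap is counted as downtime, which overestimates slightly
    # (by up to one heartbeat period per gap).
    down = sum((end - start for start, end in gaps), timedelta(0))
    downtime_percent = round(100.0 * down / observed, 2)
    return {
        "downtimeCount": len(gaps),
        "downtimeMinutes": down.total_seconds() / 60,
        "downtimePercent": downtime_percent,
        "uptimePercent": round(100.0 - downtime_percent, 2),
        "downtimeIntervals": gaps,
    }


# Illustrative data: one heartbeat per minute with a 30-minute hole.
base = datetime(2017, 4, 20, 8, 0)
beats = [base + timedelta(minutes=m) for m in [0, 1, 2] + list(range(32, 101))]
print(summarize_downtime(beats)["downtimeCount"])
```

The uptime intervals are simply the stretches between consecutive downtime intervals, so they can be derived from the same list.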

For more details about each of the indicators, such as a description of what is monitored, see the documentation at JEMBIDEV-OpenMRSSystemMonitorModuleDocumentation-200417-1025.pdf (757.3 KB)

I recommend setting up the module using the documentation, pointing it to a different DHIS2 instance, and seeing whether it meets part or all of your requirements. We can help direct you in case you need to make any logical changes to the metrics, etc.

We chose to move away from the initial script, which we had cloned from the PIH tools and started supporting, because it required Ubuntu and we needed to do more with the tool. The script also had direct access to the OpenMRS database, which was a security risk, so we chose to write an OpenMRS module that does everything the script did and more.

Thank you both for this information!

We’ll review it over the next few days and connect Femi for further information.

Craig