Gsoc 2020: Expose System Metrics For Monitoring.

@mseaton @ibacher @cintiadr @judeniroshan

What do you think about these two tools that @cintiadr mentioned I think both are general-purpose metric instrumentation libraries

I can see we can expose metric reports as JMX, CSV and JSON objects in Dropwizard I know these ways of exposing already enough for most of the monitoring tools which are currently in use. But it seems Micrometer has more support to monitoring tools which is a plus point for it?

  1. Drop Wizard :- https://metrics.dropwizard.io/4.1.2/manual/servlets.html#manual-servlets
  2. Micrometer :- https://micrometer.io/docs/concepts#_purpose

Also Micrometer does seem to be being adopted by Spring (in the form of Spring Boot) which might be a good argument for it. I’m pretty agnostic about the framework chosen, though; just that we aren’t reinventing the wheel.

3 Likes

+1 to what @ian said. I’d probably lean towards micrometer for the reasons above, but I have done no actual analysis on either.

1 Like

Thanks all for the feedback :slightly_smiling_face:

Hi @ibacher @mseaton @mozzy @judeniroshan

I was looking at the atom feed module to understand how it captures events related to object types.

I actually found a HibernateInterceptor which capture the db events and serveEvents for the atomfeed module.Which actually amaze me a little. Because I thought the events were served based on the Events module activemq message broker.

Few questions I have

  1. Is Events module actually engaged in the atomfeed module if so where is it ?
  2. I think up to saving to the event to the db we can use the same flow for our module. Because it has a mechanism to queue the events and do the db save transactions one by one. I think that is a good advantage for us since for a module that captures data points It’s a must to make sure not to lose any data points at any point. What do you think the are we going to use the same flow till saving the event details in the db. The events table looks like below.

For the data points, that we are going to add additional to what already captured by atomfeed can be extended easily with the way they have developed it.

         1 Include the openmrs class of the object type in the atomfeed config.
         2 If there is no subresource builder make sure to add it. 

Then it will start capturing the events of the new class that we included.

But one thing I noticed it’s using transaction classes in ICT4H commons module. I am not sure whether this is good or we have to migrate them to openmrs organization.

  1. What are the data points we can capture other than what atomfeed does?

Here is the list of classes that are included for now.

I see some of them as enabled : false any paticular reason ?

  1. Why do we have api , api 1.9 , api 2.0 separations ?

Hi @ayesh!

Thanks for your digging on this.

Yes, the Wiki says. that atomfeed runs on the Events module, but actually it looks like that was never implemented. The Hibernate interceptor is basically identical across the two except that: in Events it uses the afterTransactionCompletion() event and checks that the transaction was completed before firing notifications, where in Atom-Feed, it uses the beforeTransationCompletion() event and seems to generate events regardless of if the transaction was committed or not (the change was made in this commit). This seems to have been related to some funkiness with accidentally creating a sub-transaction to store data in a the feed, but there’s a lot more in this Talk post. It appears that the future state Darius alluded to was never implemented. That might be worth reviving.

Questions:

  1. Apparently, no. I still think having a single solution here is worthwhile, but apparently the event module could use some attention.

  2. One of the advantages of using the events module and ActiveMQ is that we don’t have to implement custom queuing. I would think either migrating ActiveMQ in the events module to use the database for storage or building off the AtomFeed module are the right ways to address this.

  3. AtomFeed’s main use-case (as I understand it) is for sharing data among the various components of Bahmni and as part of Sync 2.0. The classes excluded are probably not commonly shared via those interfaces (admittedly Order is a bit of a weird one).

  4. Backwards compatibility? api-1.9 is only applicable when run in versions < 2.0.0; api-2.0 is applicable in versions > 2.0.0.

1 Like

Hi @ibacher @mseaton

Yeah it seems in atom feed we have beforetransaction() to fire the events.

But when it comes to serving from activemq we have a small problem again beforetransaction() will fire when an event happens and it will be synchronously delivered to subscribers. And I saw darius has mentioned we need a call back function to make sure this transaction actually committed. I think this is a call back we need.

@mseaton in this ticket has mentioned If there is a compelling reason to make the callbacks after transaction completion then I’d think the workaround would be to fix the service method that saves the feeds to be annotated with @Transactional and the propagation attribute set to REQUIRES_NEW and let spring deal with the transaction management as opposed to doing it manually like the current code

But if we are moving with events module and ActiveMQ will it be still possible to depend on the @Transactional annotation ?. In a module where we use events module to get events (Similar to atomfeed) but depends on events module and activemq. Will this make sure only the committed transactions save to the feed ?

Hi All

Any idea about this ?

cc :- @ibacher @mseaton @judeniroshan

@ayesh I think it depends on our use cases, the way we are handling the events, and what we are doing with them. My advocacy in the past for moving event firing into the beforeTransactionCompletion was due to use cases I saw the event module being used that involved synchronous listening for events and performing additional operations within the OpenMRS system (eg. module A saves an object causing an event to fire which then module B listens for to save additional data related to that object). In this case we want all of these save operations to save or fail as a unit. In other cases, asynchronous handling is preferred.

I think the question is more around what we are trying to monitor, and how will that data be consumed. Based on that, we can determine if we need logging to be done within the transaction, outside it, or both.

My guess would be that for monitoring purposes, we mostly care about data after the transaction is committed and any data related to monitoring would likely want to be stored separately from the data (i.e., I would lean towards failing to log the monitoring data rather than failing the main transaction).

@ibacher I actually agree with you I think we should get only committed transactions.

Btw as @mseaton mentioned can we define a set of data points that need to be monitored?

I see most of the data points are already been saved from the atom feed module to the events table.

For example :- No of parents registered to the platform within given period (can be taken with a query from events module). But the problem will be the performance again. Since we are dealing with sql queries.

@ibacher @mseaton @judeniroshan

Hi all

Since we are working with transactions completed we can’t move forward with atomfeed module.So I thought of having a look at the EMR-Monitor module. I locally build it had lot of bean creation issues.So I unwatched the module from the server that I am running currently.

But the weired thing is after I unwatched and start the server agian from sdk still it’s copying the .omod of emr-monitor module.

And also it affects my sdk as well. When I try to create a new server first thing sdk does is saying found openmrs module. Copying from local maven repository to the new server. So the error keeps popping up on new servers as well.

Finally I deleted the repository I downloaded as then it removed the emrmonitor dependency from the sdk: setup.

Hm with all other modules when we unwatch it there is no more intreaction with the module even on restarting the server. How can we fix this issue @mseaton ?

Hi @ibacher @mseaton

Error log for youre reference https://pastebin.com/G3LQqr9V

In your hibernate mapping file, can you replace org.openmrs.util.HibernateEnumType with ’ ‘org.hibernate.type.EnumType’?

1 Like

Yes I fixed it will send a PR.

thanks alot @dkayiwa :smiley:

So, I think we’d want to be able to capture things like:

  • # of users logged into the system for a period and also aggregates like min, max, and mean simultaneous users (this is likely to be somewhat different from # of sessions)

The following types we would want to be able to disaggregate these by user, user role, encounter role and location:

  • # of patients registered for a period with aggregates for min, max, and mean
  • # of patients voided for a period again with aggregates for min, max, and mean
  • # of encounters by encounter type for a period with aggregates for min, max and mean
  • # of forms used by form type for a period with aggregates for min, max and mean
  • # of obs created by obs concept

I realize this might be a large target, so I would suggest that if a framework can be built around # of patients registered and # of encounters by encounter type, that’d be sufficient for GSoC (at least at an initial stab).

@mseaton @hamish Anything high priority I might’ve missed here?

Hi @dkayiwa @mseaton @ibacher

Have a small issue with the test cases.

I changed the type to org.hibernate.type.EnumType

And .hbm.xml file looks like below now.

<property name="status" column="status" length="50" not-null="true"> <type name="org.hibernate.type.EnumType"> <param name="enumClass">org.openmrs.module.emrmonitor.EmrMonitorReport$SubmissionStatus</param> <param name="useNamed">true</param> </type> </property>

For the test cases It inserts below raw in to the in memory db

<emrmonitor_server id="1" uuid="e6e492e5-4321-11e6-be45-e82aea237783" name="Rwinkwavu" server_type="1"/>

But during the insertion time it gives below error

org.dbunit.dataset.datatype.TypeCastException: Unable to typecast value <LOCAL_ONLY> of type <java.lang.String> to INTEGER

I can see in the mysql db it’s been created correctly as a varchar data type but when it’s inserting in to local db for testing I am not sure why it’s giving this error.

Hi @ayesh,

Good job finding out the surrounds for the project. I wonder what benefit we have using an event queue instead of just querying out from the DataBase.

In fact I have developed such a reporting module back in 2017 as a GSoC project. These metrics are quite similar to what has already implemented on this module.

https://wiki.openmrs.org/display/projects/Built-In+Reports+-+Reference+Application

How would it be different from this module what we already have?

@judeniroshan As I read the purpose of this GSoC, it’s less about generating the information than it is about getting the information into a standarised format that can be consumed by existing monitoring tools. This is especially relevant for monitoring these kinds of data across multiple implementations of OpenMRS or multiple deployments that might be running in different locations across a country.

Incidentally, the event-based approach @ayesh has been talking through will yield somewhat different counts than retrospective SQL reporting might. Retrospective reporting is probably enough for generating reports that are fed upstream to ministries of health (hospital x had 300 new patients this month), but are a less adequate when looking at system usage, which can be asking different questions (hospital x added 300 patients this month, but in the process they created 375 patients and voided 75 of those). In the latter case, this brings out that there is some underlying issue that may be causing the workflow at the hospital to be less efficient than it might otherwise be. Is this because of staff training? Poorly designed forms? Is this within the expected range of human error?

Hope that clarifies at least the value I see in this proposal.

@ibacher @mseaton @dkayiwa

Any idea about the error.