New Cohort Module in 2.1.x introduces performance issues

We’ve ran into an issue with the new Cohort model in OpenMRS 2.1.x, specifically the new CohortMembership model.

Making a Cohort a set of objects (“CohortMembership”) has seems to have significant performance implications.

For instance, adding 60,000 patients to a Cohort takes several minutes, as opposed to a few seconds that is did previously, because of this method:

This kills reports in the reporting module that use the AllPatientsCohortDefintion.


Take care, Mark

My vote is to revert to the old model. And introduce the new features that were initially introduced in 2.1 in a way that does not impact the old model.

I vote the same way. :slight_smile:

Same thing here :grinning: +1 to reverting to the old model

Reverting a feature that was just released in the latest OpenMRS platform release would be unprecedented as far as I know.

That is not to say that we can’t do it, but voting here is not going to be what makes it happen. What would make it happen is someone taking the lead and making a plan which includes:

  • release plan (is this an out of band release? does it require a major version bump? or what?)
  • communications (and determining who would be affected)
    • starting with a new post in the #dev category proposing this
  • tech assessment if relevant
  • and doing a straw man of defining a repeatable process for this

FYI Mike has done what looks like a quick fix that completely resolves the performance issue here:

I’ve confirmed that @mseaton’s commit does fix the performance issues we were seeing.

I realize that the barrier to totally remove an existing feature may too high, but it’s clear that this feature could do with a good design and code review… you’ll see some further comments from @darius on TRUNK-5375 to this effect.

Seems like something for a design call, or a detailed code review?

Would be good to have the original requesters of this feature involved so we understand the use cases… I think this is @burke and @ayeung? The “membership” seems to provide functionality similar to what program enrollment does, but I don’t know the original intent.

Take care, Mark

I haven’t been looking at this as “reverting a feature”, but rather “fixing a backwards incompatibility” that was introduced in a non-major version change. @mogoodrich highlighted a couple of cases where method signatures have changed between 2.0 and 2.1 I believe, and functionally things are different.

What I would think might make sense would be to aspire to maintaining full backwards compatibility, performance, intent, etc. of the Cohort object from 2.0 and before, and to maintain the same functionality as we introduced into Cohort in 2.1 in a new subclass called DynamicCohort or DateRangeCohort or something.

For 99% of implementations and downstream code, this will be a bug fix. For the small number of downstream modules or implementations that use the new features of Cohort, they’d need to update things to use the Subclass instead.

That said, neither Mark nor I will have have the bandwidth to lead this in the near term, and we have found a way to mitigate the current performance issue which makes it less critical (until the next issue crops up). But this is what I was thinking…


Just for reference to expand upon the method signature changes @mseaton mentioned above: I noticed that the “removeMember(Integer)” was removed between 2.0 and 2.1, and the addMember(Integer) had it’s return type changed from void to boolean. (Neither are blockers now since Mike got his fix in, but the return type change on addMember would have blocked me from easily overriding this method in PatientIdSet).

Take care, Mark

Hi, @mogoodrich,

The one mUzima Team created is called Expanded Cohort Module which is different from the New Cohort Module as at that time there was no New Cohort Module available and we needed it for our implementation. Thanks!

fwiw, upon further testing, @mseaton’s fix, while a huge help, does not completely resolve the performance issues… in 2.0, the setMemberIds method on Cohorts ran on the order of milliseconds, in 2.1 but prior to Mike’s fix it ran on the order of minutes… the fix gets it down to 1 to 2, which is fine if you are only hitting it a couple times, but when doing intersections/unions/joins via the reporting framework, you can hit this method repeatedly… will continue to brainstorm solutions and follow up…

Take care, Mark

@ayeung, didn’t you (and mUzima) design and trigger this work:

Am I correct in understanding that in the meantime mUzima has created a separate module, so it is not using this new code that’s in platform 2.1.x?

I’ve created two new tickets re: these issues, put them in design, and the marked them as community-priority:

Can we have a design call to discuss these at some point? A caveat would be that it would be good to have a consumer or champion for this feature on the call so we can understand the use cases when making our decisions, and I’m unclear who this person would be. (Otherwise I think we might design the feature away… :slight_smile:)

(Also, feel free to comment directly on the ticket with your thoughts).

@burke @jthomas @mseaton @darius @jdick @rubailly @ayeung @dkayiwa

Take care, Mark