Cluster and cloud support for OpenMRS

We are evaluating providing documentation and tooling for deployment of OpenMRS to Kubernetes.

Kubernetes is open source and supported by all major cloud vendors such as AWS, Azure and Google Cloud, as well as many smaller vendors, and it can also be deployed on premises.

We will not be providing guidance on deployment and management of Kubernetes itself.

The proposed solution for Kubernetes will include Helm charts for production-grade deployments of OpenMRS 3. They will be cloud-vendor agnostic, i.e. they will include a minimal setup for deploying the OpenMRS application, database and HTTP gateway. The database and gateway will be optionally replaceable by vendor-hosted services.

A single Kubernetes cluster should be able to handle multiple deployments of OpenMRS on a shared pool of VMs to support multi-tenancy, with one OpenMRS application container per tenant and the possibility of sharing the database engine while each tenant uses a separate schema.

A container cluster such as Kubernetes does the heavy lifting for high availability, better resource utilization than dedicated machines or VMs, log aggregation and monitoring, upgrades and cloud compatibility.

Initial features will include:

  1. DB and DB read replica configured out of the box with automated backups
  2. HTTP gateway for serving frontend and backend requests
  3. Health checks and automated restarts in case of failures
  4. Aggregated logs and metrics for the OpenMRS service and DB
  5. Rolling upgrades of containers

Prospectively we envision providing:

  1. Support for deployments with the DB engine shared between OpenMRS deployments, but separate schemas (to support multi-tenancy for smaller facilities at lower cost)
  2. Support for OpenSearch indexes for patient and concept search for improved speed and HA
  3. Support for OpenMRS service replicas with a load balancer in front for high availability and performance
  4. Maintenance pages for upgrades
  5. Error log notifications

Proposed vendor checklist for running a single deployment of OpenMRS:

  1. Kubernetes version 1.29
  2. Min. 2 VMs with 1.5+ GHz CPU, 2+ GB of RAM, 50+ GB of SSD (DB, DB replica, OpenMRS instance, HTTP gateway)
  3. Optionally (recommended) vendor hosted MySQL 8.x with automated backups.
  4. Optionally (recommended) vendor hosted gateway with load balancing e.g. ELB (AWS), Gateway Load Balancer / Front Door (Azure)

We are still in the evaluation phase and open to going with any cluster technology and architecture, so your feedback, experience and usage scenarios are invaluable. If there’s anything you would like to accomplish in the not-so-distant future but are not sure how to go about it, e.g. multi-tenancy, specific reporting needs, synchronization, geo-distribution or other concerns, please throw these at us and we can continue to refine the proposed architecture.

See the 03 Cluster and Cloud Deployments wiki page

5 Likes

@burke, @cintiadr, @ibacher, @scott does Jetstream provide any container cluster to use for testing purposes? Would we be able to get Kubernetes on AWS for testing (maybe using some of the AWS grant)?

Hi @raff, I think Jetstream supports containers and Kubernetes. I just looked up the docs and found this: Kubernetes - Jetstream2 Documentation

With enough AWS credits from the grant, EKS would be ideal; the only downside would be cost.

1 Like

@raff Do you have an account in our secondary AWS site? If not, I can probably set you up with one.

As @scott said, it’s possible to run a Kubernetes cluster on Jetstream (it’s just OpenStack), but there’s no managed option for a Kubernetes cluster, i.e., we’d have to build our own.

Thanks @ibacher. I’ve just e-mailed you regarding the account.

I should have mentioned in my initial post that the cluster route is only for some implementations and will not replace the simple Docker setup for smaller implementations running on-premise, which we will continue to support.

@mseaton, @mogoodrich do you see the need for Kubernetes in any of PIH's deployments? I recall you did have to deal with some huge DBs in your deployments. Have you ended up introducing DB clustering and replication in any scenario? Do you run any deployments serving many facilities from one datacenter?

Thanks for leading this and laying out a road map, @raff!

It’s important for OpenMRS to be cloud-ready/cloud-friendly and the community will be well-served to have a consensus/best-practice approach to lower the barrier and learn from each other. Ideally, we would use this in our demo environment (for testing and avoiding regressions in our CI pipeline) and we’d have examples of its use in the field by one or more early adopters. It would also be great if this approach could serve the needs of commercial solutions like Ozone, so they would be invested in the same approach.

While the resources used by or around OpenMRS (db, indexing, logging, etc.) likely already support clustering, one of the biggest challenges for truly scaling OpenMRS will be making the API/Platform capable of horizontal scale – i.e., being able to run the OpenMRS service layer across multiple JVMs. I’m assuming this will require an approach similar to tools like Confluence (e.g., shared home folder) and guidance (like this) for core and module developers to properly handle caching, tasks, locks, events, files, networking, etc. … along with refactoring existing core/modules to follow these clustering practices.
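
For example (a purely illustrative sketch; the class and cache names are hypothetical and not from openmrs-core), the caching part of that guidance might look like moving from JVM-local maps to Spring's cache abstraction, so the backing store can later be a cache shared across instances:

```java
// Illustrative sketch only; hypothetical class, not actual openmrs-core code.
// Requires @EnableCaching and a CacheManager bean configured elsewhere.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
public class LocationNameService {

    // Before: cluster-unsafe caching. Each JVM keeps its own copy, which silently
    // goes stale as soon as another instance changes the underlying data.
    private static final Map<Integer, String> LOCAL_CACHE = new ConcurrentHashMap<>();

    public String getLocationNameCachedLocally(Integer locationId) {
        return LOCAL_CACHE.computeIfAbsent(locationId, this::loadNameFromDatabase);
    }

    // After: Spring's cache abstraction. The configured CacheManager can be backed by
    // a distributed cache shared by all instances, so callers don't change on scale-out.
    @Cacheable(cacheNames = "locationNames", key = "#locationId")
    public String getLocationName(Integer locationId) {
        return loadNameFromDatabase(locationId);
    }

    private String loadNameFromDatabase(Integer locationId) {
        return "Location " + locationId; // placeholder for a real DAO call; truth stays in the DB
    }
}
```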

But, as you’ve outlined, there’s a lot we can do to support scale even while the OpenMRS service layer itself is not yet clusterable (horizontally scalable).

FWIW, the Jetstream2 documentation does have some information about deploying Kubernetes.

Thanks for contributing to the discussion.

I had a quick call with @Mekom today to understand their deployments and challenges. Thank you again for letting us tap into your experience!

The team had even attempted a Kubernetes deployment with Helm charts, but it was a misfit for the hardware setup they had available (Raspberry Pis) and the course was changed.

My key takeaways from the call and the comments above are:

  1. The Mekom team invested a lot of effort in running in the cloud. They developed Ansible and Terraform scripts and put in place CI/CD pipelines with Jenkins. They benefit from all that now, which makes them less inclined to move away and switch to any container cluster. As it happens with pioneers, they implemented a lot of this by themselves, and our goal is to make such setups reusable and shorten the path for everyone.

  2. Having aggregated metrics, logging and errors would be highly beneficial. Mekom came up with their own custom setup based on Prometheus, but we can certainly provide that out of the box, running in Kubernetes, and make it available for everyone. It would help not only implementers, but also community developers in addressing issues, as we often lack the bigger picture of a live system and its state at the point of performance degradation or failure.

  3. There is a shared belief that horizontal scaling of the OpenMRS service would improve performance. I would take a step back and say that we first need to understand where the performance bottlenecks are. In an application such as OpenMRS, where most of the work is persisting data and reading it back from the DB, the key is most often tuning the DB: the right indexes that fit in available memory, and the right SQL queries and schemas. Next comes clustering of the DB itself with more read/write instances, running full-text searches outside the main OpenMRS instance in a dedicated service such as OpenSearch (possibly with replication), and caching with cache replication; only further down the road does actual app replication with load balancing kick in. In order to make informed decisions we need better performance monitoring of a real-life system, which takes us back to the 2nd point. That includes alerting on long-running DB queries, slow REST endpoint responses, excessive memory and CPU usage, etc. (see the sketch after this list).

  4. Automation and ease of deployment are also key. The manual steps required for the initial deployment and subsequent upgrades should be minimal, hence the use of Terraform and Helm charts.

  5. Kubernetes is best for those hosting in the cloud or in proper data centers. I'd recommend others stay with docker-compose deployments.
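
To make the monitoring part of point 3 more concrete, here is a rough, purely hypothetical Java sketch (not existing OpenMRS code) of the kind of signal I mean by flagging slow REST endpoint responses; in practice this should feed the metrics/alerting stack from point 2 rather than only a log file:

```java
// Hypothetical sketch: a servlet filter that flags requests slower than a threshold.
import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class SlowRequestLoggingFilter implements Filter {

    private static final Logger log = LoggerFactory.getLogger(SlowRequestLoggingFilter.class);

    private static final long THRESHOLD_MS = 2000; // flag anything slower than 2 seconds

    @Override
    public void init(FilterConfig filterConfig) {
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        long start = System.currentTimeMillis();
        try {
            chain.doFilter(request, response);
        } finally {
            long elapsed = System.currentTimeMillis() - start;
            if (elapsed > THRESHOLD_MS && request instanceof HttpServletRequest) {
                HttpServletRequest http = (HttpServletRequest) request;
                log.warn("Slow request: {} {} took {} ms", http.getMethod(), http.getRequestURI(), elapsed);
            }
        }
    }

    @Override
    public void destroy() {
    }
}
```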

I invite you all to have a look at the roadmap. I’d appreciate further feedback and comments.

I realize it may be a big lift for any implementation to roll out a new deployment approach and there must be enough benefits to even try it out. I am hoping that the proof of concept will eventually spark your interest and that we can dive into this territory together. I believe it’s not a matter of if, but when OpenMRS implementers hosting in cloud and data centers start moving in that direction.

4 Likes

Thanks for the thoughtful discussion here, @raff. I agree that there's undoubtedly a lot of scale we could get out of properly tuning the parts around the OpenMRS code (db tuning, db clustering, caching, etc.), but I'm also convinced we are going to hit a bottleneck with the inability to cluster our API (i.e., the inherent assumption that OpenMRS is running in a single JVM). While improvements in db, searching, caching, etc., as well as a bigger server (RAM, CPU) for the API, can get you pretty far, ultimately we need to be able to fire up 2, 4, 8, or more instances of OpenMRS against the same database. If an implementation is overloaded, one of the simplest solutions (for an implementation) would be to fire up more instances of OpenMRS to share the load.

My expectation is that there are a lot of assumptions in our codebase that there is only one running instance of the JVM (state kept in memory, use of the file system in ways that wouldn't support parallel use, etc.). It would be great to see a path toward a future where these assumptions are no longer made, "truth" is always in the database or in clusterable resources, and devs (both for the API and those creating modules) are provided with utilities, guidance, and example code to write code that can support a clustered world.

There are a number of changes needed to accomplish what you are referring to:

  1. Use Solr instead of embedded Lucene for searches.
  2. Use a distributed cache for Hibernate and Spring instead of local in-memory caches.
  3. Configure HTTP session information to be stored in a distributed cache (see the sketch at the end of this post).
  4. Identify places in code that cache by other means, such as HashMaps, and use a distributed cache instead where the data needs to be consistent across all instances (applicable when a user moves between instances).
  5. Identify places in code that store files in the file system and use a distributed file system instead.

4 and 5 require changes not only in openmrs-core, but also in modules. We need best practices and migration instructions for them.

1, 2 and 3 are already listed in the roadmap. 4 and 5 are under Future phases in 3b.
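
For illustration, here is a rough sketch of what 3 could look like if we went with Spring Session backed by Redis. This is just one possible approach, not a decision; none of it exists in openmrs-core today, and the host name is made up:

```java
// Hypothetical sketch: store HTTP sessions in Redis via Spring Session so any
// OpenMRS instance behind a load balancer can serve any user's requests.
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisStandaloneConfiguration;
import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;
import org.springframework.session.data.redis.config.annotation.web.http.EnableRedisHttpSession;

@Configuration
@EnableRedisHttpSession(maxInactiveIntervalInSeconds = 1800)
public class ClusteredSessionConfig {

    // Connection to the shared Redis instance; host/port would come from runtime properties.
    @Bean
    public LettuceConnectionFactory redisConnectionFactory() {
        return new LettuceConnectionFactory(new RedisStandaloneConfiguration("redis", 6379));
    }
}
```

In a plain (non-Boot) webapp the springSessionRepositoryFilter would also need to be registered, e.g. via web.xml or by extending AbstractHttpSessionApplicationInitializer.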

2 Likes

@mohant maybe good to touch base with Raff and others and share our experiences and opinions with Bahmni, and thoughts on multi-tenancy

2 Likes

Brief thoughts on infra automation and Kubernetes setup from the Bahmni team.

  • We have developed Terraform scripts along with associated GitHub Actions to completely provision the infrastructure on AWS, which includes the VPC, EKS, RDS and other necessary components. The repository is available here.

  • We have also developed Helm charts for all of the services of Bahmni Lite (WIP for Bahmni Standard services). The Helm configuration which runs the OpenMRS service can be found here. For local development and testing we have used Minikube to deploy the configurations.

  • Since Bahmni runs with multiple services, to make deployments easier we have taken the helm-umbrella-chart approach, with sub-charts as dependencies.

  • For the monitoring stack we have leveraged the kube-prometheus-stack maintained by the Prometheus community, which comes with a decent set of pre-configured dashboards and metrics.

  • For centralised logging we have found the PLG (Promtail, Loki, Grafana) stack to be lightweight, and we are using the loki-stack from the Grafana community.

Challenges that we faced during configuration:

  • Use of EFS-based Persistent Volumes: Initially we started with EBS-based persistent volumes, which had issues when we had to scale nodes across AWS Availability Zones. So we moved all the PVs to an EFS-based storage class and driver.

  • Scaling to more than one pod for OMRS: As pointed out by @raff already, session management was difficult when we had to run more than one replica of the OMRS service. So far we have been running with a single pod.

There have already been a few discussions about multi-tenancy between Bahmni and OpenMRS. We have identified 5 high-level areas. Please refer to this Talk thread.

cc. @angshuonline @rahu1ramesh @raff

1 Like

@raff @caseynth2 @ibacher Checking to see if we've made progress on documenting our OpenMRS guidance for cloud, clustering and multi-tenancy, and whether there's any progress on a strategy/roadmap to improve it for OpenMRS?

1 Like

Hi @raff,

Thank you again for today’s discussion.

Here is a high level summary of the current requirements for MSF/Madiro:

  • No downtime, since O3 is used at the point of care in 24/7 facilities
  • On-premises (all nodes) and hybrid settings (for example, a facility that runs OpenMRS in the cloud on Azure but uses a local instance as a backup during temporary connectivity issues)
  • High availability rather than scalability: one master rather than multi-master is OK
  • Replicas on the local network and in remote locations over the Internet
  • Automatic failover when the primary node goes down (DBs and/or gateway with apps)
  • Automatic data sync and switch-back upon primary node recovery
  • Monitoring and routine testing

2 Likes

Fantastic summary, Michael! I'd say this is very representative of the requirements I'm hearing from all the other implementers I have spoken with.

First of all, big thanks to the Bahmni team for sharing your work! Thank you @mohant and @angshuonline for making me aware of it.

I've incorporated much of your Terraform into our repository at GitHub - openmrs/openmrs-contrib-cluster: Contains terraform and helm charts to deploy OpenMRS distro in a cluster. I tried to attribute the work to Bahmni, but please let me know if it needs any adjustments.

The progress so far:

  1. We have a Helm chart which deploys O3 together with MariaDB and a read-only replica on vanilla Kubernetes (self-hosted or from any cloud provider). I still need to add more documentation around it and list all possible configuration options (e.g. using RDS instead of self-deployed MariaDB, switching to a MariaDB cluster with multiple masters, etc.).
  2. We have Terraform scripts for AWS to set up Kubernetes and RDS (with a bit of work still left around volume provisioning) and install the O3 Helm chart.

This week I'm hoping to complete volume provisioning and deploy Grafana with a Helm chart for monitoring and gathering logs from OpenMRS and the DB.

As we discussed on the call with @michaelbontyes, I reckon that MSF's requirements are best addressed by the Kubernetes-based deployment with MariaDB in cluster mode.

We could already satisfy a large portion of them with the O3 Helm chart.

Kubernetes can be configured to use nodes running in Azure (or any other cloud provider) as well as on premises (on a local or remote network).

The “no downtime” requirement can be achieved for the DB cluster at this point (it may need just a few changes in the openmrs-core DB config to support a DB cluster), but the O3 backend may experience downtime, with automatic failover and recovery taking under a few minutes. Running another replica of the O3 instance is problematic at this point, but we will continue working towards that as well.
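
To give a sense of the scale of those openmrs-core DB config changes, here is a hypothetical sketch assuming the MariaDB Connector/J failover URL modes; the host names are made up, and the same URL would presumably go into connection.url in the runtime properties:

```java
// Hypothetical sketch: MariaDB Connector/J supports high-availability modes in the
// JDBC URL. "sequential" tries the listed hosts in order and fails over when the
// current primary becomes unreachable. Host names below are made up.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class FailoverConnectionSketch {

    public static void main(String[] args) throws SQLException {
        String url = "jdbc:mariadb:sequential://db-primary:3306,db-replica:3306/openmrs"
                + "?connectTimeout=3000";
        try (Connection connection = DriverManager.getConnection(url, "openmrs", "secret")) {
            System.out.println("Connected via " + connection.getMetaData().getURL());
        }
    }
}
```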

3 Likes