Summary: Progress on Support for Cloud, Clustering, and Horizontal Scaling

Many Implementers have asked for better support in the OpenMRS EMR for Cloud Hosting, Horizontal Scaling, and Clustering.

  • Scaling OpenMRS for large programs has been difficult and manual: teams had to deploy, manage, upgrade, and monitor dozens of instances by hand.
  • We didn’t have out-of-the-box support for clustering, multi-tenancy, or high availability. Because our Implementers’ needs differ, the solution also had to work on-premise, not only on cloud providers such as AWS or Microsoft.
  • Hosting was often limited to a single EC2-style instance, making performance and uptime harder to manage. YAML-heavy Kubernetes setups were too complex for many Implementers.

We’ve made remarkable progress over the last year - here’s where we are!

:white_check_mark: Why support cloud, clustering & scaling?

1. Increased reliability & uptime

  • Cloud or Kubernetes clusters let you run OpenMRS across multiple servers. If one node fails or is being updated, another takes over, so clinics stay online.
  • Rolling upgrades mean no downtime during updates or maintenance.
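
To make the rolling-upgrade idea concrete, here is a minimal Python sketch. It is an illustration only, not how Kubernetes or OpenMRS implement rollouts: the `Instance` class, the instance names, and the version strings are all hypothetical. It shows that when instances are upgraded one at a time, the rest keep serving traffic.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical application instance (not an OpenMRS or Kubernetes type)."""
    name: str
    version: str
    ready: bool = True


def rolling_upgrade(instances, new_version):
    """Upgrade one instance at a time.

    Returns the lowest number of ready instances seen during the rollout.
    """
    min_ready = sum(i.ready for i in instances)
    for inst in instances:
        inst.ready = False            # take one instance out of rotation
        min_ready = min(min_ready, sum(i.ready for i in instances))
        inst.version = new_version    # upgrade it while the others keep serving
        inst.ready = True             # put it back before touching the next one
    return min_ready


# Three hypothetical instances moving from an old to a new version.
nodes = [Instance(f"openmrs-{n}", "2.7") for n in range(3)]
lowest = rolling_upgrade(nodes, "2.8")
```

With three instances, at least two stay available throughout the rollout. A single-instance setup would drop to zero during the upgrade, which is why multiple instances matter for uptime.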

2. Better performance under load

  • Horizontal scaling allows you to run multiple OpenMRS application instances behind a load balancer—great for larger facilities or multi-site deployments.
  • Databases and indexing (e.g. ElasticSearch, MariaDB replicas) are now set up to scale, improving search and data throughput.
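
As a rough picture of what "multiple instances behind a load balancer" means, the sketch below implements a simple round-robin balancer that skips instances marked unhealthy. The class and the backend names (`app-0`, `app-1`, `app-2`) are made up for illustration. In a real deployment this job is done by the Kubernetes Service, ingress, or cloud load balancer rather than by application code.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy round-robin balancer; a hypothetical illustration only."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        # Called when a health check fails for this backend.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        # Called when a known backend passes its health check again.
        if backend in self.backends:
            self.healthy.add(backend)

    def next_backend(self):
        # Look at each backend at most once per call, skipping unhealthy ones.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


lb = RoundRobinBalancer(["app-0", "app-1", "app-2"])
first_round = [lb.next_backend() for _ in range(3)]
lb.mark_down("app-1")  # simulate one instance failing
after_failure = [lb.next_backend() for _ in range(4)]
```

After `app-1` is marked down, requests alternate between the two remaining instances, so users never get routed to the failed one.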

3. Consistent setups, less DevOps burden

  • Standardized deployment guides (AWS EC2, Kubernetes/Helm, Docker, OpenShift) let teams follow tried-and-tested configurations rather than reinventing setups.
  • Terraform + Helm charts mean you can spin up reliable, cloud-agnostic environments uniformly across regions or data centers.

:rocket: What’s new in Platform 2.8 to support Cloud & Scale

| Feature | What it does | Why it matters |
| --- | --- | --- |
| Infinispan (distributed cache) | Replaces EHCache with a clustered caching layer for Spring/Hibernate | Ensures consistent caching across app instances, improving performance and reliability. |
| ElasticSearch cluster | Replaces in-memory Lucene search with ElasticSearch/OpenSearch | Enables fast, reliable full-text search across multiple replicas and pods. |
| StorageService | Unified interface for file storage (disk, S3, or plugin extensions) | Separates file storage from the application, so we don’t rely on local disk storage, making it safe for clusters and resilient across nodes thanks to replication and automated backups/versioning. |
| Horizontal scaling support | Dozens of smaller features (liveness check for Kubernetes, setup and runtime configuration via environment/system properties in addition to files, DB replicas support) | Teams can deploy multiple O3, DB and ElasticSearch instances behind load balancers or stick with single-instance setups as needed. |
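
To show why a unified storage interface helps in a cluster, here is a small Python sketch of the general pattern. It is not the actual StorageService API (which lives in the Java openmrs-core code). The `Storage` interface, both backends, and the `attach_report` helper are hypothetical names, and the in-memory class merely stands in for a shared object store such as S3.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class Storage(ABC):
    """Hypothetical storage interface; NOT the real OpenMRS StorageService."""

    @abstractmethod
    def save(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def load(self, key: str) -> bytes: ...


class LocalDiskStorage(Storage):
    """Stores files under a root directory on the local node."""

    def __init__(self, root):
        self.root = Path(root)

    def save(self, key, data):
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def load(self, key):
        return (self.root / key).read_bytes()


class InMemoryObjectStorage(Storage):
    """Stand-in for a shared object store such as S3; makes no network calls."""

    def __init__(self):
        self._objects = {}

    def save(self, key, data):
        self._objects[key] = bytes(data)

    def load(self, key):
        return self._objects[key]


def attach_report(storage: Storage, patient_id: str, pdf: bytes) -> str:
    """Application code depends only on the interface, not on a backend."""
    key = f"reports/{patient_id}.pdf"
    storage.save(key, pdf)
    return key


shared = InMemoryObjectStorage()
key = attach_report(shared, "p-001", b"%PDF-demo")

with tempfile.TemporaryDirectory() as tmp:
    local = LocalDiskStorage(tmp)
    local_key = attach_report(local, "p-001", b"%PDF-demo")
    local_bytes = local.load(local_key)
```

Because the calling code is identical for both backends, a deployment can move from local disk to shared storage through configuration, and every application instance then sees the same files.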

:blue_book: What this means for Implementers

  • Reliability: no single point of failure
  • Performance: can handle increasing loads or multiple clinics/sites
  • Portability: deploy consistently across AWS, Azure, GCP, government/private clouds, or on-prem Kubernetes
  • Future-proofing: support clusters and multi-tenancy

:+1: Try it today!

  • :right_arrow: Platform 2.8 is in alpha now. You can test it via the latest snapshot: Platform 2.8.0‑SNAPSHOT. We plan to release an official version in 1-2 months (see this post).
  • :right_arrow: Deploy on Kubernetes with the Helm charts from the openmrs-contrib-cluster repository, or use Docker Compose.
  • :right_arrow: Explore this 10‑minute demo video showing O3 launched on Kubernetes with a single command.

:folded_hands: Thank You & Next Steps

  • Huge thanks to Rafal (@raff) for driving these changes and diving deep into Helm, Infinispan, ElasticSearch, storage abstractions, horizontal scaling, and more.
  • We’re really grateful to the community code contributors who helped with this work: pidsonn (TRUNK-6318), Bhargav Kodali (TRUNK-6306, TRUNK-6347), and Herman Muhereza (TRUNK-6334), and to @dkayiwa for diligent and fast reviews!
  • Special shout-outs to all implementers who’ve shared their requirements and pain points around cloud and scaling, including @PalladiumKenya, @egpaf, @Intellisoft, @jecihjoy, @aojwang, @moshon, and others. We love hearing from you, so keep it coming!

Want to help or get involved?

  • Reply here, or tag your questions with cloud in other Talk posts. You can also join the weekly Wednesday community Platform Team call to talk with other engineers (call details are on the calendar at om.rs/cal).
  • Try the Platform 2.8.0-SNAPSHOT and share your experience.
  • Deploy using Helm/Terraform and let us know any gaps.

More details are available in our Cloud Hosting guidance wiki here.

Thanks @grace for a great summary and everyone who contributed! I wouldn’t have pushed it that far without all of you.

I’d add that we are also working towards making backend changes in O3 modules so that they are compatible with openmrs-core 2.8.x.

And there will be more work needed in some O3 modules to make use of the new features such as StorageService and caching.

For anyone interested in contributing please reach out to me or pick up any issue at TRUNK-6299.

@grace @raff sounds great, I am excited to see OpenMRS make these cloud-oriented improvements.

We will try to deploy on AWS and share the experience.

That is awesome @pgesek, thank you so much! Really looking forward to your feedback :star_struck: