Thank you all for attending! Thanks @grace for getting us all together. It was a very fruitful discussion and I wanted to share a few takeaways.
Among other things, we got a chance to meet Jake from AWS, who shared his work on deploying OpenMRS on AWS. I'll ask him to share in-depth details, but for now please head to GitHub - Jmevorach/host-openmrs-on-aws-fargate to see the deployment of OpenMRS on AWS with ECS (AWS's proprietary container orchestration service), Aurora Serverless as the DB backend and Elastic File System as file storage. It's a pretty straightforward deployment, fully automated with CDK (AWS's infrastructure-as-code tool). Jake experimented with running multiple replicas of the OpenMRS instance, but as expected it didn't work because of how the OpenMRS platform is implemented. Apart from the inability to run multiple replicas, we believe it would scale quite well given Aurora as the DB, which can probably handle more traffic than any known OpenMRS implementation generates these days.
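For anyone who wants a feel for what such a stack looks like in CDK, here is a minimal TypeScript sketch of the same shape (Fargate service behind a load balancer + Aurora Serverless v2 + EFS). It is not taken from Jake's repository; resource sizing, the container image name and the engine version are my assumptions, so treat it as an illustration and refer to host-openmrs-on-aws-fargate for the real deployment.

```ts
import { App, Stack, StackProps } from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as ecsPatterns from 'aws-cdk-lib/aws-ecs-patterns';
import * as efs from 'aws-cdk-lib/aws-efs';
import * as rds from 'aws-cdk-lib/aws-rds';
import { Construct } from 'constructs';

class OpenmrsOnFargateStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    const vpc = new ec2.Vpc(this, 'Vpc', { maxAzs: 2 });

    // Aurora Serverless v2 (MySQL-compatible) as the DB backend.
    const db = new rds.DatabaseCluster(this, 'Database', {
      engine: rds.DatabaseClusterEngine.auroraMysql({
        version: rds.AuroraMysqlEngineVersion.VER_3_04_0,
      }),
      writer: rds.ClusterInstance.serverlessV2('writer'),
      serverlessV2MinCapacity: 0.5,
      serverlessV2MaxCapacity: 8,
      vpc,
    });

    // EFS for shared file storage (e.g. the OpenMRS application data directory).
    const fileSystem = new efs.FileSystem(this, 'AppData', { vpc });

    // A single OpenMRS task behind an ALB on ECS Fargate. Image name is an assumption.
    const service = new ecsPatterns.ApplicationLoadBalancedFargateService(this, 'OpenMRS', {
      vpc,
      cpu: 1024,
      memoryLimitMiB: 4096,
      desiredCount: 1, // multiple replicas are not supported by the platform today
      taskImageOptions: {
        image: ecs.ContainerImage.fromRegistry('openmrs/openmrs-reference-application-3-backend'),
        containerPort: 8080,
      },
    });

    // Allow the OpenMRS task to reach the database (MySQL port) and EFS (NFS port).
    db.connections.allowDefaultPortFrom(service.service);
    fileSystem.connections.allowDefaultPortFrom(service.service);
  }
}

const app = new App();
new OpenmrsOnFargateStack(app, 'OpenmrsOnFargate');
```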
Although the minimum cost of running this highly scalable setup is quite low (~200 USD/month), a simulation would be needed to determine the actual cost for any given implementation, as it largely depends on how much data is stored in and retrieved from the system.
The challenges we see with running multiple replicas of OpenMRS have already been mentioned in this Talk thread. Getting there requires significant effort, but it is by no means unachievable in the foreseeable future, and each step on its own will improve the performance and scalability of OpenMRS. I'm thinking here of extracting full-text search to a dedicated service like OpenSearch and using a distributed cache, so that an OpenMRS instance has fewer responsibilities and the load can be distributed.
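To make that idea a bit more concrete, here is a small CDK sketch of provisioning those two dedicated services on AWS: an OpenSearch domain for full-text search and a Redis cache via ElastiCache. The sizing, names and versions are my assumptions, and the real work (wiring OpenMRS's search and caching layers to them) is not shown here.

```ts
import { App, Stack } from 'aws-cdk-lib';
import * as elasticache from 'aws-cdk-lib/aws-elasticache';
import * as opensearch from 'aws-cdk-lib/aws-opensearchservice';

const app = new App();
const stack = new Stack(app, 'SharedServices');

// Full-text search moved out of the OpenMRS JVM into a dedicated OpenSearch domain.
new opensearch.Domain(stack, 'Search', {
  version: opensearch.EngineVersion.OPENSEARCH_2_5,
  capacity: { dataNodes: 1, dataNodeInstanceType: 't3.small.search' },
});

// A shared Redis cache so that multiple (future) OpenMRS replicas see the same cached state.
new elasticache.CfnCacheCluster(stack, 'Cache', {
  engine: 'redis',
  cacheNodeType: 'cache.t3.micro',
  numCacheNodes: 1,
});

app.synth();
```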
I will keep repeating that the DB is the bottleneck in most implementations; if we can scale it and focus on SQL query optimization, it should yield much better performance than scaling OpenMRS instances. I think we can run load tests against Aurora to see how much a single OpenMRS instance can handle given highly scalable DB storage. The OpenEMR load testing that Jake did is a good indication, since it's pretty much the same architecture as OpenMRS. See OpenEMR load-testing results.
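As an illustration of what such a load test could look like (this is not the methodology Jake used for OpenEMR), here is a small TypeScript/Node script that hits the standard OpenMRS REST patient search with concurrent workers and reports throughput and latency percentiles. The base URL and credentials are placeholders.

```ts
// Rough load-test sketch: N concurrent workers issuing read-heavy patient searches.
const BASE_URL = 'https://openmrs.example.org/openmrs'; // placeholder
const AUTH = 'Basic ' + Buffer.from('admin:Admin123').toString('base64'); // placeholder
const WORKERS = 50;
const REQUESTS_PER_WORKER = 100;

async function worker(): Promise<number[]> {
  const latencies: number[] = [];
  for (let i = 0; i < REQUESTS_PER_WORKER; i++) {
    const start = Date.now();
    const res = await fetch(`${BASE_URL}/ws/rest/v1/patient?q=Smith&limit=10`, {
      headers: { Authorization: AUTH },
    });
    await res.text(); // drain the body so the connection can be reused
    latencies.push(Date.now() - start);
  }
  return latencies;
}

async function main() {
  const start = Date.now();
  const results = await Promise.all(Array.from({ length: WORKERS }, worker));
  const latencies = results.flat().sort((a, b) => a - b);
  const elapsedSeconds = (Date.now() - start) / 1000;
  console.log(`requests:   ${latencies.length}`);
  console.log(`throughput: ${(latencies.length / elapsedSeconds).toFixed(1)} req/s`);
  console.log(`p50 latency: ${latencies[Math.floor(latencies.length * 0.5)]} ms`);
  console.log(`p95 latency: ${latencies[Math.floor(latencies.length * 0.95)]} ms`);
}

main().catch(console.error);
```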
While AWS computing resources make it easy to run OpenMRS at scale and handle lots of traffic, we also need a way to provide some of that on-premise. I believe we have a consensus that Kubernetes is the right approach there. It isn't the go-to container orchestrator on AWS, since ECS is more cost-effective and easier than running Kubernetes on AWS (EKS). On the other hand, Kubernetes gives us a way to provide implementations with tooling that works regardless of whether they choose to deploy on-premise or with a cloud provider.
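As a sketch of that "same tooling everywhere" idea, the snippet below uses cdk8s to define an OpenMRS Deployment in TypeScript and synthesize plain Kubernetes manifests that can be applied to EKS or to an on-premise cluster alike. The cdk8s-plus package version and the container image name are assumptions.

```ts
import { App, Chart } from 'cdk8s';
import * as kplus from 'cdk8s-plus-27';

const app = new App();
const chart = new Chart(app, 'openmrs');

// Single-replica OpenMRS backend; image name is an assumption.
// NB: cdk8s-plus applies restrictive securityContext defaults that may need tuning for this image.
const backend = new kplus.Deployment(chart, 'openmrs-backend', {
  replicas: 1, // single replica until the platform supports more
  containers: [{ image: 'openmrs/openmrs-reference-application-3-backend', portNumber: 8080 }],
});

// Expose it inside the cluster; an Ingress would sit in front of this in practice.
backend.exposeViaService({ serviceType: kplus.ServiceType.CLUSTER_IP });

// `cdk8s synth` writes dist/openmrs.k8s.yaml, which kubectl/Argo CD/etc. can apply anywhere.
app.synth();
```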
We also touched on the topic of multi-tenancy. Some implementations need a multi-tenant environment where all patient data can be easily accessed regardless of location. Others are interested in deploying independent instances that share only a portion of the data (such as a patient index); the benefit is that all those instances can be managed from one place by an experienced DevOps team that can act on issues or apply upgrades, and the hardware resources are shared and balanced between smaller and larger implementations.
There can also be a hybrid approach, where one instance is used by all clinics in a single region and shares all data, while instances in other regions share data only when needed, yet all of them are deployed on the same infrastructure.
The benefits of the hybrid approach are that you can better utilize your hardware (not all clinics require a dedicated instance), you are still more resilient to failures (a failure in one region does not affect others), patient data is not accessible to everyone everywhere (lower security risk), and you are more flexible: clinics can be moved to separate instances if they grow, or combined if they shrink.
Another aspect that was mentioned again is that O3 increased bandwidth usage in Kenya many times over. The UI team will continue to look into that.
I also heard someone mention that implementations have already invested in on-site hardware and it would be a loss to stop utilizing it. I believe it could actually be a benefit: this hardware could act as Kubernetes nodes, and one would be able to manage the nodes from one place, provided a network connection is available. It would be truly distributed computing power and storage, which is very resilient to hardware failures.
Hopefully I haven't missed anything, but no worries: @grace should be able to share a recording of the call, and I'd encourage you to tune in and listen if you are interested in the topic.
Thanks again and I hope we can keep this momentum going!
Finally, here is a handful of links on the topic so far:
Hello all! It was an absolute pleasure to present the architecture to everyone. Thank you for having me!
Happy to talk and/or collaborate with anyone interested. My work email is mevoracf@amazon.com, my personal email is jacob.t.mevorach@gmail.com, and my GitHub handle is @Jmevorach.