Hello OpenMRS community! First of all, I am grateful for all the amazing work and contributions from @dkayiwa @ibacher @raff that brought the distro to its current state.
I would like to raise some important topics related to OpenMRS GoLive, based on hackathon sessions with other implementers and on feedback from pilots. I've gone through other questions and posts, but didn't find information or documentation on approaches for creating snapshots and maintaining backups, setting up GitHub Actions CI/CD pipelines for Ubuntu servers (on-premise and cloud providers), and implementing best practices for production with Tomcat, NGINX, and monorepo performance optimization. (The tech stack above was chosen based on common usage and simplicity.)
Topic 1: Creating GitHub Actions CI/CD Pipelines for Ubuntu Servers (On-Premise and Cloud Providers)
GitHub Actions provides a simple platform for building pipelines.
This is based on experience shared over the last few years of deploying and maintaining servers and community versions, and of providing demos to several stakeholders.
The majority of implementers tend towards a common tech stack,
yet we still have no common best-practice specification for O3 and no data consistency recommendations.
Monitoring and productivity: The majority of teams have integrated monitoring and notification services (e.g., Slack, email) into their pipelines to receive alerts and track the pipeline's progress.
Still, we also need production monitoring, since the success of a product/platform is not defined by how well it runs on my (developer) machine.
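To make the production monitoring point concrete, here is a minimal sketch in Python of a probe that could run from cron or a scheduled pipeline job and post to Slack when the server is down or slow. The URLs, the 5-second latency threshold, and all function names are my own placeholders, not something the distro ships. The only external behaviour assumed is that a Slack incoming webhook accepts a JSON body with a `text` field.

```python
import json
import time
import urllib.error
import urllib.request


def classify(status_code, latency_s, max_latency_s=5.0):
    """Return 'ok', 'slow' or 'down' for one probe result."""
    if status_code is None or status_code >= 500:
        return "down"
    if latency_s > max_latency_s:
        return "slow"
    return "ok"


def probe(url, timeout_s=10.0):
    """GET the URL and return (status code or None, latency in seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        status = err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        status = None  # no answer at all: DNS failure, refused, timeout
    return status, time.monotonic() - start


def build_alert(url, state, status_code, latency_s):
    """Build the JSON body for a Slack incoming webhook."""
    text = f"[{state.upper()}] {url} status={status_code} latency={latency_s:.2f}s"
    return json.dumps({"text": text}).encode("utf-8")


def run_once(url, webhook_url):
    """Probe once and notify the webhook only when the state is not 'ok'."""
    status, latency = probe(url)
    state = classify(status, latency)
    if state != "ok":
        request = urllib.request.Request(
            webhook_url,
            data=build_alert(url, state, status, latency),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(request, timeout=10).close()
    return state
```

It would be called as `run_once("https://openmrs.example.org/openmrs/", "<your Slack webhook URL>")`, where both values are hypothetical and need to be replaced. A single HTTP probe only shows that the server answers; it says nothing about database health or O3 frontend behaviour, which a real specification would have to cover.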
Topic 2: Ensuring Backup, Incremental Backup, and Automated Snapshots
The provided Docker distro file offers a convenient way to deploy OpenMRS, but it's crucial to ensure data protection through regular backups and snapshots. Given the breaking changes that come with upgrades/migrations, a step-by-step guide to achieve this would increase adoption.
Backup strategy: Most teams back up on a release-cycle basis, but this doesn't protect against disasters or several days of troubleshooting. Setting up backup automation, with tools that provide not only regular backups but also visuals, would make management easier.
Implement incremental backups: Some teams have run out of storage, which blocks additional deployments or prevents the O3 app from functioning properly.
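On the storage problem above: one piece that is easy to automate is a retention policy, so that old backups are pruned before the disk fills up. The sketch below is only an illustration in Python. The policy (keep everything from the last 7 days, then the newest backup per ISO week for 4 more weeks) and the function names are assumptions of mine, not an agreed community practice, and it does not perform incremental backups itself.

```python
import shutil
from datetime import datetime, timedelta


def backups_to_delete(timestamps, now, keep_days=7, keep_weeks=4):
    """Return the backup timestamps that a simple retention policy would drop.

    Keeps every backup newer than keep_days, plus the newest backup of each
    ISO week during the keep_weeks weeks before that.
    """
    daily_cutoff = now - timedelta(days=keep_days)
    weekly_cutoff = daily_cutoff - timedelta(weeks=keep_weeks)
    keep = set()
    newest_per_week = {}
    for ts in timestamps:
        if ts >= daily_cutoff:
            keep.add(ts)
        elif ts >= weekly_cutoff:
            week = tuple(ts.isocalendar())[:2]  # (ISO year, ISO week number)
            if week not in newest_per_week or ts > newest_per_week[week]:
                newest_per_week[week] = ts
    keep.update(newest_per_week.values())
    return sorted(ts for ts in timestamps if ts not in keep)


def has_free_space(path, min_free_fraction=0.2):
    """Return True if the filesystem holding path has enough free space."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total >= min_free_fraction
```

A deployment job could call `has_free_space("/var/backups")` (a hypothetical path) before pulling new images and fail early with a clear message, instead of failing halfway through because the disk is full.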
Topic 3: Best Practices for Production, Tomcat, NGINX, and Monorepos Performance Optimization
This has been one of the most common challenges, considering that implementers are often limited to desktops or laptops with fewer resources than some developers' machines.
Tomcat Performance Optimization:
Fine-tuning Thread Pool: Guidance here would help with the common configurations of single or multiple instances running on a single DB, or multi-tenancy.
Connection Pooling, Caching, Load Balancing and Scaling:
Memory Leaks: On a single instance with more than 300k patients, web interaction is awful (even with better server resources than a health facility has).
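For the thread pool and connection pool items above, a rough starting point is Little's law: concurrent requests are roughly throughput multiplied by average response time. The Python sketch below only does that arithmetic. The headroom factor, the reserved connection count, and the function names are assumptions for illustration, and real values should come from load tests on the actual server. The result is the kind of number that would feed into the `maxThreads` attribute of a Tomcat Connector.

```python
import math


def estimate_max_threads(requests_per_second, avg_response_s, headroom=1.5):
    """Estimate request threads via Little's law plus a safety margin."""
    concurrent = requests_per_second * avg_response_s
    return math.ceil(concurrent * headroom)


def pools_fit_database(instances, pool_size, db_max_connections, reserved=10):
    """Check that all instances sharing one DB stay under its connection limit.

    reserved leaves room for admin, backup and monitoring connections.
    """
    return instances * pool_size <= db_max_connections - reserved
```

For example, 40 requests per second at 0.5 s average response time gives `estimate_max_threads(40, 0.5) == 30`. With several instances on a single DB, `pools_fit_database(3, 50, 151)` returns `False`, which suggests either smaller pools or a higher connection limit on the database.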
NGINX Performance Optimization:
Web Server Configuration: Fine-tune NGINX configuration settings, such as the number of worker processes, worker connections, and file descriptors, to match the expected load and maximize concurrent connections. Adjust buffer sizes and timeouts based on the characteristics of your application.
Caching and Content Delivery: Across multiple scenarios, it's noticeable that the caching approaches of @esm and NGINX often conflict, which results in unexpected behavior (multiple requests, sequential blocking requests, unresponsive tables).
Load Balancing and Proxying: Docker Swarm makes this aspect easy to manage, but with the existing distro config, issues arise when setting up multiple replicas for the db and backend services. Implementing it would also give implementers additional confidence in the consistency and reliability of the overall platform.
Request and Response Optimization: There is not yet consensus on whether implementing gzip compression and HTTP/2 support would improve the handling of requests and responses.
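One way to move the gzip question from opinion to data is to measure how much typical responses shrink. The Python sketch below compresses a synthetic JSON payload and reports the ratio. The payload shape is invented for the example and is not a real OpenMRS REST response, so the measurement should be repeated on responses captured from a pilot server. It says nothing about HTTP/2, which would need a separate test.

```python
import gzip
import json


def sample_payload(rows=500):
    """Build a synthetic, repetitive JSON list (hypothetical field names)."""
    records = [
        {
            "uuid": f"00000000-0000-0000-0000-{i:012d}",
            "display": f"Patient {i}",
            "gender": "F" if i % 2 else "M",
            "voided": False,
        }
        for i in range(rows)
    ]
    return json.dumps({"results": records}).encode("utf-8")


def gzip_ratio(payload, level=6):
    """Return compressed size divided by original size (lower is better)."""
    compressed = gzip.compress(payload, compresslevel=level)
    return len(compressed) / len(payload)
```

Running `gzip_ratio(sample_payload())` gives a first impression of the bandwidth saving. The CPU cost on the server and the effect on perceived latency still have to be measured under real load before recommending a setting.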
I've gone through additional conversations covering the above topics in several threads.
Still, from several engagements with other countries, similar questions and old problems remain unanswered, and this discourages adoption and implementation.
How do we collect existing information/knowledge and share it in a practical way? How can we establish best practices, create a specification, and collect real feedback from a pilot?
Cc: @eudson @alaboso @michaelbontyes @samuel34 @amugume @dkibet @sinte