Hello everyone, I am studying the “Support for horizontal scaling of OpenMRS instances” project idea. So far I have researched the second scope item listed in the project: adjust code in core and O3 modules to use distributed caching. I would like to share the architectural approach that I think best fits this scope and request reviews from mentors.
TRUNK-6302 introduced Infinispan to replace the local cache layer. However, Infinispan is installed at the platform level, while legacy modules and O3 modules still rely on local caches. The purpose of the project is to find those hidden local caches across OpenMRS core and the O3 modules so that Infinispan is fully utilised. Externalising HTTP sessions and optimising heavy FHIR/REST endpoints could be additional objectives.
The implementation revolves around 3 points:
- Removing local static caches and HashMaps: We have to remove single-JVM memory traps such as HashMap and ConcurrentHashMap fields in Spring singletons, along with any legacy use of Guava or Caffeine caches, because they handle TTL but cannot invalidate entries across multiple nodes. Legacy API calls should also be removed; we will replace them with @Cacheable annotations. Serialization needs care as well. OpenMRS uses Hibernate to talk to MariaDB. When we ask Hibernate for a Patient, it can give us a Hibernate proxy. If we use standard Java serialization (java.io.Serializable) to send that Patient over the network, Java tries to serialize the whole object graph, including the encounters list. Touching that lazy collection can make Hibernate run extra SQL queries, or throw a LazyInitializationException if the session is already closed. So the solution would be to use .proto (Protobuf) schemas for distributed domain objects.
- HTTP sessions in a cluster: We will have to introduce a new configuration class annotated with Infinispan's Spring Session annotation (@EnableInfinispanEmbeddedHttpSession or @EnableInfinispanRemoteHttpSession, depending on whether the cache is embedded or remote) so that each HttpServletRequest is intercepted and its session is stored in an Infinispan distributed cache. The risk here is legacy code that puts non-serializable objects into the session (such as massive UserContext graphs), which would cause a NotSerializableException during cluster state transfer.
- O3, REST, and FHIR acceleration: We can apply distributed caching to expensive translators (like LocationTranslatorImpl) to speed up O3 frontend rendering. If we cache these translated FHIR resources, the tricky part is invalidation: if an admin updates a Location via the core API, the FHIR cache becomes stale. Is there an existing standard event bus or listener pattern that the FHIR module can use to trigger this eviction? For massive background tasks like Cohort Builder, we can use Infinispan’s asynchronous API (putAsync) so we don’t block Tomcat threads.
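To make the staleness problem from the first and third points concrete, here is a minimal sketch in plain Java with no OpenMRS or Infinispan dependencies. All class names (SharedLocationStore, LocationChangeBus, AppNode) are hypothetical stand-ins I made up for illustration. The in-process bus only simulates what a cluster-wide invalidation mechanism would have to do, and the string "translation" stands in for an expensive FHIR translator.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical stand-in for the shared database that every node reads from. */
class SharedLocationStore {
    private final Map<String, String> namesByUuid = new ConcurrentHashMap<>();

    String load(String uuid) { return namesByUuid.get(uuid); }

    void save(String uuid, String name) { namesByUuid.put(uuid, name); }
}

/** Hypothetical event channel; in a real cluster it would have to reach every node. */
class LocationChangeBus {
    private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();

    void subscribe(Consumer<String> listener) { listeners.add(listener); }

    void publish(String uuid) { listeners.forEach(l -> l.accept(uuid)); }
}

/** One application node holding a per-JVM cache of translated locations. */
class AppNode {
    private final Map<String, String> localCache = new ConcurrentHashMap<>();
    private final SharedLocationStore store;

    AppNode(SharedLocationStore store) { this.store = store; }

    /** Simulates an expensive translation whose result is cached on this node only. */
    String getTranslated(String uuid) {
        return localCache.computeIfAbsent(uuid, id -> "Location[" + store.load(id) + "]");
    }

    /** Drops the cached entry so the next read goes back to the store. */
    void evict(String uuid) { localCache.remove(uuid); }
}

class StaleCacheDemo {
    public static void main(String[] args) {
        SharedLocationStore store = new SharedLocationStore();
        store.save("loc-1", "Outpatient Clinic");
        AppNode nodeA = new AppNode(store);
        AppNode nodeB = new AppNode(store);

        // Both nodes warm their local caches.
        nodeA.getTranslated("loc-1");
        nodeB.getTranslated("loc-1");

        // An admin renames the location through node A; only node A evicts.
        store.save("loc-1", "OPD");
        nodeA.evict("loc-1");
        System.out.println(nodeA.getTranslated("loc-1")); // Location[OPD]
        System.out.println(nodeB.getTranslated("loc-1")); // Location[Outpatient Clinic] (stale)

        // With a change event that reaches every node, each subscriber evicts its copy.
        LocationChangeBus bus = new LocationChangeBus();
        bus.subscribe(nodeA::evict);
        bus.subscribe(nodeB::evict);
        bus.publish("loc-1");
        System.out.println(nodeB.getTranslated("loc-1")); // Location[OPD]
    }
}
```

Node B keeps serving the old name until something tells it to evict. That is the gap a distributed or invalidation-mode cache is meant to close, and it is why the FHIR module needs some hook into core updates.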
We can define cache templates (local, distributed, invalidation) in infinispan-api.xml. Each module would then only need a minimal cache-api.yaml that extends these templates.
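For the HTTP session point above, one low-risk preparation step might be to audit what legacy code puts into the session before any clustering is switched on. The sketch below uses only the JDK and assumes the session store relies on Java serialization, which is where a NotSerializableException would come from. FakeUserContext and SessionAttributeAudit are hypothetical names for illustration, not existing OpenMRS classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical session attribute: Serializable itself, but holding a field that is not. */
class FakeUserContext implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String username = "admin";
    // java.lang.Object does not implement Serializable, so this field breaks serialization.
    private final Object nonSerializableHandle = new Object();
}

class SessionAttributeAudit {
    /** Returns the names of attributes that Java serialization rejects. */
    static List<String> findNonSerializable(Map<String, Object> attributes) {
        List<String> offenders = new ArrayList<>();
        for (Map.Entry<String, Object> entry : attributes.entrySet()) {
            // Serialize into a throwaway buffer; only success or failure matters here.
            try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
                out.writeObject(entry.getValue());
            } catch (NotSerializableException e) {
                offenders.add(entry.getKey());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return offenders;
    }
}

class SessionAuditDemo {
    public static void main(String[] args) {
        Map<String, Object> attributes = new LinkedHashMap<>();
        attributes.put("locale", "en_GB");
        attributes.put("userContext", new FakeUserContext());
        System.out.println(SessionAttributeAudit.findNonSerializable(attributes)); // [userContext]
    }
}
```

A check like this could run in a test or a debug-only listener to list offending attributes early, instead of discovering them during state transfer on a live cluster.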
Any insights, warnings, or historical context on these specific areas would be very helpful!