New storage service

raff · January 17, 2025, 10:57am

To improve the scalability of OpenMRS, we need to design a new storage service to be used instead of writing at will directly to the application directory. It needs to be used by core as well as by modules that write files to disk. One example from core is the ComplexObs data handler; an example from a module is the reporting module persisting reports.

The service requirements:

Promote the best practice of working with streams instead of file handles, to avoid loading disk data into memory.
Have at least two implementations: writing to local disk and writing to S3.
Do not allow overwriting. Data is immutable: to change an object you must delete it and create a new one. This prevents other threads from reading an object while it is being overwritten.
When persisting, indicate how the data should be treated:

replicate (needs to be replicated as soon as possible),
deferred replication (can be replicated at convenient time e.g. large PET scans),
no replica (can be recovered by other means, no replication needed),
temporary (can be discarded, available only for a single http request, stored in local storage even for s3 provider)

I’d welcome suggestions for the naming and the addition of other tags. Ideally the tags would be general enough regardless of the underlying provider. Some could be ignored like replication settings for s3.

Make it easy to back up data tagged for replication and to ignore temporary or no-replica data (useful for local storage, less needed for S3 with versioning).
Read data stored by services other than OpenMRS, given e.g. a URI.
Provide a unique URI for each data object so that it can be stored in e.g. the DB.

The service implementation would not be responsible for replication or backups, but it needs to clearly indicate to an external service how to treat the data. For local storage, for example, we would put files in separate directories and have e.g. Longhorn apply different replication and backup strategies to those directories.
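
A minimal sketch of how the local-disk requirements above could fit together, assuming hypothetical names throughout (`StorageTag`, `LocalStorageSketch`, `save`, `open`, `delete` are illustrative and not an existing OpenMRS API). It streams data to disk, refuses to overwrite an existing object, places each tag in its own directory, and returns a URI for every stored object:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

// Hypothetical tag names, taken from the proposal above; not an OpenMRS API.
enum StorageTag { REPLICATE, DEFERRED_REPLICATION, NO_REPLICA, TEMPORARY }

// Hypothetical local-disk provider. Each tag gets its own subdirectory so an
// external tool can apply a different replication/backup policy per directory.
class LocalStorageSketch {
    private final Path root;

    LocalStorageSketch(Path root) {
        this.root = root;
    }

    // Streams the data to a new file and returns its URI (e.g. to store in the DB).
    // Fails with FileAlreadyExistsException if the key is already taken,
    // because objects are immutable: delete first, then create again.
    URI save(String key, InputStream data, StorageTag tag) throws IOException {
        // Reject keys that could escape the tag directory (assumed key format).
        if (!key.matches("[A-Za-z0-9_-][A-Za-z0-9._-]*")) {
            throw new IllegalArgumentException("Unsupported key: " + key);
        }
        Path dir = root.resolve(tag.name().toLowerCase(Locale.ROOT));
        Files.createDirectories(dir);
        Path target = dir.resolve(key);
        Files.copy(data, target); // no REPLACE_EXISTING option, so no overwrite
        return target.toUri();
    }

    // Opens the object as a stream; the caller is responsible for closing it.
    InputStream open(URI uri) throws IOException {
        return Files.newInputStream(Paths.get(uri));
    }

    // Returns true if the object existed and was removed.
    boolean delete(URI uri) throws IOException {
        return Files.deleteIfExists(Paths.get(uri));
    }
}
```

This sketch does not hide a partially written file from concurrent readers; a real provider would likely write to a temporary name and move the file into place once complete.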

These are the features I have in mind now. I’d be curious to hear from others esp. about modules that write to disk to include different corner cases.

@jacob.t.mevorach, @burke, @ibacher, @dkayiwa, @mseaton, @mksrom, @angshuonline

See also Enabling Horizontal Scaling in OpenMRS - #7 by raff for some background.

ibacher · January 17, 2025, 1:25pm

What occurs to me as possibly problematic are situations where we read files from the application directory, but never write them in-app. This is, for example, often how HTMLFormEntry forms are loaded (by passing a file path relative to the application directory as an argument). Similarly, module-spa is designed to serve the O3 frontend from a folder of files, but it doesn’t handle placing the files anywhere.

In terms of stuff that writes to the application directory that isn’t covered here:

Attachments generates image thumbnails for images that are saved in their own folder
Initializer writes checksum files so it doesn’t reapply the same changes

raff · January 17, 2025, 1:42pm

Thanks! Those examples are what I’m looking for.

To run DB initialisation and apply Initializer upgrades only once, the approach is either to have a single replica tagged with an environment property such as RUN_INITIALIZATION, or to write a flag to the DB indicating that one of the instances is running initialisation, which stops the others from starting up until it is done.
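
A minimal sketch of the environment-property variant, assuming the property holds `true` on exactly one replica and is unset or `false` elsewhere (the class and method names are hypothetical, and the value format is an assumption):

```java
import java.util.Map;

// Hypothetical helper; not part of openmrs-core or Initializer.
class InitializationGate {
    // Returns true only when RUN_INITIALIZATION is set to "true" (case-insensitive).
    // A missing property yields false, so untagged replicas skip initialisation.
    static boolean shouldRunInitialization(Map<String, String> env) {
        return Boolean.parseBoolean(env.get("RUN_INITIALIZATION"));
    }
}
```

At startup a replica would call `InitializationGate.shouldRunInitialization(System.getenv())` and run initialisation only when it returns true.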

Anyway, Initializer should probably write its checksums to the DB. Image thumbnails fit well in the storage service.

We need to look into HTMLFormEntry. @mogoodrich could you please point us to some code?

I don’t expect the spa module to be used in cluster deployments so it’s not necessarily needed to make changes there.

mseaton · January 17, 2025, 2:06pm

@raff / @ibacher -

I don’t think the htmlformentry use case will be a problem, as those are read-only and installed from initializer. So as long as all nodes are initialized at startup with the same configuration, it should not be a problem. They are not updated as the app is used and are not considered data.

If it is a problem, we could look to untangle it. The approach really started as a convenience to ensure htmlforms in the database were updated and could be reloaded with changes visible while developing forms without the need to constantly redeploy. But now we have lots of code that is configured to load configuration at runtime from the filesystem.

Note: this isn’t limited to htmlforms. There are other configuration files within the .OpenMRS/configuration directory that are not just loaded by Initializer at startup, but are also loaded at runtime by the app. Some specific examples are images, css, javascript, and anything that the UI framework is able to load as a resource with the file: prefix, which can load files within configuration and its subdirectories. Again though, this is all read-only once the application starts up, so hopefully that isn’t a problem.

Mike

raff · January 21, 2025, 10:08am

Thanks @mseaton! Yes, read-only config files are a non-issue here and we will not be using the storage service for them.

I’ll be joining the platform call tomorrow if anyone wants to discuss the new storage service and some of the imminent goals for OpenMRS instance replication.

raff · January 22, 2025, 8:00am

Let’s discuss on the platform call next week instead of this one. Thanks.

burke · January 23, 2025, 2:01pm

Here’s a suggestion for tags that might be more intuitive for developers (thinking from their application development perspective rather than the horizontal scaling perspective):

PRIORITY_BACKUP
BACKUP (default if omitted)
NO_BACKUP
TEMP

raff · January 23, 2025, 3:43pm

Backup is different from replica. Replica is a copy of data kept up to date with the source. The point of replica is to improve performance and speed up failover.

Backup is a copy of data taken at a point in time and stored securely. The point of backup is to be able to restore data.

I can imagine some data being marked as both replicate and backup, while other data, e.g. a DB volume, is marked only as backup since the DB handles replication on its own.
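
One way to picture this is to treat replication and backup as two independent flags rather than a single scale. A minimal sketch, with hypothetical names (`DataPolicy`, `VolumePolicies`) and hypothetical example volumes:

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

// Two independent concerns: a live copy (replica) and a point-in-time copy (backup).
enum DataPolicy { REPLICATE, BACKUP }

// Hypothetical examples of how volumes could be labelled.
class VolumePolicies {
    // Files that should be both replicated and backed up.
    static final Set<DataPolicy> UPLOADED_FILES =
            Collections.unmodifiableSet(EnumSet.of(DataPolicy.REPLICATE, DataPolicy.BACKUP));
    // The DB handles replication on its own, so the volume only needs backup.
    static final Set<DataPolicy> DB_VOLUME =
            Collections.unmodifiableSet(EnumSet.of(DataPolicy.BACKUP));
    // Temporary data needs neither.
    static final Set<DataPolicy> TEMPORARY =
            Collections.unmodifiableSet(EnumSet.noneOf(DataPolicy.class));
}
```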

raff · February 28, 2025, 10:12am

We have gone through some design changes and ended up with the following StorageService:

And refactored openmrs-core to use it for ComplexObs at:

I didn’t implement tags to simplify migration of old data and storage. It will be up to admins to either create a single volume for all data or separate volumes with different replication/backup policies.

There’s also a new StreamDataService, which uses a pipe mechanism to move data without making in-memory copies. It is there to promote best practice and make it easy to work with streams instead of in-memory copies.
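
To illustrate the general idea of a pipe, here is a minimal sketch built on the JDK's `PipedInputStream` and `PipedOutputStream`. It is not the actual StreamDataService API; `StreamWriter` and `PipeSketch` are hypothetical names. A producer writes on a background thread while the consumer reads, so only the pipe's fixed-size buffer is held in memory:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

// Hypothetical callback that produces data by writing to an output stream.
interface StreamWriter {
    void write(OutputStream out) throws IOException;
}

class PipeSketch {
    // Returns an InputStream that is fed by the writer on a background thread.
    // At most 8 KiB (the pipe buffer) is held in memory at any time.
    static InputStream pipe(StreamWriter writer) throws IOException {
        PipedInputStream in = new PipedInputStream(8192);
        PipedOutputStream out = new PipedOutputStream(in);
        Thread producer = new Thread(() -> {
            // Closing the output end signals end-of-stream to the reader.
            try (OutputStream o = out) {
                writer.write(o);
            } catch (IOException e) {
                // Sketch limitation: a failed write just ends the stream early.
                // A real implementation would need to surface this to the reader.
            }
        });
        producer.setDaemon(true);
        producer.start();
        return in;
    }
}
```

A caller could then hand the returned stream to anything that accepts an `InputStream`, for example `InputStream in = PipeSketch.pipe(out -> report.writeTo(out));`, where `report` stands for any hypothetical object that can write itself to a stream.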

StorageService and StreamDataService will be backported so that modules can migrate to the new mechanism. The services won’t be used in core until the 2.8.0 release, where we start to use them for ComplexObs.

openmrs-core 2.8.0 may be the first release that we plan to be ready for horizontal replication i.e. multiple openmrs-core replicas running at the same time to provide high availability and distribute load. Please follow the progress at TRUNK-6299.