Choosing a way forward for database changesets

burke · October 3, 2016, 4:28pm

We currently have a community-priority ticket under assessment that suggests deleting old changesets:

https://issues.openmrs.org/browse/TRUNK-4830

This is closely related to a ticket that Wolf Schlegel has worked on to reorganize our liquibase files based on best practices:

https://issues.openmrs.org/browse/TRUNK-3638

I wanted to bring the discussion from those tickets into Talk for community feedback. After speaking with @darius about this today, here are our thoughts:

Complete and merge the work from TRUNK-3638 to reorganize our liquibase files by minor version.
Update our liquibase-schema-only.xml and liquibase-core-data.xml to start at a more recent version (preferably latest minor version, but at least latest major version), to speed up initial installs.
Instead of deleting old changesets as suggested by TRUNK-4830, we keep them so they can still be run (even if not supported) by people who might wish to upgrade from older, unsupported versions of the database.

Thoughts? Concerns? Alternative suggestions?

We’ll plan on moving forward as described (updating tickets accordingly) by next Monday (10-Oct) unless we get some better suggestions here.

Cheers,

-Burke

wyclif · October 3, 2016, 6:01pm

Aren’t there any times when we try to fix a bug and have to add a liquibase changeset in a maintenance release?

burke · October 3, 2016, 7:06pm

Sure. I was assuming those changesets would go into the file for the minor version.

burke · October 10, 2016, 6:13pm

Bump. If we don’t hear any additional concerns by the end of this week, we’ll proceed as @darius suggested.

wolf · January 25, 2017, 5:29pm

Hey there,

happy new year everybody.

I started working on the ticket and have some questions regarding the suggested approach to versioning.

The last story I worked on was TRUNK-3638 which covered versioning of liquibase update files. That story implemented the following model:

v2.0 v2.1 … v2.6 v3.0

where all versions are included by liquibase-update-to-latest.xml in their natural order.

This story suggests a snapshot-based approach to versioning the content of liquibase-core-data.xml and liquibase-schema-only.xml:

v2.0 v2.1 snapshot-A v2.2 … v2.6 snapshot-B v3.0

where the content of v2.0 and v2.1 is moved (and merged) into snapshot-A. Later, the content of snapshot-A and the files v2.2 to v2.6 are moved (and merged) into snapshot-B. When initialising OpenMRS after v3.0 has been added, the initialisation uses snapshot-B and v3.0 only.

I have two questions about this:

Moving and merging the content of multiple files into a snapshot takes effort and feels like rewriting history in a versioning system. What is the benefit gained by this approach that justifies this effort?
Would a versioning model as used in TRUNK-3638 work as well?

I have added this question to TRUNK-4830 and Daniel Kayiwa suggested I repeat it here. Please reply by adding a comment to the actual ticket.

Many thanks, Wolf

dkayiwa · January 27, 2017, 1:23pm

@burke, @darius and others, do you have any response to the above?

darius · January 27, 2017, 2:55pm

(I have a bad cold, and this is complicated, so…)

As I understand it:

Problem: when you do a new installation, it takes an unnecessarily long time to run 10 years’ worth of changesets
Constraint: someone with a 10-year-old OpenMRS installation should be able to run all the historic changesets and upgrade

Any solution that satisfies both of these is fine, and I think that those of you who have been looking closely at TRUNK-3638 and TRUNK-4830 (Daniel, Wyclif, Rafal, Wolf) can figure this out.

If you need feedback from Burke and me, then we can schedule this for a design call, and look at it in depth. But I would not expect that to be necessary.

(Maybe Burke actually has an answer for this offhand.)

darius · February 4, 2017, 1:01am

@wolf, I’ve mostly recovered from my cold, and had a chance to think about this a bit more…

The problem we’re trying to solve, and what makes this worth the effort, is that when you install from scratch it takes an unacceptably long time to set up your database. This isn’t a problem for new production installations, but for a dev it’s a pain if this takes 10+ minutes. In fact I was reminded of this thread today when I did a fresh install to replicate a bug, and I went to lunch after starting the setup because I knew it would take so long.

I don’t interpret the ticket to be saying we should move changesets around. Rather:

we should reorganize changesets so each new changeset that is written is in a version-specific file (not a single monolithic liquibase-update-to-latest). (Maybe this already happened in TRUNK-3638, I haven’t looked at that.)

for major releases (and maybe minor ones, but not maintenance ones) we should generate new snapshots of liquibase-schema-only and liquibase-core-data. These snapshots are not assembled by merging all the incremental changesets; rather, they’re whatever the liquibase equivalent is of dumping an (up-to-date) db schema, so the output only includes “create table” and not “alter table”.
the OpenMRS startup machinery needs to be modified so that when we’re doing a first-time installation we use the latest snapshot, and from that point we only run the incremental changesets that come after that one. (I.e. if you originally installed version 2.0 then upgrade to 2.1 it would not run the 1.12 changesets.)

Other things that occurred to me:

when a fix is backported to different release versions a changeset should be copied (e.g. it could be in 2.0.4 and 1.12.6)
we should introduce some CI plans to test the different variations (or at least run some thorough testing alongside each new release to ensure that upgrades work as expected)

Does this make sense?

burke · February 6, 2017, 5:07am

@darius is right. The goal was to move from having a single, massive file of changesets to a separate changeset file for each minor version. So, for example, while master is on 2.1-SNAPSHOT, we’d have

pre-1.9-changes.xml	All legacy changes to reach 1.9.0
1.9-changes.xml	All changes introduced in 1.9.x
2.0-changes.xml	All changes introduced in 2.0.x
2.1-changes.xml	All changes to be introduced in 2.1.x

Ideally, we’d be able to autogenerate a schema as we released each minor version, so, for example, if master was on 2.1-SNAPSHOT, we’d have something like this:

pre-1.9-changes.xml	All legacy changes to reach 1.9.0
1.9-schema.xml	1.9 data model
1.9-changes.xml	All changes introduced in 1.9.x
2.0-schema.xml	2.0 data model
2.0-changes.xml	All changes introduced in 2.0.x
2.1-changes.xml	All changes to be introduced in 2.1.x

While all of the changes could be run in sequence to update the database, a new installation could start with the latest data model and only run the (one, small) subsequent changeset.
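
As a rough illustration of that selection rule, here is a minimal sketch that uses the file names from the listing above. The function names and the "-schema.xml" naming check are assumptions made for this example, not the actual OpenMRS startup code, and it assumes each schema file describes the data model as it stands before the changes file of the same version.

```python
# File names copied from the listing above, in the order they would be applied.
FILES = [
    "pre-1.9-changes.xml",
    "1.9-schema.xml",
    "1.9-changes.xml",
    "2.0-schema.xml",
    "2.0-changes.xml",
    "2.1-changes.xml",
]


def is_schema_snapshot(name):
    # Assumption: snapshot files are recognisable by their "-schema.xml" suffix.
    return name.endswith("-schema.xml")


def files_for_fresh_install(files):
    """Start at the latest schema snapshot and run only what follows it."""
    snapshot_positions = [i for i, name in enumerate(files) if is_schema_snapshot(name)]
    if not snapshot_positions:
        # No snapshot yet: fall back to running every file.
        return list(files)
    return files[snapshot_positions[-1]:]


def files_for_upgrade(files):
    """An existing database skips the snapshots and runs the change files in order."""
    return [name for name in files if not is_schema_snapshot(name)]


print(files_for_fresh_install(FILES))
# ['2.0-schema.xml', '2.0-changes.xml', '2.1-changes.xml']
print(files_for_upgrade(FILES))
# ['pre-1.9-changes.xml', '1.9-changes.xml', '2.0-changes.xml', '2.1-changes.xml']
```

On the upgrade path the full list of change files can be handed to liquibase as-is, since liquibase records executed changesets in its DATABASECHANGELOG table and skips the ones already applied.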

As @darius points out, backporting a changeset would include duplicating it into earlier versions and updating any subsequent data models accordingly.

wolf · May 7, 2017, 5:22pm

That makes perfect sense to me, thanks Darius.

Over the last couple of months, two hackathons and one talk about OpenMRS kept me busy, but I am now back at story 4830.