Migrating to Jetstream 2!

oclclient-dev would probably be good to keep up (it’s essentially tracking the master branch). oclclient-clone was intended to be short-lived and can be dropped, AFAIK.


@raff I had deployed oclclient-prd to bele; not sure if we should delete it from there, then? Also, we seem to be having certificate issues there with https://openmrs.openconceptlab.org/


Remaining machines

Please note I’m keeping the vms docs up to date at all times

  • jinka: redirects and website migrated successfully :tada:. mua and campo can be powered off. I’ll keep an eye on it, but if there are any issues, you can manually change the DNS back to the old server and it should work automatically.
  • maji: same as before. Discourse wasn’t starting the last time I checked.
  • goba and gode: miscellaneous services; I haven’t even started on these.

Known issues

  • I reckon backups for the atlassian jira/wiki/bamboo are probably not running correctly or completing successfully
  • We still have the issue of having to restart LDAP every couple of months to pick up new certs. To be honest, I might have added new certificate issues there…
  • I was forced to add -refresh=false to our terraform plans, as terraform was attempting to create new data volumes (sketch below). Not sure what’s happening; maybe it will solve itself on Jetstream
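For reference, this is roughly what running a plan with the refresh disabled looks like; the plan file name here is just an example, not what our wrapper actually uses:

    # tell terraform not to refresh state, so it stops trying to recreate the data volumes
    terraform plan -refresh=false -out=plan.tfplan
    terraform apply plan.tfplan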

My bad, I didn’t notice it. I deleted oclclient-prd from bonga and let it run on bele. Fixed the certificate issue.

Added oclclient-dev to bonga. I’ll leave the oclclient-clone config around, but I won’t deploy it unless someone asks for it.


Update of the day:

  • maji has Discourse running. I was forced to move to the stable branch. Will migrate talk over the weekend.
  • gode is down? Not sure what happened; I didn’t touch it.

Thanks @cintiadr, @ibacher, and @raff for the migration. Great to see the progress!

I noticed we can’t edit any pages on the OpenMRS wiki (trying to edit any page returns a System Error page). It appears to be caused by: Confluence MySQL database migration causes content_procedure_for_denormalised_permissions does not exist error. The solution is to include --routines in the mysqldump command when backing up, so that stored procedures introduced since Confluence 7.11.0 are included. I see a mysqldump.sh.j2 ansible template; I’m guessing we’d want to add --routines to its OPTIONS, assuming that’s what is used to back up our Confluence data. I’m leery of making these changes, since I don’t want to break things when we only have 10 days left to complete the migration.
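For illustration, a hedged sketch of what the backup command would look like with the extra flag; the credentials and database name here are placeholders, the real ones come from the ansible template:

    # --routines includes stored procedures/functions in the dump,
    # which Confluence 7.11.0+ needs (e.g. content_procedure_for_denormalised_permissions)
    mysqldump --single-transaction --routines \
      -u "$DB_USER" -p"$DB_PASS" confluence > confluence-backup.sql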

Can we make a new backup from our Jetstream1 Confluence instance using the --routines option? I think we need this before our wiki will work again.


In case you haven’t seen, Burke was correct: I copied the routines and it seemed to do the trick.


I come with bad news about talk. I spent hours trying to get the migration going; I wanted to finish it over the weekend so as to disrupt you all as little as possible. It wasn’t successful at all.

Let’s see what they have to say.


Today’s update:

Please note I’m keeping the vms docs up to date at all times

  • maji: I’m worried about talk. Hopefully the request we opened will be enough to get help.
  • gode: staging for addons and atlas. Done :tada:
  • goba: migrated addons and atlas. Still missing implementation, quizgrader, shields and radarproxy. Should be done this week.

I will continue to delete Jetstream 1 machines as the week progresses.


Known issues

  • I reckon backups for the atlassian jira/wiki/bamboo are probably not running correctly or completing successfully
  • We still have the issue of having to restart LDAP every couple of months to pick up new certs. To be honest, I might have added new certificate issues there…
  • I was forced to add -refresh=false to our terraform plans, as terraform was attempting to create new data volumes. Not sure what’s happening; maybe it will solve itself on Jetstream

Maybe it’s because our split config only upgrades web by default. So, while our Talk might report itself as, say 2.9.0.beta7, it’s really only that version for the web component and an older version (last manual rebuild of data) for the data component. That could cause havoc for a migration that expects the data to be tests-passed but is getting data from some arbitrary older state.
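If so, a rough sketch of bringing both containers to the same version, assuming the standard two-container setup with data.yml and web_only.yml:

    cd /var/discourse
    # rebuild data first, then web, so both containers end up on the same Discourse version
    ./launcher rebuild data
    ./launcher rebuild web_only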

Did you rebuild both web and data on prod before creating the backup for migration?

I’m creating a whole new server from scratch. I did delete all the data and rebuilt both containers dozens of times.

So it turns out the problem was the branch we were using to clone the discourse launcher. Somewhere along the line it changed from master to main, but our ansible continued to point to master.
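For anyone curious, the manual equivalent of what the fixed automation now does (the destination path is an assumption):

    # clone the launcher from the renamed default branch instead of the long-gone master
    git clone -b main https://github.com/discourse/discourse_docker.git /var/discourse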

The new talk server is empty, but finally up, with the new version I needed, 2.9.0.beta7. I’ll schedule the talk migration for a few hours from now, my lunch time; I think it will be the least disruptive time.


  • maji: New talk is up. I will attempt to migrate it again tomorrow.
  • goba: I migrated all the little things there. I’m not sure if radarproxy and shields are working… they showed an empty screen when accessed from the browser, so I’m not sure if I broke something else.

  • Somehow the bonga machine was tainted (marked for full recreation) in terraform. I undid that because I don’t think we need to delete it right now (see the sketch after this list).
  • Previous known issues still apply
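For the record, undoing the taint is roughly the following; the exact resource address here is an assumption:

    # find the resource address, then clear the taint so terraform won't recreate the machine
    terraform state list | grep bonga
    terraform untaint openstack_compute_instance_v2.bonga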

And here I thought GitHub was supposed to have some clever redirects to handle that!

Whoops! My fault! I forgot to undo that (when it lost connectivity, I was originally just going to try recreating it, before I found out I could solve it much more easily…)

Something we should probably do across the board in OpenMRS. I changed my default branch for personal repos from master to main years ago.
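For a single repo, the rename is roughly this sketch (assuming the gh CLI is available; GitHub’s web UI can do the default-branch switch too):

    # rename the branch locally and push it
    git branch -m master main
    git push -u origin main
    # switch the default branch on GitHub
    gh repo edit --default-branch main
    # once nothing points at master any more, delete it from the remote
    git push origin --delete master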

Yay! You’re awesome, @cintiadr!

You can test shields with https://shields.openmrs.org/plan/TRUNK/MASTER

You can test radarproxy with https://radarproxy.openmrs.org/openmrs%20radar.json
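Or, from a terminal, a quick smoke test of both (curl -f exits non-zero on an HTTP error):

    curl -fsS https://shields.openmrs.org/plan/TRUNK/MASTER
    curl -fsS 'https://radarproxy.openmrs.org/openmrs%20radar.json'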

I added this info to our ITSM wiki, including a new page for radarproxy.

In any case, these both are working fine on goba. Thanks again @cintiadr!


Alright, let’s see if this email lands in my mailbox. Testing testing.

I have a suspicion that it’s not technically viable to easily do that, due to how git works. That said, a warning in the discourse launcher logs would have kept me sane!


Alright, all machines are migrated! :smiley: Took a hot minute, but here we are.

I will slowly delete all the old machines, a few per day. Over the weekend, I will delete the old networking as well.


I’ll probably be creating tickets for all follow-up tasks.


Yay!!! I hereby declare August 2022 “OpenMRS Cintia Month” to acknowledge the heroic efforts you made in transitioning our infrastructure to Jetstream2 this month!!! :partying_face:

-Burke

/cc @jennifer

p.s. They will likely pull the plug on Jetstream1 at the end of this month, so don’t be surprised if it is unreachable as of 1st August.


I’m really glad I took time in the past to automate everything with terraform, ansible/puppet and docker, as well as all the backups.

We changed datacenters and networks, recreated all machines, upgraded the operating system, upgraded all the atlassian tools, and all services (except talk) went over pretty smoothly.

If things weren’t automated, we’d probably have had a lot more work to get it done. Yey to automation!


Thank you @cintiadr! It’s an amazing achievement! Great to have you with us!

Yey to @cintiadr and automation!

This is amazing! Thanks for all your work on this @cintiadr!!!

@cintiadr You overcame so many obstacles to achieve this win. Your perseverance is an inspiration to me, and you’ve earned all the rewards coming your way. I always knew you could do it, and I’m incredibly proud of you.

You really deserve this recognition. Thanks @ibacher and @raff for always checking on our super girl.


And all Jetstream 1 machines should now be powered off. :upside_down_face: RIP little thingy. I also deleted all network components via terraform.

Our automated docs are reflecting that. Datadog will soon show all those machines as gone (they are currently inactive, I think it takes 24h for datadog to give up on a machine).

I cleaned up all the old code in ansible and terraform to remove all the things needed to support ubuntu 16/20 and Jetstream 1.


In order to get Terraform working for you moving forward, make sure to update your openrc-personal file and remove the Jetstream 1 creds. You can see that Jetstream 2 creds were also renamed.
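As a rough sketch, the cleaned-up file ends up with only the Jetstream 2 entries, along the lines of the standard OpenStack openrc variables; the placeholders below are not the real values:

    # Jetstream 2 credentials only; delete any leftover Jetstream 1 exports
    export OS_AUTH_URL=<jetstream2 auth url>
    export OS_PROJECT_NAME=<project>
    export OS_USERNAME=<username>
    export OS_PASSWORD=<password>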

Also, make sure to run ./build.rb init docs to install the new modules in that folder.

Everything else should work exactly as expected. I still have to confirm whether our backups are working, but that doesn’t need to happen this weekend :smiley:

And with that, I declare us officially migrated.


Woohoo!!!