How to reduce OpenMRS 3.x Docker Failures

jnsereko · August 25, 2022, 2:46pm

In the o3 QA test tooling, we spin up the nightly Docker instance for OpenMRS 3.x using a docker-compose RefApp 3.x file.

However, for the past 3 weeks, I have seen real drama with this Docker instance.

For about 14 days the patient chart was not loading (it kept displaying "loading"), so I had to wait for it to start working before I could continue my work on testing the patient chart.
On Thursday last week, a build was pushed and it worked, hooray!
I tested it on Monday or Tuesday and it was working very well, hooray!
Yesterday, I pulled changes and the server wasn’t working; we had to delete all the volumes and rebuild to get it working. The backend finally worked, hooray, but I got a blank screen at http://localhost/openmrs/spa/login
I deleted all images today and rebuilt everything. Now http://localhost/openmrs/spa/login returns a page without CSS.

Because I don’t have much experience with how the Docker images we are using are released, I just have to wait until a release is made and then pull changes. This is really slowing down our progress, because there is no point in pushing code you are sure is going to fail.

If spinning up the instance on a local machine requires this many tweaks, how tricky is it going to be on GitHub Actions?

How can we improve the o3 Docker release workflow so that it is less error-prone and less painful for those running it?

cc @raff @grace @dkayiwa @zacbutko @dkigen @jayasanka @pasindur2

PS: We caught a failure in Patient Registration, so that’s a simple success story.

zacbutko · August 26, 2022, 11:16am

Thanks as always for testing @jnsereko. I think what you are seeing, at least with the patient chart, is not so much a breakdown of the Docker instance but real testing failures in the reference application. I know that sometimes it’s hard to tell when the testing setup is slightly broken and you can fix it, vs. what are really upstream failures. On the testing side I would lean more towards assuming it is a system failure and reporting regularly when a build doesn’t even compile. If the application or patient chart doesn’t even load, then QA gets to go home early.

I appreciate you being vocal about feedback. Thank you for bringing this up in Talk. Really, it would be awesome if QA reported testing status to frontend, platform, and devops whenever a test is complete, so as often as you do it. It is also important to track tests vs. releases: documenting test outcomes session by session and storing these reports in a centralized place that the devs can refer to helps us figure out what went wrong and when.

Thanks again for testing! This is very helpful!

raff · August 30, 2022, 2:33pm

@zacbutko seems to be right here that it’s an issue with the code and not the Docker setup. There are a number of steps we need to implement:

We need a build of RefApp 3.x whenever changes to its components are made. To make it easier, I’d start with building it at least once a day with the most recent components available.
We need a way to easily trace which components have changed between builds of RefApp 3.x. That will make it easier to find the components and commits that broke the build. The versions should be included in the Bamboo logs or, ideally, as a GitHub commit.
We need basic automated UI and REST API tests of RefApp 3.x on each RefApp 3.x build. This gives us a way to fail a build if something is broken and look into the issue.
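
To illustrate the second step, here is a minimal Python sketch of comparing the component versions recorded for two builds. It is only an illustration, not how the pipeline works today: it assumes each build writes out a plain mapping of component name to version, and the function name `diff_components`, the component names, and the version strings are all made up for the example.

```python
from typing import Dict, List, Tuple


def diff_components(
    previous: Dict[str, str], current: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Return (component, old_version, new_version) for every component
    that was added, removed, or changed between two builds."""
    changes = []
    # Walk the union of names so added and removed components show up too.
    for name in sorted(set(previous) | set(current)):
        old = previous.get(name, "(absent)")
        new = current.get(name, "(absent)")
        if old != new:
            changes.append((name, old, new))
    return changes


# Hypothetical manifests: names and versions are placeholders, not real releases.
last_good_build = {"frontend-module-a": "1.0.0", "backend-module-b": "2.3.0"}
nightly_build = {"frontend-module-a": "1.1.0", "backend-module-b": "2.3.0"}

for name, old, new in diff_components(last_good_build, nightly_build):
    print(f"{name}: {old} -> {new}")
```

Printing this list into the build log (or committing the manifest itself) would give QA a short list of suspects whenever a nightly build stops working.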

grace · August 30, 2022, 6:31pm

Thanks @raff. Can you join @ibacher & @dkayiwa & @jnsereko on this Wednesday’s platform team call? (7pm CEST) We would really like to verbally lay out who is doing what on the o3 pipeline and builds as it seems different wires are getting crossed.

raff · August 31, 2022, 10:00am

Yes, I’ll join, thanks.