Adjustment to 3.x Frontend Process - “Test” Environment

Hi Everyone,

I’ve noticed some gaps in our process and would like to propose a solution, but first I want to lay out the thinking behind it. If we can agree on what philosophy best fits our community, that can structure the discussion of whether and how we should adapt our processes.

Situation:

Our development process currently goes something like this: a developer pulls the latest code locally, develops on their machine, makes a PR, and after approval this is merged into the ‘main’ or ‘master’ branch. Any time a PR is merged, the code is automatically incorporated into the dev3 instance (https://dev3.openmrs.org/), which has the benefit that anyone can instantly see it, but also means that if a fatal bug is accidentally introduced, dev3 is broken for everyone. From there we have the option to manually cut a ‘version’ on a per-repo basis; this step has not been done in almost 3 months. Cutting a version publishes to npmjs.com, updating the ‘next’ package tag and producing a stable identifier (e.g. 3.2.1-pre.1149).

Next, O3 (https://o3.openmrs.org/) can be updated manually via the “release” stage of the CI build, which is set to build with whatever the “latest” version is. Because this step is run manually there is some control over what ends up on O3, although since nothing is tested before a release is cut, the quality is not guaranteed, is usually ‘yellow’, and most of the time is simply not known. We are also unable to answer “what features are on O3?” without reading the release notes of each individual package (and hoping the feature is documented there). Currently even this minimal process is not being attended to, and O3 is almost three months behind dev3. O3 being out of date and ‘not very stable’ is often frustrating for those using it to demonstrate the capabilities of the 3.x product to prospective clients.
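For anyone unfamiliar with the ‘next’/‘latest’ distinction on npmjs.com, here is a minimal sketch (TypeScript, runnable on Node 18+ where fetch is global) of how one could ask the registry which version each dist-tag currently points to. The package name is just an example, not a statement about our actual pipeline.

```typescript
// Hypothetical helper: ask the npm registry which version each dist-tag
// ("latest", "next") currently points to for a given frontend module.
async function distTags(pkg: string): Promise<Record<string, string>> {
  const res = await fetch(`https://registry.npmjs.org/${encodeURIComponent(pkg)}`);
  if (!res.ok) {
    throw new Error(`Registry lookup failed for ${pkg}: ${res.status}`);
  }
  const meta = await res.json();
  return meta['dist-tags']; // e.g. { latest: "3.2.0", next: "3.2.1-pre.1149" }
}

distTags('@openmrs/esm-login-app').then((tags) => console.log(tags));
```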

In summary, we currently have a process that seems geared towards “move fast and break things”, and I would say we move so fast that we don’t know (or at least I have a hard time answering) what features have been implemented and whether they are working. Because there is no quality documentation, the community relies heavily on asking one another what is working, which is a poor substitute because it is (1) slow, (2) limited by that person’s understanding of the product, and (3) incomplete, since the feature set is so extensive.

Ideal Process Qualities:

This is the important part. Please disagree with me here or add additional requirements. Also, to preface: yes, some of these already exist. An ideal process should be able to do the following things:

  • Allow developers to constantly develop new features and bugfixes (Dev)
  • Have a stable production-level showcase site (Demo).
  • A versioned reference application (RefApp) which can be tested for features and quality on an instance isolated from Dev and Demo (Test).
  • Ability to select specific releases of each repo to include in a RefApp version.
  • Ability to select a stable RefApp version to promote to Demo.
  • Ability to make hotfixes to RefApp on Test in order to improve stability before promoting to Demo.
  • Ability to ‘roll back’ a Demo version to a more stable one if desired.
  • The RefApp deployed on Demo should at all times have a known list of features and flaws.
  • Publicly available historical RefApp quality records that show when features and bugs were introduced.
  • A constant release cadence: releasing from Dev to Test once a week or more, and from Test to Demo once a week.
  • A streamlined process that allows ultra-fast releases: the ability to release a new feature from Dev to Test to Demo within an hour (or, as Flickr says, “10 full releases per day”).
  • A tight develop-test cycle: bugs are found early by testing and fed back into the ticketing system.
  • A process which people will actually use.

Proposed solution:

Throughout the above I have been hinting at the need for a third deployment site, which we can call Test. It will be isolated from fast-moving Dev so that the package of packages stays stable long enough to test, and also isolated from slow-and-stable Demo, since candidate versions on Test might fail a quality check and therefore never be promoted to Demo.

How it works:

  • PRs and merging work as before. All merges result in an update to the Dev instance.
  • Versions are made manually as appropriate. Depending on the cadence, it is fine to have several versions of a package released per day.
  • When a version is cut, npmjs.com is updated, and a CI Docker image can be built immediately.
  • GAP: It should be possible (for the purpose of hotfixing) to manually select the versions of each repo to include in the CI image. With our current setup I’m not sure how this is possible (see the sketch after this list for what a pinned manifest might look like).
  • Images are available to promote to Test.
  • An ‘Evaluator’ (QA, developer, product manager, business analyst) chooses an image to promote to Test. They run either a quick Smoke Test (10 minutes), a Change Test (10-30 minutes), or a Release Test (1-5 hours). Notes from this test are kept on the community Wiki.
  • If a candidate fails the quality check, the packages causing errors must be fixed (either on the main branch or on a new release branch), versioned, and promoted to Test again.
  • If it passes the quality check, we can then promote the image to Demo.
  • If a bug is later found on Demo, it is possible to redeploy an older image (and we have documentation to know which images are good).
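On the GAP item above, here is a minimal sketch of what a pinned-version manifest could look like. The shape, the field names, and the idea that the CI image build would consume such a file are assumptions for illustration only; this is not a description of our current tooling.

```typescript
// Hypothetical pinned-version manifest for assembling a RefApp image.
// The point is that every frontend module is pinned to an exact version
// instead of floating on "latest"; the versions below are illustrative.
interface RefAppManifest {
  refAppVersion: string; // label stamped on the Docker image and test notes
  frontendModules: Record<string, string>; // package name -> exact npm version
}

const candidate: RefAppManifest = {
  refAppVersion: '2021-10-18-rc1',
  frontendModules: {
    '@openmrs/esm-login-app': '3.2.1-pre.1149',
    '@openmrs/esm-patient-registration-app': '3.1.0',
  },
};

export default candidate;
```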

New Dependencies:

This new process requires new dependencies:

  • Test environment
  • Storing release images in Artifactory
  • Testing

Who is responsible for testing?

For now, community volunteers. This is a topic worth supporting for anyone who wants testing to be done. I’m signing myself up for the first couple of releases because I want to learn more about the product and be able to communicate status and trajectory effectively to stakeholders. The Product Owner (Grace) has a strong interest in this being done so they can know the state of the Demo and prepare properly. Anyone who is selling or implementing this product has an interest in knowing which versions are stable. Testing the app is also a great way to quickly introduce it to newcomers, interns, and fellows. One interesting process we could try is a rotating duty where each developer is responsible for testing for one week. There is hardly a need for dedicated full-time staff here, although as the process matures one testing expert could be available a quarter to half of full time to carry out testing activities and to mentor anyone interested in helping with or learning about this process.

I think having both a Test environment and the quality assessments will help us keep an up-to-date and stable Demo environment, help us quickly visualise product feature implementation and health, and let us spot and fix bugs quickly. The transparency will improve team alignment, act as a tool that can guide conversations, and lead to a much-improved onboarding experience.

Please let me know your thoughts. Thanks,

1 Like

I think this sounds good, but the people whose feedback is really critical here are people closer to the clients—e.g. @grace, @jdick and @mksd .

Can I ask why you’d want to use Bamboo rather than GitHub Actions, though?

1 Like

Historically, we’ve used QA, UAT, and Demo environments:

  • QA = quality assurance, automatic deploy (i.e., dev)
  • UAT = user acceptance testing, manual deploy (i.e., test)
  • Demo = demonstration site, manual deploy (i.e., demo)

I would favor anything that takes us in the direction of:

  • Increasing automated testing and reducing dependence on manual testing. As for automated testing… whatever balance we can muster that keeps lights green and lets us spend more time developing new code & new tests relative to fixing fragile tests.
  • Embedding testers/testing into our dev processes rather than having separate developer teams and testing teams (everything I read suggests this is an anti-pattern).

1 Like

I am pipeline agnostic, although I would say the pipeline should be able to support the following:

  • “Pointy Clicky” - A GUI that is very easy to understand and use.
  • Anyone with a minimal understanding of what an image is can choose which image to deploy to which site.
  • Permission locked. Not everyone can click the button.
  • Deployments should be able to be rolled back
  • It should be apparent which images have which versions of each package. These builds should be labeled so they can be referenced in tests and found again in the long-term image store (Artifactory); a sketch of one way to derive such a label follows this list.
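On that last point, here is a minimal sketch of deriving a short, stable tag from the pinned module versions, so that a tester’s notes, the deployed site, and the image in Artifactory can all reference the same identifier. The manifest shape and the tag format are assumptions, not existing tooling.

```typescript
// Hypothetical labelling step: derive a stable tag from the pinned versions.
import { createHash } from 'node:crypto';

function imageTag(frontendModules: Record<string, string>): string {
  // Canonicalise the module list so the same set of versions always hashes
  // to the same tag, regardless of key order.
  const canonical = Object.entries(frontendModules)
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([name, version]) => `${name}@${version}`)
    .join('\n');
  const digest = createHash('sha256').update(canonical).digest('hex').slice(0, 8);
  return `refapp-3.x-${digest}`; // e.g. "refapp-3.x-1a2b3c4d"
}
```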

Is there an obvious candidate here @bistenes ? What would you suggest?

No testing or QA should be done on dev. This is not a good idea for the reasons I’ve outlined above.

No one will argue that automated testing is unimportant; that is an active topic for another thread. However, we do need humans in the testing loop. If you don’t put a human in the loop, the first humans to catch bugs will be the clients, and none of us want that to happen. I would also point out that the process I’ve outlined above has a testing burden of about 10 minutes per week, which I would call a minimal dependence.

Embedding testing into our development process is precisely what I’ve outlined above. Devs should test. The PO should test. BAs should test. This isn’t a separate team; it is everyone, collectively responsible for the quality of the product we ship.

I’m interested in hearing what @grace and @mksd think about this. They are perhaps the most interested parties.

1 Like

There are pros and cons to your very nice proposal. :slight_smile:

If the people who have used the dev server for testing have found it broken so many times that it is worth the extra effort of maintaining another server, then I would vote yes to your proposal. But if it is the opposite, then you know what my vote is. :smile:

What I have always loved about testing on the dev instance is the faster manual notification (yelling from testers) when anything is broken, given that our test coverage is still too low to be relied upon for automatic notifications. The second reason is that it pushes us to add more automated tests to our pipeline, which moves us towards continuous deployment, where, as developers, we are confident that any commit which is not fit for production (in our case, one that would break the dev instance) will automatically result in a build failure and hence never be deployed to the dev instance.

2 Likes

@dkayiwa thanks so much for this explanation. I now understand why the QA server is preferred for testing.

Thanks for looking @dkayiwa

I think the dev server is working very well. Instead, our demo site is the one that is constantly broken.

I am very much in favor of keeping the dev instance. As you say, it is exactly the right place for developers to get immediate feedback on whether their work integrates with the rest of the app. There is definitely still a need for automated testing, and as you say, it should live as far upstream as possible. We already have it working such that the deployment fails if the tests do not pass, and that is a good thing.


After talking today with @dkayiwa and @burke, and separately with @mksd, it sounds like we are all in agreement that a test server should exist. It is actually very similar to the setup that OpenMRS had before with the 2.x frontend (uat-refapp), but the community has not yet implemented this for 3.x. Talking with @ibacher, he had some good ideas for how we can make this happen in Bamboo so it is as user-friendly as possible.

If no one has objections, Ian and I can gradually donate some time to set this up over the next few weeks.

2 Likes

Useful reminder indeed. +1

A couple of months later, we have made some progress, and the end vision has shifted slightly:

  1. We (or at least I) have come to realize that the local (Docker) setup is a very important piece of this puzzle and is missing from the original diagram. Ian has been working hard on transitioning the whole pipeline to use the same Docker containers, so this is at least in progress. From our call yesterday it sounds like this is its own domain, with plenty of work still needed to make it fast and predictable.
  2. Dev3 has a PR up which will reduce its dependence on the CDN, making it more predictable and less error-prone. This is a work in progress, so merging the PR does not necessarily mean the effort is complete.
  3. Test3 is up, which has been a huge help already for (a) quickly verifying breaking changes in dev against what was working before, and (b) serving as a backup remote backend when dev3 goes down. Also, the product/QA team has already started testing against my proposed requirements rubric to give some idea of stoplight status as the project progresses. Next steps here are to push to Test regularly, record a smoke test and image number, and promote passing images to Demo.
  4. Demo is up, looks fairly recent, and is pre-populated with data. I think this is a great improvement over the last few months, when it had been stale since about March.

It seems like we are starting to have the infrastructure pieces in place, and now the goal is to fine-tune the process so that we are shipping constantly. In order to do that, the next steps are:

  1. Dial in the local Docker setup so that FE devs are never blocked and have consistent, sandboxed build environments with no external dependencies. This will help isolate network and package-load issues, and will allow more disruptive test configurations (e.g. making many changes to the database).
  2. Focus on integration testing, both with the QA test3 RefApp and by bringing Cypress CI tests to GitHub Actions once we merge the monorepos. As a good example of “we have tests, but they don’t catch the right things yet”: the patient-registration app has broken several times in the last few months even while all CI tests were green. Combining the two testing methods above will let us catch these bugs so that registration doesn’t keep breaking, which will ultimately allow us to develop faster (see the smoke-test sketch after this list).
  3. Test regularly on test3, record the results, and then promote successful candidates to demo. Using the process will help us get better at it, and will hopefully bring much-needed transparency about the state of O3.
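To make point 2 concrete, here is a minimal sketch of the kind of end-to-end smoke test meant there, written for Cypress. The route and the visible text are assumptions about the registration app’s UI, not a description of how it actually renders today.

```typescript
// Sketch of a Cypress smoke test for the patient-registration app.
describe('patient registration smoke test', () => {
  it('loads the registration form', () => {
    cy.visit('/openmrs/spa/patient-registration'); // assumed route
    // Wait generously for the microfrontend to load, then check for a
    // piece of text assumed to be on the registration form.
    cy.contains('Register', { timeout: 30000 }).should('be.visible');
  });
});
```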

I may have put this into a nice bullet-point list, but there is a ton of work to do here, both for setting up the infrastructure and for the ongoing maintenance once it is running. DevOps affects many, many stakeholders, all of whom require the CI pipeline to have high availability; otherwise they can be totally blocked, costing teams hundreds of engineer-hours. It is critical that we have a dedicated resource here.

Anything to add @ibacher , @raff , @dkayiwa , @dkigen ?

3 Likes

I agree it is crucial to have devs not rely entirely on one environment for development (dev3).

We also need to add source maps to the O3 frontend image so they are available for debugging issues with the O3 frontend as a whole, and not just individual components.
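A minimal sketch, assuming the frontend modules are bundled with webpack: emitting external source maps is essentially a one-line configuration change. Where the resulting .map files would be served from in the O3 image is a separate decision.

```typescript
// Sketch only: enable external source maps in a webpack build.
import type { Configuration } from 'webpack';

const sourceMapConfig: Partial<Configuration> = {
  devtool: 'source-map', // emit .map files alongside the production bundles
};

export default sourceMapConfig;
```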

We now have the installed versions of O3 components listed in the dev tool, but it would be great to include a repo link as well for convenience.
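One hedged option for the repo link: npm registry metadata usually carries a repository field, so a tool that already knows a package name could look the link up. The field is common but not guaranteed to exist for every package, and whether the dev tool should fetch it at runtime or bake it in at build time is an open question.

```typescript
// Hypothetical lookup of a package's repository URL from the npm registry.
async function repoLink(pkg: string): Promise<string | undefined> {
  const res = await fetch(`https://registry.npmjs.org/${encodeURIComponent(pkg)}`);
  if (!res.ok) {
    return undefined;
  }
  const meta = await res.json();
  // The registry's "repository" field, when present, is typically
  // { type: "git", url: "git+https://github.com/..." }.
  return meta.repository?.url?.replace(/^git\+/, '');
}
```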