Multiple Maven Repositories hosting org.openmrs.module artifacts?

As a follow-up from the thread on Artifactory and Bintray, and this great thread about who should have write access to our Maven Repository, I’m hoping to gain more insight into a recommended approach for organizations building and maintaining their own distributions.

TL;DR - Does it make sense to have multiple Maven repositories hosting artifacts under the “org.openmrs.module” group id, and if not, what should be our recommendation moving forward?

Per @cintiadr’s post, my understanding is that we have recently limited access to the main Artifactory Maven repository to OpenMRS Bamboo, which means that only modules that are built in an OpenMRS Bamboo Job can be published to our Artifactory.

We have made some exceptions to this policy for organizations with extensive pre-existing workflows, and my organisation (PIH) is one of those. But the question persists as to whether this is a suitable long-term solution and whether this is an approach that we can scale to any group that needs it.

In an attempt to experiment with having our own Maven repository for PIH, I recently acquired an OSS Bintray license for our organization and, following the article I found here, tried to configure publishing snapshots to their OSS Artifactory instance at oss.jfrog.org (OJO). I tried this out with one of our modules (rwandareports) to see if I could get it to work, and as I was linking this to JCenter to activate Artifactory, I was presented with this prompt:

Enter a Maven Group ID under which your artifacts can be uploaded. Your groupId is expected to be uniquely used by you.

Well, the Maven Group ID for this module is “org.openmrs.module” - which I think is probably the case for 95% of modules out there, whether they are “OpenMRS community modules” or not - so this seemed like a bad idea, and I was left wondering what to do next.

My question comes back to this: does it make sense to have multiple Maven repositories in the wild that are all hosting artifacts under the “org.openmrs.module” group id, and if not, what should be our recommendation moving forward?

Why am I asking about this now?

I’m currently thinking about this for supporting our RwandaEMR distribution, as we have several partners who are collaborating and several modules that are hosted under different GitHub organizations: OpenMRS, Partners In Health, and github.com/rwanda-rbc-emr.

For many of the collaborators, using the OpenMRS SDK and distribution projects (both of which rely upon all artifacts existing in cloud Maven repositories) is a new process, and I am trying to get this set up across this landscape. I imagine the same need exists in Kenya, Uganda, Nigeria, Haiti, and elsewhere, and I am curious to learn from those groups (e.g. @ssmusoke) how they have dealt with Maven artifacts.

To add another wrinkle, even if I could set this up in a PIH Maven Repository, I’m not sure that would be a perfect solution, as PIH is just one partner in the collaboration. OpenMRS is in many respects a more ideal organization under which to publish and house these shared build artifacts. In this case, OpenMRS Artifactory would be similar to the OpenMRS Talk forums that focus on Kenya or Haiti or Bahmni: it would provide community tools and resources to enable more effective collaboration across groups.

So, this leads me back to publishing these country-specific modules and other artifacts to the OpenMRS Artifactory instance, and for us to establish a way to scale this if possible.

I’d very much like to hear other thoughts and ideas and potential approaches to this.

Thanks! Mike

@mseaton UgandaEMR is using the OpenMRS Maven infrastructure, and will keep doing so for as long as we are able to.

I am currently keeping an eye on the GitHub Package Registry https://github.blog/2019-05-10-introducing-github-package-registry/ which is where we shall probably end up over time.

I have always strongly argued that the OpenMRS Maven/Bintray infrastructure should support all open source community modules which are under the org.openmrs.module namespace. That way implementors do not spend time, as you are now, trying to make things work.

The code is open source, so why not share infrastructure too (happy to stay away from the more expensive Bamboo if there are costs)? If partners need to contribute to the hosting costs, then so be it, as part of our being part of a global ecosystem.

cc @burke @paul @cintiadr @dkayiwa

@ssmusoke thanks for the helpful comments. I think the practices that UgandaEMR is setting are very informative and helpful.

I had not seen GitHub Package Registry before, so thanks for linking to that. The link you shared points to a beta announcement from last year, but it appears that it is now fully live and available. And it supports Maven, including SNAPSHOTS.

Do you, or does anyone here, have experience using GitHub Packages? If we can consolidate around GitHub permissions on specific repositories for both code authoring and code release/deployment, rather than needing separate access privileges on a centrally managed server, that would seem like an improvement, though there may be drawbacks to consider.

Thanks! Mike

From the Maven (tool) perspective, to keep things consistent, we need to make sure that a given groupId/artifactId/version doesn’t exist in more than one repository. An easy way is to ensure we don’t have duplicated artifactIds; another way is to ensure people have different groupIds.
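The consistency rule above can be sketched as a small check. This is only an illustration: the repository names, the `examplemodule` artifact, and all version numbers are made up, and a real check would have to read each repository's actual index rather than a hard-coded dictionary.

```python
from collections import defaultdict


def find_conflicts(repositories):
    """Return the coordinates that are published in more than one repository.

    `repositories` maps a repository name to an iterable of
    (groupId, artifactId, version) tuples.
    """
    seen = defaultdict(set)
    for repo_name, coordinates in repositories.items():
        for gav in coordinates:
            seen[gav].add(repo_name)
    return {gav: sorted(repos) for gav, repos in seen.items() if len(repos) > 1}


# Hypothetical listings, for illustration only.
repositories = {
    "openmrs-artifactory": [
        ("org.openmrs.module", "rwandareports", "1.0.0"),
        ("org.openmrs.module", "examplemodule", "1.0.0"),
    ],
    "other-org-repo": [
        ("org.openmrs.module", "rwandareports", "1.0.0"),  # same coordinates: conflict
        ("org.openmrs.module", "rwandareports", "1.1.0"),  # different version: no conflict
    ],
}

conflicts = find_conflicts(repositories)
for gav, repos in conflicts.items():
    print(":".join(gav), "is published in", ", ".join(repos))
```

With unique artifactIds or unique groupIds per organization, the same coordinates can never appear in two listings, so a check like this would always come back empty.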

I’m ok with either approach. We’ve been so far recommending the former, but changing to the latter is not a bad idea.

My problem is a fear of scaling (and securing) this. I cannot grant write permissions on everything to everyone. Keeping different groups isolated (to make the attack blast radius smaller and to prevent multiple CIs accidentally deploying the same thing in a race condition) requires a fair bit of manual work.

Also, how big does an implementer have to be to be entitled to a repo? Do you have to have a community? Artifactory wasn’t created to support a use case like that, so it doesn’t seem to be the way to scale implementations (having someone with full control of credentials and so on).

We need to ensure our Artifactory is hosting artefacts we have control of (we cannot accidentally have malware hosted by a bot), and also give others enough autonomy to release.

Nope, it’s not about cost, it’s about scaling this to the whole community, including those who are still pretty small or creating their first module.

It needs to be a solution that works for students with their first module and for big orgs (not necessarily in the same way, but we need to cater for everyone).

When I say security, I mean this:

  • If my CI Maven credentials get hacked (because someone had a password like Password123), malware doesn’t get written to all UgandaEMR and PIH modules
  • If a small org/student gets their account hacked, all my modules are still safe
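As a rough sketch of that isolation goal (a conceptual model only, not how Artifactory permissions are configured): if each credential can write only to the artifactIds it was explicitly granted, a compromised credential exposes just that set. The credential names and module names below are hypothetical.

```python
def can_deploy(permissions, credential, artifact_id):
    """Return True only if this credential was explicitly granted this artifactId."""
    return artifact_id in permissions.get(credential, set())


def blast_radius(permissions, compromised_credential):
    """List the artifactIds an attacker could overwrite with one stolen credential."""
    return sorted(permissions.get(compromised_credential, set()))


# Hypothetical grants, for illustration only.
permissions = {
    "central-ci": {"module-a", "module-b"},
    "country-distro-ci": {"country-module"},
    "student-account": {"my-first-module"},
}

print(blast_radius(permissions, "student-account"))
print(can_deploy(permissions, "student-account", "module-a"))
```

The manual work mentioned above is in maintaining a grant table like this for every group that wants to publish.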

Artifactory is not software written with that threat model in mind. There’s an expectation that all users are ‘trusted’ at some level. I certainly trust you two, but would only /dev/5 be able to do it? That seems unfair to me.

If our Artifactory hosts malware, we could be affected as well through any browser or client accessing it. There’s also an expectation that our hosts are only going to be serving good and trustworthy content. So far, I’ve never been worried about downloading and executing any artefact from our Maven repository, as it’s reasonably sealed.

I fear that ship has already sailed. It is likely possible to start encouraging more unique group ids, but so much of the community is used to using the org.openmrs.module package that change will take a lot of time. I wouldn’t even be surprised if there were assumptions baked into code around this here or there, though I really hope not.

Regardless, if JFrog instructs those applying to use Artifactory for their code that their group ids must be unique to them, there might actually be some checking done around it, and I didn’t want to assume that it was just our own usage we had to monitor. So this led me to hold off, knowing that there are org.openmrs.module artifacts in other Maven repos.

Also, in this particular case, the rwandaemr module already has artifacts in the OpenMRS Artifactory instance. So we’d have to have a very careful transition plan to ensure we upgraded code to a new version before starting to use a new Artifactory instance moving forward, and then even after doing so the same module would have artifacts in 2 different Maven repositories. Seems like it could get ugly. Is there a process where one could move from one Maven Repo to another?

Right, which is why I was trying to use a different Artifactory instance. But that’s where the issue above occurred, so what’s the right way to proceed?

I wanted to explore this option as well, so I spent a bunch of time trying this out, and it was super easy to get set up and I was really excited about it at first, but then I hit 2 major limitations:

  1. Every codebase ends up published to its own Maven repository. There isn’t one central Maven repository they all end up in. That means if you have 4 modules, and depend on each in your distribution POM, you’ll need to have 4 different <repository>...</repository> entries in your POM (one for each module).

  2. The above could be manageable if read access was anonymous. But despite the fact that the repository is public, code is public, etc., you can only install a package from Maven if you first authenticate using your GitHub token to demonstrate that you have read access on the repository. So you need to have a private settings.xml file in your .m2 directory that contains server entries for each of these 4 repositories, all with your GitHub credentials in them. Ugh.
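To make the two limitations concrete, here is a small sketch that generates the entries involved for 4 modules. The organization name, module names, repository ids, username, and token placeholder are all hypothetical; the URL pattern `https://maven.pkg.github.com/OWNER/REPOSITORY` is the one GitHub documents for its Maven registry. Each `<server>` id in settings.xml has to match the corresponding `<repository>` id in the POM.

```python
import xml.etree.ElementTree as ET

OWNER = "example-org"  # hypothetical GitHub organization
MODULES = ["module-a", "module-b", "module-c", "module-d"]  # hypothetical modules


def build_repositories(owner, modules):
    """Build the <repositories> block a distribution POM would need."""
    repositories = ET.Element("repositories")
    for module in modules:
        repo = ET.SubElement(repositories, "repository")
        ET.SubElement(repo, "id").text = f"github-{module}"
        ET.SubElement(repo, "url").text = f"https://maven.pkg.github.com/{owner}/{module}"
    return repositories


def build_servers(modules, username, token):
    """Build the matching <servers> block for ~/.m2/settings.xml."""
    servers = ET.Element("servers")
    for module in modules:
        server = ET.SubElement(servers, "server")
        ET.SubElement(server, "id").text = f"github-{module}"  # must match the repository id
        ET.SubElement(server, "username").text = username
        ET.SubElement(server, "password").text = token
    return servers


repositories = build_repositories(OWNER, MODULES)
servers = build_servers(MODULES, "your-github-username", "YOUR_GITHUB_TOKEN")

ET.indent(repositories)  # pretty-printing; requires Python 3.9+
ET.indent(servers)
print(ET.tostring(repositories, encoding="unicode"))
print(ET.tostring(servers, encoding="unicode"))
```

Both blocks grow by one entry per module, and every developer or CI machine needs its own copy of the `<servers>` block with real credentials.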

Until those issues are resolved (and in threads I have seen, GitHub representatives seem to acknowledge this as a major limitation and problem), I won’t be looking at GitHub Packages as a Maven Repo.

For now, I’m sticking with the OpenMRS Artifactory, but it would be great to find a proven alternative that can play nicely in our ecosystem.

Best, Mike