Limitations of Migrating from OpenMRS ID to a new SSO System

cintiadr · June 11, 2023, 8:42am

Indeed, that’s something I never came back to.

Last time we had an update from those Atlassians, they let us know that SCIM would not be a solution for us. It provides authorisation only, not authentication.

The only reasonable solution seemed to be creating an SMTP relay, or finding a cloud service that could do it for us.

Do we think we have someone that could experiment with that?

cintiadr · June 11, 2023, 9:32am

I’ve added this to Slack, but here is probably a better place to document it. We have the following options:

1. Don’t attempt any magic. Atlassian Jira and wiki will have their own user base, and authentication will have nothing to do with Talk. That will probably lead us to disable external users and OpenMRS ID altogether.
2. We create a subdomain address for all our users, e.g. cintiadr@users.openmrs.org. That will require us to maintain an SSO system (or use a vendor) that supports SAML, and also maintain the SMTP relay (or use a vendor). That keeps OpenMRS ID closest to how it works today, as everyone will use the same username/password to log in to Talk, Jira and wiki.
3. We maintain an identity system that integrates with Jira/wiki via SCIM. That means the username and email will be the same in Talk and Jira/wiki, but the password or way to authenticate (e.g. Google) will be completely independent between systems.

burke · June 14, 2023, 4:50pm

This sounds like the “nightmare” scenario. We lose provenance of all content (i.e., any content you added to the wiki or JIRA prior to migration is no longer yours after migration), worst experience for community members (have to manage separate credentials, losing ownership of their prior contributions), and worst experience for infra team (lots more work juggling who should have access to what, which user in Atlassian is which user in Talk, and less opportunity for automating access).

I’m still hoping we can work this out, for example: KeyCloak + LDAP + Postfix. This would provide the least disruption to the community, avoid quotas or cost limits imposed by vendor solutions, avoid the chaos both for community members & infra team of juggling & sorting independent credentials across services, give us a fighting chance against spammers, and keep the door open to some automation down the road (e.g., granting access to services based on trust levels in Discourse).

It looks like we’re not alone in considering KeyCloak with Atlassian Access, but KeyCloak is not an officially supported IdP by Atlassian.

One key missing piece is figuring out how we could reliably & sustainably redirect email that Atlassian sends to a custom domain (e.g., @user.openmrs.org) to actual user emails within LDAP. Could a targeted Postfix solution pull this off without requiring a lot of maintenance, risking becoming a spam relay, or making emails through this custom domain appear untrustworthy (and end up routing to people’s spam folder by default)? Or maybe there is an affordable & scalable alternative to hosting our own Postfix workaround.

This would be better than the first option, but would still be an annoyance for community members. I’m hoping we can do better than this.

jayasanka · June 26, 2023, 8:13am

Hello there,

In the past few days, I have been going through this task, and here are my findings:

OpenMRS has approximately 12,000 user accounts, but the majority of them are inactive. We expect a maximum of 4,000 active users at peak usage, so the IDP should be able to support this number of users. Moreover, the pricing should be manageable for OpenMRS to afford.

Another requirement is that the IDP should support a mechanism to export the user base in case it is needed for migration purposes. Additionally, it should support login from unverified domains.

Identity Providers listed by Atlassian

After conducting research on various identity providers, the following information was found regarding their pricing:

Note: Atlassian currently does not support unverified domains. Therefore, using the identity providers listed directly may not help. Atlassian has mentioned that they are working on supporting external users here, but there is no clear expected date. However, they have mentioned the feature is available for workspaces with early access here.

Auth0

Pricing: $1420/mo (7000 users)
Discount: 50% discount available for non-profit

Azure AD

Pricing: $6/user/month
Discount: No discounts available

Cyberark

Pricing: $3/user/month
Discount: No mention of discounts for open-source projects on the document

Google Cloud Identity

Pricing: $6 USD per month
Discount: No mention of discounts for open-source projects on the document

JumpCloud

Pricing: Starts at $4/user/month
Discount: No mention of discounts for open-source projects on the document

Okta

Pricing: Starts at $2/user/month
Discount: No mention of discounts for open-source projects on the document

OneLogin

Pricing: Starts at $2/user/month
Discount: No mention of discounts for open-source projects on the document

Ping Identity

Pricing: No publicly available pricing
Discount: No information available on discounts for open-source projects

Here’s a summary:

Identity Provider	Pricing	Price per User	Discount for Open-Source
Auth0	$1420/mo (7000 users)	$0.203/user/month	No (50% for nonprofits)
Azure AD	$6/user/month	$6/user/month	No
Cyberark	$3/user/month	$3/user/month	No
Google Cloud Identity	$6 USD per month	$6 USD per month	No
JumpCloud	Starts at $4/user/month	$4/user/month	No
Okta	Starts at $2/user/month	$2/user/month	No
OneLogin	Starts at $2/user/month	$2/user/month	No
Ping Identity	No publicly available pricing	No publicly available pricing	No
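
To put the table in terms of the expected peak of 4,000 active users, here is a rough estimate in Python. It is only a sketch: the "starts at" prices are treated as flat per-user rates, Auth0 is assumed to be a flat tier price for up to 7,000 users, and Google Cloud Identity is left out because the figure above does not say whether it is per user.

```python
# Rough monthly cost estimate for the expected peak of 4,000 active users.
ACTIVE_USERS = 4_000

# Per-user list prices (USD/user/month) copied from the summary table.
# Assumption: "starts at" prices are treated as the actual per-user rate.
PER_USER_PRICES = {
    "Azure AD": 6.0,
    "Cyberark": 3.0,
    "JumpCloud": 4.0,
    "Okta": 2.0,
    "OneLogin": 2.0,
}

# Auth0 is quoted as $1420/month for 7,000 users, with 50% off for nonprofits.
AUTH0_TIER_PRICE = 1420.0
AUTH0_TIER_USERS = 7_000
AUTH0_NONPROFIT_DISCOUNT = 0.5


def monthly_costs(active_users: int) -> dict[str, float]:
    """Return the estimated monthly cost in USD for each provider."""
    if active_users > AUTH0_TIER_USERS:
        raise ValueError("user count exceeds the quoted Auth0 tier")
    costs = {name: price * active_users for name, price in PER_USER_PRICES.items()}
    # Assumption: the Auth0 tier is a flat price however many seats are used.
    costs["Auth0"] = AUTH0_TIER_PRICE
    costs["Auth0 (nonprofit)"] = AUTH0_TIER_PRICE * (1 - AUTH0_NONPROFIT_DISCOUNT)
    return costs


if __name__ == "__main__":
    ranked = sorted(monthly_costs(ACTIVE_USERS).items(), key=lambda kv: kv[1])
    for name, cost in ranked:
        print(f"{name:<20} ${cost:>10,.2f}/month")
```

Under these assumptions, the per-user vendors land in the thousands of dollars per month at 4,000 users, while the Auth0 tier stays fixed.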

Keycloak

To gain a better understanding of the fundamental concept tied with the task, I configured KeyCloak with our LDAP.

Fortunately, I was able to connect KeyCloak with LDAP successfully. To provide a reference for others, I created a dev environment and pushed it to GitHub. You can access it at jayasanka-sack/openmrs-keycloak-ldap.

Below are some screenshots of the setup:

[Screenshot: https://s3-us-west-2.amazonaws.com/secure.notion-static.com/485d6cb4-476f-466b-a873-a4f33cd74b82/Untitled.png]

The next step is to explore other self-hosting options for Keycloak in order to select the best solution that is compatible with our needs and has higher security.

Email Redirection

As Atlassian cloud does not support unverified domains, the only viable solution would be to introduce an email alias. This has been discussed here: GSoC 2023: Limitations of Migrating from OpenMRS ID to a new SSO System - #27 by cintiadr

I reviewed possible solutions and here are my findings:

Postfix

Postfix is a popular mail transfer agent that can be used for email redirection. It is free and open-source software that runs on various operating systems. With Postfix, we can configure email forwarding rules based on the sender, recipient, subject, and other criteria. Therefore, we can easily forward emails that are sent by Atlassian. Some advantages of using Postfix for email redirection include:

It is free and open-source software, so there are no licensing costs.
It provides advanced customization options for email redirection.

We previously discussed whether the service has to be restarted after adding an alias. I have found that it does not. Executing the “reload” command makes Postfix re-read its configuration files and apply the changes without a full restart. This allows us to add new users or modify existing configurations without interrupting the email service.

One downside is that emails may end up in spam. Therefore, we need to properly configure the service to avoid this.

Cloudflare Email Redirection

Cloudflare offers email redirection through its Email Routing service. The service was previously in beta, but according to their blog post, Email Routing leaves Beta, it is now generally available.

Alternatives

Alternatives to Postfix include using a mailing service such as Mailgun or Sendgrid. Prices usually start at around $35 per 100,000 emails per month. Since Jira and Confluence send email updates to users frequently, we might need a higher quota.

What’s next?

First, I need your input on the above findings.

Meanwhile, I’ll look into the following:

Research other viable alternatives to Keycloak.
Connect JIRA Cloud with SAML and test the integration.

cintiadr · June 26, 2023, 11:09am

So I guess this is the desired order for the solutions we have:

1. SAML + email redirect (authentication and authorisation)
2. SCIM (authorisation only)
3. Nothing and drop OpenMRS ID as a concept

So let’s go full steam ahead on option 1 until we exhaust it.

@burke, do you think we should email some of those identity providers and ask directly if they offer an open source licence? I’m not convinced they’d care the slightest, but google and microsoft may be willing to give us that.

If they don’t give us the licence, I agree that keycloak is probably our best option. I’m reasonably convinced we may even have tried it with jira cloud, details are fuzzy, but I don’t anticipate much drama there.

Here are my rough recommendations for next steps:

The biggest risk, in my opinion, is connecting keycloak (or our user storage) with the email redirect. That includes when creating a user, updating email, or deleting it.

I’d love to try the cloudflare one and know the cost for us. Otherwise, I’m assuming we will have to hack together a keycloak module.

I’d be willing to change the underlying user storage in any shape or form to get this connection as out of the box as it can be. If that means postfix, so be it.

If keycloak alternatives offer better integration with postfix for what we need it to do, I’d be keen on exploring it.

With that, we can create a server and manually configure SAML and email redirect (e.g. keycloak and postfix) and get a new jira/confluence instance and test it out. No need to use LDAP, use the most vanilla keycloak config you can get away with.
We may want to test if you can get a discourse instance to use keycloak to login.
I’d rather not run our own LDAP if we can. If we could use AWS or azure to keep the data storage, that simplifies our backups and everything. I’d rather we don’t own ldap, it’s a huge annoyance.
With all of that, we will need a migration plan. Do we migrate everyone? How will jira and confluence behave if the user cintiadr had a different email before? In which order do we migrate to the cloud? Do we connect the existing jira and confluence to the new ID?
We will also need to automate everything. Data storage (thinking about backups), SAML service (e.g. keycloak), redirect (e.g. postfix), integration between them. Security will now be a huge concern.

Well… then we can do the plan on step 5

burke · June 27, 2023, 4:25am

Could Postfix ldap aliases be leveraged to look up users via LDAP instead of needing to maintain a separate database of usernames?

For example, Norman Schnoggenlocher is assigned OpenMRS ID norman. We introduce an “identity service” subdomain (abbreviated is.openmrs.org), so content from Norman on the wiki is associated with the virtual (non-existent) email address norman@is.openmrs.org. Norman goes to reset his password on Atlassian Access…

Postfix directly references LDAP to validate the norman identifier and, if it exists, fetch the associated maildrop (i.e., Norman’s real email address) from within LDAP.

Could something like this work?
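
The lookup described above can be sketched in a few lines of Python. This is not Postfix configuration: a plain dictionary stands in for the LDAP directory, and Norman's real address (norman.s@example.com) is made up for illustration.

```python
from typing import Optional

# Hypothetical stand-in for the LDAP directory: OpenMRS ID -> real address.
DIRECTORY = {"norman": "norman.s@example.com"}

# The "identity service" subdomain from the example above.
VIRTUAL_DOMAIN = "is.openmrs.org"


def resolve_alias(recipient: str, directory: dict[str, str] = DIRECTORY) -> Optional[str]:
    """Map a virtual address such as norman@is.openmrs.org to a real mailbox.

    Returns None when the address is outside the virtual domain or the
    OpenMRS ID is unknown, in which case the mail should be rejected.
    """
    local_part, separator, domain = recipient.rpartition("@")
    if not separator or domain.lower() != VIRTUAL_DOMAIN:
        return None
    return directory.get(local_part.lower())


if __name__ == "__main__":
    print(resolve_alias("norman@is.openmrs.org"))
    print(resolve_alias("unknown@is.openmrs.org"))
```

Because the directory is consulted on every lookup, a changed email or a deleted account takes effect without maintaining a separate alias file.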

cintiadr · June 29, 2023, 11:48am

I think this could work great. And I see no reason why we couldn’t change it later as well if that’s not working as we’d like.

Keycloak + openldap + postfix could be the simplest we can get away with.

grace · July 6, 2023, 7:58pm

The CyberSec lead of DHIS2 mentioned he has found KeyCloak to be less than ideal. I’ve reached out to him and CC’d you all to try and understand more details. FWIW both DHIS2 and OpenLMIS are using Google LDAP, and I notice that, like us, DHIS2 uses Jira and Discourse.

@jayasanka & @cintiadr apologies for the annoying late question but - have we looked into Google LDAP? Why not use that?

eg About the Secure LDAP service - Google Workspace Admin Help

jayasanka · July 7, 2023, 4:04pm

Hi @grace,

Thank you for bringing this up. However, LDAP is not a replacement for an identity provider. While Google does offer an identity solution, it is expensive for us. One possible solution is to request sponsorship from Google.

In the meantime, let me explain what LDAP is used for… @burke @cintiadr please correct me if I’m wrong.

LDAP stands for Lightweight Directory Access Protocol. In simple terms, you can think of LDAP as a way to organize and access information in a directory.

Imagine you have a large library with thousands of books. Each book has a unique number and is stored on a specific shelf. Now, think of LDAP as a system that helps you find and retrieve books from this library.

In LDAP, the directory is like the library, and it stores information in a structured way. This information can be things like user accounts, email addresses, phone numbers, or any other kind of data you want to organize.

Just like each book in the library has a unique number, in LDAP, each piece of information has a unique identifier called a “Distinguished Name” (DN). The DN helps you locate and access specific information within the directory.

To find information in LDAP, you use a client, which is like a librarian who helps you search for books. You provide the client with certain criteria, such as a person’s name or email address, and it searches the directory to find a match.

LDAP also supports a querying language called “LDAP query” or “LDAP filter.” This language allows you to specify more specific search criteria, like finding all users with a specific job title or belonging to a particular department.
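
As an illustration, here is a small Python sketch that builds such a filter string. The attribute names (objectClass, title, departmentNumber) are common LDAP schema attributes chosen as examples, not necessarily what the OpenMRS directory uses, and the helper names are made up.

```python
# Characters that must be escaped inside an LDAP filter value (RFC 4515).
_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\x00": r"\00",
}


def escape_filter_value(value: str) -> str:
    """Escape special characters so user input cannot alter the filter."""
    return "".join(_ESCAPES.get(ch, ch) for ch in value)


def and_filter(**conditions: str) -> str:
    """Build a filter that matches entries satisfying ALL given conditions."""
    parts = "".join(
        f"({attribute}={escape_filter_value(value)})"
        for attribute, value in conditions.items()
    )
    return f"(&{parts})"


if __name__ == "__main__":
    # All people with a given job title in a given department.
    print(and_filter(objectClass="inetOrgPerson", title="Developer", departmentNumber="42"))
```

The "&" prefix means "and"; "|" would mean "or" and "!" negates a condition.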

Once the client finds the information you’re looking for, it retrieves it and presents it to you. You can think of this as the librarian giving you the book you asked for.

LDAP is widely used in various systems, such as email servers, network authentication systems, and directory services. It provides a standardized way to store and retrieve information in a directory, making it easier for different systems to communicate and share data.

Overall, LDAP simplifies the process of organizing and accessing information in a directory, similar to how a library system helps you find books efficiently.

We can connect our LDAP, which stores all user information, to an identity provider. Since LDAP integrates well with other tools, we can easily integrate Postfix with LDAP for email forwarding, as explained by Burke.

cintiadr · July 12, 2023, 10:37am

Everything that @jayasanka said. LDAP is just a user storage system, like a mysql database works for an OpenMRS instance.

Which LDAP implementation we use is up to us, but the most complicated ones have a cost per user. I don’t think google suite is a good option for us, particularly because we want users to self serve (instead of us manually creating a user for each person who wants one).

That’s also why I don’t think okta would work for us. Auth0 would, but I’m not sure how much that would cost.

jayasanka · August 30, 2023, 6:36pm

Hello,

I wanted to share an update on the progress I have made in integrating Atlassian Cloud with our own SSO.

Let me demonstrate what I have done.

Suppose there is a user named “John” who needs to log in to the Atlassian cloud. John’s username is “john” and his email address is “john.cena@gmail.com”. However, as we require a verified domain email address, we have provided John with a virtual email address in the format @id.openmrs.org. In this case, John’s virtual email address would be john@id.openmrs.org.

John enters this email into the Atlassian cloud and clicks the login button. As we have configured the domain with our SSO, Atlassian redirects to our Keycloak instance with a SAML request. John can sign in using either his username or real email address. Once signed in, John will have successfully accessed the Atlassian cloud with his OpenMRS account. His name and email will be synced to his Atlassian cloud account, and any emails from Atlassian will be received by John at his private email address.

Well, that’s all John knows.

This is what happens behind the scenes. For ease of reference, I will break this down into two challenges.

Challenge #1: Sign in with SSO

Atlassian requires the user’s email address from the verified domain to be sent as the NameID. Keycloak, however, stores John’s primary (real) email address. While it is possible to store the user’s virtual email address as the email in Keycloak, doing so would disrupt other functionality such as the “forgot email” option, registration forms, Discourse integration, and potential future integrations. Keycloak allows us to use either the username or the email address as the NameID, but we cannot use the username because Atlassian only accepts an email address as the NameID.

Therefore, I wrote a small interceptor to modify the SAML response. Here’s how it works:

Keycloak generates a SAML response with the username as the NameID and signs it with its private key (in this scenario, “Private Key A”). When the Interceptor receives a SAML response from Keycloak, it first validates the signature with the public key A to confirm that the payload is genuine and has not been altered by anyone. It then modifies the payload to have the NameID with the virtual email and changes the NameID format to email.

Next, the Interceptor signs the new payload with Private Key B, which is stored within the Interceptor. It then sends the new SAML response to Atlassian. Atlassian validates the signature using the public key B stored in its configuration.
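
The verify, rewrite and re-sign order can be illustrated with a simplified Python sketch. This is not the interceptor's actual code: HMAC with shared secrets stands in for the XML signatures, whereas a real SAML response is signed with XML-DSig using asymmetric key pairs. The function names and the sample keys are made up.

```python
import hashlib
import hmac
import xml.etree.ElementTree as ET

SAML_NS = "urn:oasis:names:tc:SAML:2.0:assertion"
EMAIL_FORMAT = "urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress"
VIRTUAL_DOMAIN = "id.openmrs.org"


def sign(payload: bytes, key: bytes) -> str:
    """Stand-in signature. Real SAML uses XML-DSig, not HMAC."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def intercept(payload: bytes, signature: str, key_a: bytes, key_b: bytes) -> tuple[bytes, str]:
    """Verify Keycloak's signature, rewrite the NameID, re-sign for Atlassian."""
    # 1. Reject the payload unless it was signed with key A (Keycloak).
    if not hmac.compare_digest(sign(payload, key_a), signature):
        raise ValueError("signature check failed")

    # 2. Replace the username NameID with the virtual email address.
    root = ET.fromstring(payload)
    name_id = root.find(f".//{{{SAML_NS}}}NameID")
    if name_id is None or not name_id.text:
        raise ValueError("response has no NameID")
    name_id.text = f"{name_id.text}@{VIRTUAL_DOMAIN}"
    name_id.set("Format", EMAIL_FORMAT)
    new_payload = ET.tostring(root)

    # 3. Sign the modified payload with key B (trusted by Atlassian).
    return new_payload, sign(new_payload, key_b)


if __name__ == "__main__":
    sample = (
        f'<Assertion xmlns="{SAML_NS}"><Subject>'
        "<NameID>john</NameID></Subject></Assertion>"
    ).encode()
    key_a, key_b = b"keycloak-demo-key", b"interceptor-demo-key"
    rewritten, _ = intercept(sample, sign(sample, key_a), key_a, key_b)
    print(rewritten.decode())
```

The point of the second key is that the original signature no longer matches once the NameID changes, so Atlassian has to trust the interceptor's key rather than Keycloak's.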

P.S. I’ve made a slight improvement to the above flow. Check it out here:

Challenge #2: Sending Emails

In this scenario, as far as Atlassian is concerned, John’s email address is john@id.openmrs.org, which doesn’t exist in the real world. Whenever Atlassian wants to send John an email notification (e.g. for a ticket assignment), it sends the email to his virtual address, which fails to deliver. To resolve this, we need a way to forward these emails to John’s real email address. This is where Postfix comes in. Postfix allows us to write email forwarding rules using the following syntax:

source_email@example.com destination_email@example.org

These rules can be saved in a config file, and Postfix will forward any email that arrives at the source address to the destination address.

One problem with the above approach is that the rules have to be dynamic. For instance, users may change their email address or delete their OpenMRS account. In such scenarios, updating the configuration file can be relatively difficult.

To simplify and streamline the process, I synchronized all the users with an LDAP and configured Postfix to query the LDAP (thanks to Burke’s idea).

Whenever Postfix receives an email, it queries LDAP for a user whose username matches the local part (the part before the @) of the recipient address. It then retrieves that user’s real email address and forwards the email to it. This can also be configured to forward only emails sent by Atlassian, preventing users from abusing their virtual email addresses.
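
The sender restriction could look roughly like the following Python sketch, which would run before the LDAP lookup. The allowed domains are an assumption (the actual sender addresses Atlassian uses should be checked), and since a sender address can be spoofed, a real setup would likely also rely on SPF/DKIM checks.

```python
from email.utils import parseaddr

# Assumption: domains that Atlassian notifications are expected to come from.
ALLOWED_SENDER_DOMAINS = {"atlassian.net", "atlassian.com"}


def is_allowed_sender(sender: str) -> bool:
    """Return True if the sender's domain is an allowed domain or a subdomain of one."""
    _display_name, address = parseaddr(sender)
    domain = address.rpartition("@")[2].lower()
    return any(
        domain == allowed or domain.endswith("." + allowed)
        for allowed in ALLOWED_SENDER_DOMAINS
    )


if __name__ == "__main__":
    print(is_allowed_sender("Jira <jira@openmrs-test-cloud.atlassian.net>"))
    print(is_allowed_sender("someone@example.com"))
```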

[Image: an email sent by Atlassian]

Demo

You can try the demo with the following steps:

1. Visit https://openmrs-test-cloud.atlassian.net/
2. Enter john@jayasanka.me (or anything that ends with @jayasanka.me)
3. You will be redirected to keycloak
4. Enter username: john, password: 123123

To Create an account

Visit this link and create your account.

Source code

The source code and the docker configuration can be found here: https://github.com/jayasanka-sack/openmrs-keycloak-ldap

Note: The documentation is yet to be improved. However, you can refer to the tests to get an idea of how the interceptor works: https://github.com/jayasanka-sack/openmrs-keycloak-ldap/tree/main/interceptor/src/__tests__

Next steps

Verify email addresses when creating an account - I attempted to configure this feature, but was unable to do so successfully. I will further look into this.
Implement a “forgot password” option - Currently, when enabled, it sends an email with a link that logs the user into the Atlassian Cloud when clicked. This needs to be fixed.
Whitelist screens.
Write documentation for the source code.
Write a custom Keycloak plugin - The interceptor can be replaced by a custom plugin. I chose to write the interceptor because it seemed less complicated than writing a plugin that follows the SAML guidelines, and because the interceptor is completely independent of Keycloak versions. Keycloak doesn’t have any providers related to SAML NameID mapping. However, I recently discovered that a preview feature of Keycloak supports mapping NameID to custom user attributes. We can leverage this and write a plugin that extends the storage provider to introduce a virtual attribute for the user.

ibacher · August 30, 2023, 7:48pm

Does this process have to start from the Atlassian page or could we, for example, point issues.openmrs.org to the Keycloak server and have John login that way? I’m concerned that entering user@id.openmrs.org isn’t terribly intuitive…

grace · August 31, 2023, 3:13am

Thank you Jayasanka for this fantastic write up! The workflow diagrams also really helped me. Agreed with Ian’s suggestion - if possible it would be great to avoid forcing users to type the “@domain” after their omrs id.

@cintiadr @dkayiwa @raff @burke please have a look through this post and let us know within a week if you have any concerns, especially anything re security. Thanks!!

jayasanka · August 31, 2023, 10:32am

@ibacher, @grace, yes, that flow is annoying. I found a static and unique URL per identity provider directory in Atlassian workspaces that we can use: https://id.atlassian.com/login/saml/start?connection=saml-089fd650-3ed0-40ad-abec-95ec22d6dae9 Using a redirection would make the flow simpler.

ibacher · August 31, 2023, 12:42pm

Cool! I’ve got another thing to ask. I’m not really clear on how we map the NameID in Keycloak, but is it possible to tie it to specific LDAP attributes? In particular, there’s a standard LDAP attribute mailAlias which would be the perfect place to store, e.g., john@id.openmrs.org without using the mail attribute. If we can map that into the SAML assertion, that’s probably easier to maintain than a Keycloak plugin.

jayasanka · August 31, 2023, 1:57pm

In this setup, the username is mapped to the NameID.

Keycloak only supports either username or email to be used as NameID.

However, one of their preview features enables us to map NameID into a user attribute. What I couldn’t find is how to save this value to the mailAlias attribute when a user is being registered.

I considered saving this value from the Keycloak side, but I couldn’t find a way to do it with the admin UI. It seems to require writing a plugin.
I also considered doing this from the LDAP side with a virtual attribute, but that would require us to write a custom layer on top of LDAP.

What made sense to me is to leverage the preview feature, which enables us to map NameID to a custom user attribute, and write a small plugin that introduces a virtual attribute to produce this alias on the fly.

raff · September 1, 2023, 9:22am

@jayasanka thanks for the detailed explanation on the setup!

I also notice that when I log out from atlassian, the logout action is not passed to keycloak, so the next time I visit atlassian I’m automatically logged back in via keycloak. It can be considered a security risk. It’s an oauth feature to call a certain endpoint when logging out so it should be straightforward to fix.

BTW I see that e.g. hibernate managed to solve this “approved domains only” issue somehow: whenever I try to login with an e-mail from a different domain, I see a message that Hibernate has approved it and allows me to use an e-mail from any domain. They don’t seem to be using their own SSO service though. They seem to be using the “Any domains” setting as described at Control how users get access to products | Atlassian Support

raff · September 1, 2023, 10:19am

Reading up further it seems that our goal was to limit access to our JIRA only to users signed up with our SSO and not with e.g. Atlassian, Google and others, thus the issue with non approved domains. Is that right?

I think this one is interesting https://jira.atlassian.com/browse/ACCESS-1362, because we could allow users from “any domains”, but they would still be asked to register with our SSO provider. The feature is expected to be rolled out in Q1 2024. Not sure what our timeline for migration is though.

jayasanka · September 1, 2023, 4:16pm

Thanks @raff for the reply!

Yes that’s the goal. Another reason was limitations on migrating existing tickets (+ wiki). Atlassian Cloud won’t allow us to create users from unverified domains when importing our existing data as burke explained here:

In this use case, Atlassian acts as a Service Provider (SP) and OpenMRS ID (Keycloak) acts as the Identity Provider (IdP). Atlassian does not support performing a logout action on SP. The reason might be that the SAML protocol does not inherently work in this direction for Single Logout (SLO). Instead, the typical flow is that signing out from the IdP would trigger a logout from all associated SPs, ensuring consistency.

Keycloak supports the SLO mechanism, which allows for logging out from OpenMRS ID to automatically log you out from Atlassian, Talk, and all other connected services.

One mitigation for this issue would be to limit the Keycloak session idle time to a very short period, such as 1 minute. What do you think?

jayasanka · September 1, 2023, 4:22pm

Update:

I’ve configured the email verification and forgot password options properly. They should work now. We can update templates too.