Management and access levels of cohorts (patient lists)

Hi all @dev5,

@ibacher @jdick and I have been brainstorming on Slack (here) about the least expensive way to implement access levels for cohorts. This post is an early proposal, please fire and let’s find a working consensus.

First of all there are 4 access levels that we have identified:

  1. View
  2. Edit (Ability to add/remove patients to/from the cohort.)
  3. Share (Ability to add/remove managing users of the cohort.)
  4. Delete (Ability to void the cohort.)

The question is how do we associate a cohort with its managing users?
We have come up with two separate cases that would be handled differently:

  1. Associating a single user;
  2. Associating a collection of users.

1. Associating a single user with a cohort - Through user properties

We are suggesting to do this through specially named user properties.
Example: user jdoe can edit and share cohort ID’d f4bca462-76d1-446b-8369-f8dab42e19b3 because the following user property exists for jdoe:

{
  key: "org.openmrs.cohort.f4bca462-76d1-446b-8369-f8dab42e19b3",
  value: "E+S"    
}

2. Associating a collection of users with a cohort - Through cohort attributes

Let’s take the user role example. Imagine the use case that all users sharing the role Covid-19 Contact Tracer can edit and delete the cohort ID’d f4bca462-76d1-446b-8369-f8dab42e19b3.

This would be done through a cohort attribute whose value would be structured like this:

{
  strategy: "org.openmrs.cohort.RoleMappingStrategy",
  entity: "a3265110-b90b-474e-8f4b-c5a8fdf4e3d1",
  access: "E+D"
}

This delegates the work to a strategy bean that takes an entity UUID argument. RoleMappingStrategy will assume that the provided UUID is that of a role, and would fetch and return all the users with that role.

If the use case was that all users logged at a certain location should view the above cohort, then we would have to come up with something like:

{
  strategy: "org.openmrs.cohort.SessionLocationMappingStrategy",
  entity: "9a377cba-dfbb-443f-887e-9f5bb04fd4ef",
  access: "V"
}

… and so on and so forth. There would be an interface to implement that would return a list of users from a UUID:

public interface MappingStrategy {

  List<User> getUsers(String uuid);

}

Both the user properties and the cohort attributes would contain all the necessary information to implement a service layer and REST API that can answer two sets of basic requests and their variations:

  1. Given a user, return all the cohorts and the user’s access levels for those cohorts.
  2. Given a cohort, return all the users and their access levels for that cohort.

Cc @corneliouzbett @jecihjoy

1 Like

I am just thinking out aloud

  1. I foresee alot of complexity handling the storage of permissions both at user level and cohort level - IMO all permissions should be stored on the cohort definition

  2. I would suggest not having user permissions at all, rather to this through Roles, which will simplify management

  3. For the keys, I would not keep E+S, but rather bits for each permissions something like below

{
  key: "org.openmrs.cohort.f4bca462-76d1-446b-8369-f8dab42e19b3",
  role: "a3265110-b90b-474e-8f4b-c5a8fdf4e3d1"
  permisions: {
    "E":"true",
    "V":"true",
     "D":"false",
     "S":"false"
  }
}
  1. Following above if not specified, then false is assumed (principle of least privilege)
  2. This will probably remove the need for the share permission - which requires managing users at the role level and abstracts that away
  3. I know this is for patient lists, but can this be extended to manage patient records too, Obs etc (I know it is a stretch), but why not
  4. Leveraging UNIX permissions can we have - owner + group + staff to map to Cohort creator, group (role), staff (everyone else) and by default all Cohorts are visible only to the owner (principle of least privilege)
  5. Using UNIX again with role based access we only need 3 sets of actions, read (View), write (Edit), execute (Delete) - though would be part of write

This would be abstracted away by a single API living in openmrs-module-cohort. So yes under the hood there is some complexity because

  • single user associations are maintained at the user level, and
  • user collections associations are maintained at the cohort level,

but at the end of the day the API would obfuscate that.

IMO the complexity rather comes from the fact that a mechanism to prevent users to meddle with their org.openmrs.cohort.* properties will have to be put in place, yet again in openmrs-module-cohort. Doable, but one more thing to do.

Assuming that there is a use case for a user to share a list with other users that are picked one by one, regardless of any other factors (such as their roles for example), then this requirement must be addressed.

Many systems allow both. Think of Jira for instance, access can be granted on a per user basis and on a per user group basis (and also on a per role basis, known as project role in Jira). That is actually the proper way to go IMO. However for the time being we are shying away from modelling user groups as it is a nice to have and could be addressed later.

The suggested approach goes even beyond this, since it provides a mechanism for single users on one hand and for any possible way to define a collection of users on the other hand (through those mapping strategies). This idea really stems from the two use cases that have been presented to me: 1) the collection of users having a given role and 2) the collection of users operating at a certain location. From there this can extended at will to any way to define a collection of users, they just need to share a common characteristic that the corresponding mapping strategy would use to fetch the users.

If there is a resounding voice to say that the only useful requirement is to associate cohorts with roles, then so be it, and this greatly simplifies the problem. From a functional standpoint this will come as a pretty big limitation IMO, this removes a lot of granularity when sharing cohorts and would be at odds with typical patterns known out there (Un*x systems, Jira, Google Drive…).
My understanding is that it is required to pick the users to share a cohort with, and even that the role-based association was coming second, or at best was equally required. @jdick?


I don’t mean to ignore the other points but they delve into details that can be addressed later. All code is presented as pseudo-code to show examples of how things could be done in principle (not how they should be done or named in practise.)

Except by the way… for expanding this to anything else (Obs, … etc), that is definitely out of scope and there are workarounds (the Data Filter module for example).

Random drive-by comment, since I happened to open talk today…

I have the same reaction as Steven just storing this data in two places. It adds implementation complexity, and it guarantees your query performance is the slower of querying all users and querying all cohorts.

(I guess there is some reason why you are not considering just storing the data in a table.)

PS- I agree that Roles are wrong, those are meant to group permissions. And I also think letting users share a cohort with an ad hoc list of users gives more MVP value than sharing with “named lists” of users.

To give some background, the original requirement leading to using user properties was just that: associating single users with a cohort. And that was “fine”, performance wise as well I think.

The mapping strategy idea was added after when uses cases like “all users sharing a role should be associated with the cohort”, or “all users logged at the same location should be associated with the cohort”… etc.

What I thought from the outset is that we’d need user groups to manipulate collections of users, but we do not have the space to go into that at this point. However if they existed, then they could be defined off any criteria:

  • Let’s build a group of users sharing a role, and associate that collection with a cohort.
  • Let’s build a group of users based on xyz criterion, and associate that collection with a cohort,

Again, we need precise use cases to adopt a design that is appropriate.

The more expensive way would be to bring up a new user_access table (staged in openmrs-module-cohort?) Something holding such records:

{
  user_id: 123,
  is_group: "false",
  entity_id: 987,
  entity_type: "org.openmrs.module.cohort.CohortM",
  access: "E+S"
}

(On a side note, where is this naming is coming from?? “CohortM”?)

@ssmusoke this kind of table could eventually be used to handle any kind of access by specifying other entity types than CohortM.

Later on we would be able to have entries where is_group: "true", meaning that the user_id would not point to a org.openmrs.User but to a org.openmrs.UserGroup that is yet to be introduced.

Alternatively a second user_group_access table could be added specifically for groups.

All the use cases discussed above (“all users sharing a role”, “all users logged in at a certain location”, … etc) would be abstracted behind user groups through mechanisms that we would develop when introducing the new UserGroup entity. For instance there could be plain groups where the member users are added and removed manually, and virtual groups where a strategy fills the group dynamically based on a business rule (such as “all users sharing a role” or “all users logged in at a certain location”).

Though it can technically work, using user properties for this makes the design look complicated and hacky. I believe that user properties were intended for simpler values that do not carry as much meaning as for this use case. :slight_smile:

Using dedicated tables looks simpler and more logical/appropriate. I do not even think that it would be more expensive (assuming expensive means performance). Although it is not bad to generally think about the performance of the proposed solution, i would not let it get in the way of a simpler design until when i actually prove, via tests or other means, that the performance is not acceptable.

To make it even simpler and take advantage of automatic database referential integrity checks, i would use separate tables for users and groups. Though using the entity_id and entity_type fields makes it more generic, i would not do it. Instead, i would let these tables deal with only cohorts and put them strictly in the cohort module. The generic entity thing will make more sense later on if this works out well in the cohort module and we find ourselves needing to move it to core and generalise it for other entities besides cohorts.

I would also ensure that this access data storage structure does not leak in the API, such that the clients don’t get affected, in case we change it later on.

My initial reaction is similar to @dkayiwa and others - that we’d be better off with dedicated tables designed with this in mind. I’m more reluctant about using user properties for this than cohort attributes, as it’s not clear to me that we’d want to allow users to be able to maintain their own levels of cohort access and user properties could definitely be used to store additional user information that one might expect to be self-maintained on the “My Account” or “My Profile” page within the application.

@mksd I think you will end up in the same area as Roles - so I would suggest extending that functionality without introducing something new

You can even do something like, whenever a Cohort is created a Role is automatically created with it, to enable users to be added to it

In more complex systems you have hundreds of roles building on each other - which is the simplest way to go since the purpose of UserGroup is access control which the Roles can do

IMO this will lead to the next iteration of roles which is a frequently asked feature for OpenMRS which is restricting access to a user’s medical record to certain users only (this will increase complexity of the Cohort access permissions)

@ssmusoke I would echo @darius that there is a confusion between roles and user groups, and I think we should get this right. A role is not a collection of users, it’s a collection of privileges. However those roles being distributed to users make us think of them as collections of users, but this is a second degree effect.

Personally I believe that if we want to get things right and avoid making contorsions in the future, we should at least anticipate that a new UserGroup entity will exist one day.

@mseaton I personally would prefer a dedicated table then, rather than using cohort attributes, because this could become the seed (yet limited to cohorts for now as @dkayiwa suggested) for a more general access level management pattern that could be expanded to other entities.


@dkayiwa by inexpensive I meant “as little work as possible / out of the box”. When I was asked if there was a way to store a single user access to a cohort with the existing data model and API, I said that yes this could be done through user properties right away.

After multiple community discussions today (TAC and the O3 design call), we have settled on introducing a generic entity access table that will hold such records:

{
  principal_id: 123,
  principal_type: "org.openmrs.Role",
  resource_id : 987,
  resource_type: "org.openmrs.module.cohort.CohortM",
  access: "E+S"
}

This principal-to-resource mapping table will allow to 1) cover every possible scenario and 2) prepare us to fully adopt Spring Security (@ibacher to expand on this after his spike). Do not get bogged down by the details, such as the access string format, all this can be adapted to be more readable.

While the principal’s type can be anything, we will likely only use it with org.openmrs.User and org.openmrs.Role for the time being.


@burke @ibacher I still think that in the general case (so when the principal is not just a User) it is not sufficient to provide a reference to the principal and a reference to the resource, it is also necessary to specify a strategy to go from the principal to a collection of users. It is obvious what that one would do but in the above example there should be a RoleMappingStrategy, something like that:

class RoleMappingStrategy implements MappingStrategy<Role> {

  public List<User> getUsers(Role principal) {
    // returns all the users with the role `principal`
  }

}

Of course there is a lot to be scrapped for a first stab since we will likely limit ourselves to supporting User and Role principal types, and therefore we may just hardcode two mapping strategies without putting together a more complicated architecture.

I would just add that “generic entity access” is a tricky topic with lots of prior art, presumably including academic papers, and OpenMRS is not trying to do anything special that hundreds of systems haven’t done before. So, +1 for plugging into existing tooling for this, and it would be even more worthwhile for someone to search the literature more than usual (because this is more of a “typical” problem than what OpenMRS usually faces).

If you apply this wider than just cohorts, it probably eats up a huge amount of computational power, and there are lots of implementation details around caching.

In that spirit, @mksd, for your “logged in at location” example, my inclination is:

  • authorizing a static user, role, or user group, should be the common case, and optimize for these. Particularly, don’t force them through a Strategy pattern that assumes arbitrary code.
  • instead, support some kind of “dynamic query” use case, but try to make it a less-common one, and assume it’s going to be less performant.
1 Like

Thanks @darius for your input, it all resonates with what’s been discussed. Let’s hope that there is a reasonable path to using Spring Security tooling, hopefully leading to a reasonable path down the line for using Spring Security altogether within OpenMRS.

That makes a lot of sense, and I think is aligned with what I had in mind.