GSOC 2019: OpenMRS Atlas 3.1

Same here. I think we may have a clue.

I made the requested changes for ATLAS-168. :slight_smile: I also added the tables in the PR to atlas-stg for testing.

Merged!

Are we ready to request others to test atlas-stg? Can anyone log into atlas-stg or is it connected to a staging ldap? If authentication is limited, perhaps we could make a test account (even temporarily) for people to use in testing. I was thinking we could direct them to a feedback link like this. :slight_smile:

I may have messed up the PR a little. Some remnants from an earlier version of the ticket’s plan were still in the PR when it was merged; I didn’t notice them until after the merge. I’ll clean them up in a few minutes and let you know.

Fixed! Sorry about that. I guess we’re ready to request some testing. :slight_smile: A friend of mine was able to log into atlas-stg, so I think anyone should be able to log in.

Awesome! Try hopping into our IRC channel or Slack and see if you can convince a few people to try out atlas-stg.openmrs.org, kick the tires, and give you some feedback. Let ’em know they can turn off fading in the menu and, if they happen to have an old marker on atlas-stg, they should still be able to edit it. Some specific folks (you may not find them on IRC or Slack but they might chime in here or you could try sending them a DM in Talk) who might be interested in trying it out: @ball, @akanter, and @ahabib. Others who I’m sure would be happy to help try it out: @jennifer, @dkayiwa, and @c.antwi.

Be respectful of their time and feel free to throw my name in if it helps (e.g., Burke and I were hoping if you get a chance, you could browse to atlas-stg.openmrs.org, edit an old marker or add a new one, and let us know if you see any issues). Also, make sure they know to test out atlas-stg.openmrs.org (not atlas.openmrs.org). :slight_smile:

FYI – I restored the date_changed values on atlas-stg, so it has faded markers again. For anyone with an existing, faded marker, this will give them the thrill of clicking on the update link and seeing their marker come back to life. :+1:

If you can manage to get 3-5 or more people to test it out without significant issues, we’ll promote it to production and make an announcement to the community that Atlas is back, baby!

@burke @harsha89 @cintiadr I created a talk post for it and also posted it on IRC. :slight_smile:

While testing continues, I’d like to start working on the Atlas module. According to my understanding, this is how things should work:

  • The module sends a POST request to the server, containing a uuid, which is the module’s id.
  • If this id isn’t found in some “module” table, we will store this id in the table.
  • If there are no errors, the server will indicate it, the module starts running, and the server session is now in “module mode”.
  • Any marker created from within this module mode will also have an auth entry with the module’s id as the token.

Some questions I have are:

  • How do we know that the POST containing the uuid actually comes from a module?
  • If multiple sites were created in module mode in the same openmrs instance, do we update all these sites with the same data?

The atlas module never creates a marker; rather, it is entrusted with sending updates for a marker created by a user. Also, the module can only be linked to a single marker, since the purpose of the marker is to represent a single OpenMRS server (i.e., a marker on the atlas for the server it’s running on).

This is based on some dusty recollection, so don’t be surprised if you find discrepancies between my description here and the code.

In its first iteration, the module was responsible for creating the marker, but as we improved the Atlas server, we realized that 90% of the marker management (even using the module) could be done through the server’s web interface with a much lighter interaction with the module. Now the module loads the Atlas server web page in an iframe, passing a parameter to let the server know it’s running inside the module.

  • The user authenticates to Atlas server (the usual way) inside the iframe
  • In module mode, the server allows the user either to link an existing marker or create a marker (only one) that is linked to the OpenMRS instance (i.e., the module)
  • When the user either links an existing marker or creates a new one, the module sends the server a unique token (a uuid). We used to use this as the uuid for the marker (which is why we were hiding them). Now, this token sent from the module would be stored as the auth token. It’s a “secret” that only the module & server know, which is how the server can trust a ping from the module (it’s sent with the auth token… ideally in the body of a POST and not in the URL, which risks exposing the secret).
  • When linking a marker to the module (whether an existing marker or a newly created marker), the user is given the chance to opt-out of sending additional data to the server (e.g., patient/encounter/obs counts, server info, running modules & versions).

Once the module is linked to a marker, all the module does is issue a weekly ping that updates the marker (keeps it from fading) and optionally send the additional data. The current module probably does an HTTP GET with the module’s UUID and an extra secret phrase (SESSION_SECRET). We’d want to switch this to a POST with the module’s UUID (as an auth token) inside the body (not exposed in the URL) and we could get rid of the SESSION_SECRET since it’s not needed.
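
Concretely, the server side of that ping could end up looking something like the sketch below (the route path, table names, and column names are placeholders I’m assuming, not the actual Atlas schema):

// Sketch only: `app` is the Express app with JSON body parsing enabled,
// e.g. const app = require('express')(); app.use(require('express').json());
// and `db` is a mysql connection from require('mysql').createConnection(...).

// Weekly ping from the module: the token identifies the linked marker.
app.post('/ping', (req, res) => {
  const { token, data } = req.body; // token = module's uuid, data = optional extra stats

  db.query('SELECT atlas_id FROM auth WHERE token = ?', [token], (err, rows) => {
    if (err) return res.status(500).json({ error: 'Database error' });
    if (!rows.length) return res.status(401).json({ error: 'Unknown module token' });

    // Touch the marker so it stops fading; `data`, if present, could be stored as well.
    db.query('UPDATE atlas SET date_changed = NOW() WHERE id = ?', [rows[0].atlas_id], (err2) => {
      if (err2) return res.status(500).json({ error: 'Database error' });
      res.json({ updated: true });
    });
  });
});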

Here’s how I linked the module with the Atlas server (I assume you already have the Atlas server running):

  • Clone the openmrs-module-atlas repo.
  • Go to AtlasConstants.java, and change the value of SERVER_URL to the URL of your local server. You could also change SERVER_PING_URL to redirect pings to your server, but the ping route doesn’t really work.
  • Build the module using mvn clean install, load it into OpenMRS, and open the ‘Manage Atlas Markers’ page to see your module’s iframe connected to your server.

@heliostrike,

Some more thoughts on the API for the module…

When the module first runs, it should auto-generate and save an internal uuid that will serve as its token for the Atlas. This is the secret we don’t want to expose in urls and what we’ll store in the auth table for a module.

The module starts with:

POST /module
{
  "token": "module-generated-uuid"
}

and server responds with either:

HTTP 200 OK
[
  { /* linked marker here */ }
]

or, if no marker has been linked:

HTTP 200 OK
[ ]
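
On the server, that handshake could be handled along these lines (a sketch only; the auth/atlas tables and column names are assumptions on my part):

// Sketch only: `app` is the Express app with JSON body parsing, `db` a mysql connection.
// Startup handshake from the module: return the linked marker, or [] if none.
app.post('/module', (req, res) => {
  const token = req.body.token;
  if (!token) return res.status(400).json({ error: 'Missing token' });

  db.query(
    'SELECT m.* FROM atlas m JOIN auth a ON a.atlas_id = m.id WHERE a.token = ?',
    [token],
    (err, rows) => {
      if (err) return res.status(500).json({ error: 'Database error' });
      res.json(rows); // [] when no marker has been linked yet
    }
  );
});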

Then the module loads the iframe with src /?module=true (while we could pass the linked marker id here, let’s not, because the user can change the linked marker while interacting with the iframe). The following behavior applies only to Atlas Server when it’s in this “module mode.”

If a user isn’t authenticated, the OpenMRS Atlas redirects them to /login.

When the Atlas website finishes loading in module mode and a user is authenticated, it would ask the module (through the iframe) for its linked marker and, if one exists, it would set focus to the linked marker (just as if we loaded the atlas website with /?id={id} for that marker).

I believe this code will already show a “This is not me” link on a linked marker and show a “This is me” link on any other markers the authenticated user can edit, which is the behavior we want.

If a marker is linked, then the user shouldn’t see the “Create a marker” option in the menu.

If we haven’t yet linked a marker, the user would see the “Create a marker” option and creating a marker would automatically link it to the module.

Linking a marker to the module should store the module’s token in the auth table for that marker. Likewise, unlinking a marker from the module should remove this auth rule to ensure a module is never linked to more than one marker. In essence, the auth rule represents the link between module & marker.
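
In code, linking and unlinking then reduce to inserting or deleting that auth row, roughly like this (table and column names are assumed):

// Sketch only: `db` is a mysql connection. Linking first removes any previous
// link for this token so a module never points at more than one marker.
function linkMarkerToModule(db, markerId, moduleToken, done) {
  db.query('DELETE FROM auth WHERE token = ?', [moduleToken], (err) => {
    if (err) return done(err);
    db.query('INSERT INTO auth (atlas_id, token) VALUES (?, ?)',
      [markerId, moduleToken], done);
  });
}

// Unlinking drops the auth row, which severs the module/marker relationship.
function unlinkMarkerFromModule(db, moduleToken, done) {
  db.query('DELETE FROM auth WHERE token = ?', [moduleToken], done);
}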

I started working on this issue. I could connect the module to the atlas server using a POST, and save the module-id and marker data on the module page session, but the module page and iframe are running on different sessions. We’ll need the module-id and marker-id in the iframe session to get things running. I’ve been searching for a solution for the last hour but I think I’m stuck. How do we solve this?

Do you have communication between the iframe and parent set up (e.g., using cross-document messaging)?

Glancing at the module source, I see it uses DWR, which was deprecated years ago. Ugh.

I’m unfamiliar with cross-document messaging, will have to look it up. Thanks for the lead. :slight_smile:

“Node js sharing sessions across domains” I’ve been googling the wrong thing the whole day!

Yes. The messaging will all happen client side via javascript – e.g., when Atlas finishes loading (within the iframe in “module mode”) it fires an event to notify the parent (module’s web page), which can turn around and provide the info needed.
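
With the standard window.postMessage API, the exchange could look roughly like the sketch below (the event names, payload fields, and helper names are made up for illustration):

// Inside the iframe (Atlas running in module mode): announce readiness, then
// wait for the module's info. In production, check event.origin instead of '*'.
window.parent.postMessage({ type: 'atlasReady' }, '*');
window.addEventListener('message', (event) => {
  if (event.data && event.data.type === 'moduleInfo') {
    // hypothetical helper that switches the Atlas UI into module mode
    enterModuleMode(event.data.moduleId, event.data.hasSite);
  }
});

// On the module's page (the parent): answer the iframe's ready event with the
// module id and whether a marker is already linked. moduleUuid, hasLinkedMarker,
// and atlasServerUrl are placeholders for values the module page already knows.
window.addEventListener('message', (event) => {
  if (event.data && event.data.type === 'atlasReady') {
    document.getElementById('atlas-iframe').contentWindow.postMessage(
      { type: 'moduleInfo', moduleId: moduleUuid, hasSite: hasLinkedMarker },
      atlasServerUrl);
  }
});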

Unfortunately, the web pages for the module (unless there’s a branch other than master with newer code) appear to be legacy code and may need to be rewritten for newer versions of OpenMRS.

Sorry about the late update. I made 2 PRs to make the atlas module work with atlas.

Atlas PR: Created module mode. ‘Module mode’ is stored in the session. When it’s switched on, the iframe expects the module ID and ‘has_site’ from the parent window, and activates module mode. Here users can attach or detach modules and markers.

Atlas Module PR: Set up cross-window communication. When sending data is enabled, it sends POST requests to update markers.

If you’d like to test the atlas module, change the server url here and maybe reduce the POST data interval here to test whether POST is working properly.

While the atlas module PRs are in the oven, I’d like to take up the search-for-markers ticket. I was curious about how we’d search for markers: do we return markers where the search phrase is a substring of the marker’s name, or only where it’s a prefix of the name, or are there other criteria we need to consider? What do you think?
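
To make the options concrete, the difference is basically where the wildcard goes in the query (a sketch; I’m guessing at the table/column names):

// Substring match: "clinic" would match both "Eldoret Clinic" and "Clinica Central".
db.query('SELECT id, name FROM atlas WHERE name LIKE ?', ['%' + term + '%'], callback);

// Prefix match: "clinic" would match "Clinica Central" only.
db.query('SELECT id, name FROM atlas WHERE name LIKE ?', [term + '%'], callback);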

I’ve made PRs for most of the newly created tickets, but I have questions about a few of them.

  • ATLAS-206 - Would we want to create a GET to retrieve details about usage of all modules? I think it would help with ATLAS-207.

  • ATLAS-203 - It looks like most of the console.log(error); calls are being used to log MySQL query errors. What should I do in these cases?

I also updated the module PRs (ATLAS-194, ATLAS-195) to work with each other, and created a PR for searching markers. They are unrefined, but I think the project objectives are almost complete. :slight_smile:

Yes. The intent is to provide a resource through the API to deliver the data needed to produce reports/graphs in the client (browser). If in the future we need to protect the server from servicing too many calls, we could cache the data (either in an internal table or, more likely, just using nginx caching for the resource).

The main point is to inform the client/user when an error has occurred. If there are SQL failures, then “Unable to retrieve xyz from the database” or simply “An unexpected database error occurred. Please ask an administrator to check the atlas logs at YYYY-MM-DD HH:MM:SS for more information” may suffice. What currently happens if you force one of those SQL calls to fail? I’m assuming the HTTP request will either hang and time out, fail with some default Express failure message, or – worst case – return 200 OK without the expected result. I haven’t looked at them all, but I wouldn’t be surprised if there are situations where subsequent code gets executed inappropriately because we don’t return after logging these errors, and many are nested several layers deep in if/then/else conditions where it isn’t easy to see whether the console.log line will be the last line executed in the method.
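
In other words, each of those query callbacks should bail out with an error response rather than only logging, along these lines (a sketch, not the actual Atlas code):

// Sketch only: `app` is the Express app, `db` a mysql connection.
app.get('/markers', (req, res) => {
  db.query('SELECT * FROM atlas', (err, rows) => {
    if (err) {
      console.error(err); // keep the details in the server logs
      // Tell the client something went wrong, and return so no further code runs.
      return res.status(500).json({
        error: 'An unexpected database error occurred. Please ask an administrator to check the Atlas logs.'
      });
    }
    res.json(rows);
  });
});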

Updated ATLAS-206. :slight_smile:

I updated the PR with samples of the JSON being returned. Are they OK?

Aye! I was just a little confused about the "A useful message that would allow the client to understand what exactly failed and what they might do to fix it" bit. I thought we were supposed to check for all possible input errors the user might’ve made (if statement bad-place). I also updated ATLAS-202. I made a primitive commit for the former ticket, but it’d be easier to make a PR (and avoid conflicts) once ATLAS-202 gets merged, since all the console.logs change to logger.somethings.

I got a mail recently about GSoC mentioning that I may eventually need to submit a project page explaining my work. What do you think I should submit: this Talk page, the project page (I’m assuming it’s this one), or perhaps my blog?

We will want the API to supply data for a few different reports in the future, so don’t use /api/report as a resource itself. Rather, use /api/report/modules as the resource for getting this “modules” report data.

In the JSON, don’t use module names as keys. Instead of entries like:

 "Reporting": {
    "versions": {
      "1.17.0": 3
    }
  }

format them like:

  {
    "id": "reporting",
    "name": "Reporting",
    "versions": {
      "1.17.0": 3
    }
  }

We want these to be an array of modules, so the first & last characters would be [ and ].
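
So the full response would look something like this (the values are made-up examples):

[
  {
    "id": "reporting",
    "name": "Reporting",
    "versions": {
      "1.17.0": 3,
      "1.16.0": 1
    }
  },
  {
    "id": "webservices.rest",
    "name": "Webservices REST",
    "versions": {
      "2.24.0": 5
    }
  }
]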

FYI – You should only be including the “active” modules in these counts.

I wouldn’t use a path like GET /api/report/Atlas Module. First off, it’s not a valid path. You’d have to use something like GET /api/report/Atlas+Module or GET /api/report/Atlas%20Module to make it valid. Also, it’s better to use IDs instead of names for identifying resources. And, as I mentioned earlier, I’d like the resources under /api/report/* to represent different reports. We’ll start with modules, but we could have other reports like “activity” to report on Atlas marker activity (though we have a nifty RSS feed for that), “usage” to provide some data on Atlas website usage, etc. (I’m just making these up as examples… for now, all we need is a “modules” report). If we want to support filtering to a specific module, we could either use GET /api/report/module/:module_id (i.e., GET /api/report/module/reporting) or GET /api/report/module?id=:module_id. I think the first one GET /api/report/module/:module_id is more RESTful, especially since the “module” resource would return an array of results and adding the module ID to the path would filter to that one result. But filtering to a specific module isn’t really needed, since it will actually be easier on the server to just provide the full set of data all at once rather than handling individual requests for each module.

I would suggest using the short link to your wiki project page – i.e., https://wiki.openmrs.org/x/I4DYCg, since that page should be updated and contain links to your blog and this discussion.

By the way, I tried deploying some of your latest changes to staging and ran into a couple of problems. First, the addition of a Google Analytics tracking ID and our change to the API path requiring an update to the health check in docker-compose mean we need to redeploy docker-compose for atlas. I reached out to @cintiadr on our Telegram infra channel to ask how to do this properly. Also, atlas on staging is now crashing whenever you try to log in. It appears to be reporting a certificate error as it tries to authenticate via ldap:

Error: certificate has expired
    at TLSSocket.<anonymous> (_tls_wrap.js:1116:38)
    at emitNone (events.js:106:13)
    at TLSSocket.emit (events.js:208:7)
    at TLSSocket._finishInit (_tls_wrap.js:643:8)
    at TLSWrap.ssl.onhandshakedone (_tls_wrap.js:473:38)

I don’t know if this is a bug we’ve introduced or if there’s something misconfigured with ldap staging. https://ldap-stg.openmrs.org/ gives an nginx gateway error, but it may always have done that (since we are using ldaps & not https) and, despite that error, the TLS certificate for ldap-stg doesn’t appear to have expired.

It’s nearly 2am here and I’m losing consciousness involuntarily, so will look to you and/or @cintia to lend a hand or I’ll try to get this sorted tomorrow or over the weekend.

Update: I did a little more digging, was able to recreate the ldap authentication failure locally, found this tip to add userClient.on('error', function(err) { ... }); to capture the specific ldap error, which looks like this:

{ Error: getaddrinfo ENOTFOUND openldap openldap:389
    at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:56:26)
  errno: 'ENOTFOUND',
  code: 'ENOTFOUND',
  syscall: 'getaddrinfo',
  hostname: 'openldap',
  host: 'openldap',
  port: 389
}

Which, unfortunately, doesn’t immediately point me to a solution. It looks as if some new defect in atlas is keeping it from finding the ldap server. Hopefully, @heliostrike, you can recreate the error yourself and maybe figure out what we did to introduce this problem.

Update 2 (now I really need some sleep): it looks like my local issues were related to a change to networks @cintiadr made to the local docker-compose for atlas. When I removed her network changes, my local version of atlas worked again. I’m not sure (but suspicious) if those changes might also be preventing atlas-stg from seeing ldap.

I forgot that a JSON response could be an array, which is why I had structured the API that way (._. ; ). I blame the night for that! I updated ATLAS-206 accordingly and updated the sample responses. :slight_smile:

My expertise on troubleshooting server errors doesn’t extend much beyond google searches, but I’ll take a look around.

Updates:

  • Googling didn’t give me any pointers I could use. Deploying a previous commit to staging is taking an unusual amount of time. Usually it takes about 30s, but now it’s been running for 10 min with the following log repeating:
09-Aug-2019 10:07:07	Verifying if docker containers 134cfa1b757929688c113119ee9fe11ee62312a0dc817068cbf90127d7c52174
09-Aug-2019 10:07:07	cc643a1cbab3cf207dd747515632f60c73179d52fb0a08581ee27559f14d6c6a
09-Aug-2019 10:07:07	6a03dc245df60a7b1a4424bc741a4d51d4744fc682ca0d0c33befb9842d75e22 are healthy
09-Aug-2019 10:07:07	Status found: "healthy" "unhealthy"

Finally, the deployment failed. :confused:

  • @cintiadr suggested destroying and restarting ldap-stg, as its certificate has expired.