demo server not responding

gcliff · December 4, 2019, 2:07pm

Hello,

The demo server seems to be down.

sharif · December 4, 2019, 2:21pm

Indeed, I tried different web browsers, but they all show a 404 Page Not Found. I am not sure whether the problem is related to this: https://ci.openmrs.org/allPlans.action cc @dkayiwa

sharif · December 5, 2019, 3:01pm

The demo server is OK now. @gcliff, I think you have already tested it.

rishabh997 · May 19, 2020, 8:26am

The demo server has been down for 3 days. Is there a way I can reset it to get it working again, such as a step-by-step guide? Also, do I need some kind of access for that? cc: @dkayiwa @ssmusoke @suthagar23

sharif · May 19, 2020, 8:29am

In the meantime, you can use https://qa-refapp.openmrs.org/openmrs/login.htm cc @cintiadr

rishabh997 · May 19, 2020, 8:35am

Thanks, this will work.

cintiadr · May 19, 2020, 12:13pm

Indeed, demo was down for a long time:

http://stats.pingdom.com/lybqzkdtbtzm/2532083/2020/05

I have no idea what the problem was. I checked the logs and everything looked fine; it worked locally, and there were no differences whatsoever in the logs.

So I just restarted the Docker daemon because I was clueless, and that seems to have done the trick.

rishabh997 · May 27, 2020, 9:20pm

The demo server went down again a few hours later and has been down since. The qa-refapp server also went down yesterday. Can you look into it, @cintiadr? Thanks.

cintiadr · May 28, 2020, 11:20am

I’m sorry, I don’t have the time to actually investigate. There’s something wrong with either the machine or the hypervisor running it. Some of the instances are constantly becoming unhealthy (same issue with qa-refapp).

I disabled uat-refapp to see if that calms the thing down.

sharif · May 28, 2020, 11:31am

Thanks @cintiadr. We have been having this kind of issue, especially in the testing environments. We hope for the best.

burke · May 28, 2020, 4:19pm

Strange. I noticed demo was down with the 404 again. I hopped on the demo server and restarted the web app manually:

# docker-compose stop openmrs-referenceapplication
# docker-compose up -d openmrs-referenceapplication

and it is back up (for the moment).

I did change pingdom to call a REST API endpoint to check whether OpenMRS is actually running, instead of just checking whether the server was responding (which even an error page would satisfy). But I can’t imagine authenticating to /ws/rest/v1/session every 30 seconds would be fatal to OpenMRS RefApp (i.e., I doubt my recent change to pingdom would be bringing down the demo server).
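To make the difference between the two kinds of check concrete, here is a minimal shell sketch. The function names `classify_status` and `session_status` and the default base URL are assumptions for illustration only; how pingdom is actually configured is not shown in this thread.

```shell
#!/bin/sh
# Hypothetical helpers, not part of OpenMRS or pingdom.

# Map an HTTP status code (as printed by curl's %{http_code}) to a verdict.
classify_status() {
    case "$1" in
        000) echo "unreachable" ;;            # curl prints 000 when it got no HTTP response
        200) echo "running" ;;                # the endpoint answered normally
        *)   echo "responding-but-broken" ;;  # e.g. the Tomcat 404 error page
    esac
}

# Print only the HTTP status code of the session endpoint.
# The default base URL is an assumption; pass your own as the first argument.
session_status() {
    curl -s -o /dev/null -w '%{http_code}' -H "Accept: application/json" \
        "${1:-http://localhost:8080/openmrs}/ws/rest/v1/session"
}

# Example: classify_status "$(session_status)"
```

A check that only asks "did the server answer?" treats the last two verdicts the same, which is why an error page can pass as uptime.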

gcliff · May 28, 2020, 5:25pm

Thanks, @burke.

burke · May 28, 2020, 8:35pm

Well, ironically, this came out of a completely unrelated discussion within Microfrontends, where I shared the data model browser (om.rs/dm). Thinking it was time to update it with the data model for RefApp 2.10, I was playing with a docker stack to see if I could use the response from a REST API call (/ws/rest/v1/session) to detect when OpenMRS was up and running, so I would know when database changes were finished… and I recreated this bug!

HTTP Status 404 – Not Found
Type Status Report
Description The origin server did not find a current representation for the target resource or is not willing to disclose that one exists.
Apache Tomcat/7.0.94

At first I thought my HTTP API call might be the cause of the problem, since I was checking it repeatedly during startup. But I shut down the docker stack and ran it again from scratch, only checking the REST API call a few times (when it first started and when OpenMRS was done setting up), and I got the same error as we see on demo.

The log output is here.

This looks like the key problem:

WARN - ModuleUtil.refreshApplicationContext(935) |2020-05-28 20:24:42,368| Unable to invoke started() method on the module's activator
java.lang.RuntimeException: failed to setup the required modules
    at org.openmrs.module.referenceapplication.ReferenceApplicationActivator.started(ReferenceApplicationActivator.java:112)
    at org.openmrs.module.ModuleUtil.refreshApplicationContext(ModuleUtil.java:927)
    at org.openmrs.module.web.WebModuleUtil.refreshWAC(WebModuleUtil.java:847)
    at org.openmrs.web.Listener.performWebStartOfModules(Listener.java:632)
    at org.openmrs.web.Listener.performWebStartOfModules(Listener.java:612)
    at org.openmrs.web.Listener.startOpenmrs(Listener.java:251)
    at org.openmrs.web.WebDaemon$1.run(WebDaemon.java:42)
Caused by: java.lang.NullPointerException
    at org.openmrs.module.referenceapplication.ReferenceApplicationActivator.mapMetadata(ReferenceApplicationActivator.java:146)
    at org.openmrs.module.referenceapplication.ReferenceApplicationActivator.setupEmrApiGlobalProperties(ReferenceApplicationActivator.java:127)
    at org.openmrs.module.referenceapplication.ReferenceApplicationActivator.started(ReferenceApplicationActivator.java:98)
    ... 6 more

So, this may be a bug (that occurs intermittently on startup?) in Reference Application 2.10.

/cc @mozzy @dkayiwa

burke · May 28, 2020, 8:42pm

I see a lot of liquibase errors in the log output. The liquibase upgrade wasn’t included in Reference Application 2.10, was it? I expect RefApp 2.10 to be using Platform 2.3, which predates those changes.

dkayiwa · May 28, 2020, 8:51pm

Are you looking at qa-refapp.openmrs.org or https://demo.openmrs.org?

burke · May 28, 2020, 9:00pm

Interestingly, I’m experiencing the same behavior as we’ve seen for demo.openmrs.org. I created a simple docker-compose stack like this:

version: "3.8"

services:
  db:
    image: mysql:5.6
    environment:
      - MYSQL_ROOT_PASSWORD=openmrs
      - MYSQL_USER=openmrs
      - MYSQL_PASSWORD=openmrs
      - MYSQL_DATABASE=openmrs
    command: "mysqld --character-set-server=utf8 --collation-server=utf8_general_ci"

  openmrs:
    image: openmrs/openmrs-reference-application-distro:2.10.0
    depends_on:
      - db
    environment:
      - DB_HOST=db
      - DB_USERNAME=openmrs
      - DB_PASSWORD=openmrs
      - DB_DATABASE=openmrs
      - DB_CREATE_TABLES=true
      - DB_AUTO_UPDATE=true
      - MODULE_WEB_ADMIN=false
      - DEBUG=false

When I initially start the stack:

docker-compose up -d

RefApp 2.10 fails to start and shows the same error we’re seeing on demo. Specifically, I’m hopping into the RefApp container and using curl to check a localhost url:

docker-compose exec openmrs bash
# curl -i -H "Accept: application/json" localhost:8080/openmrs/ws/rest/v1/session

If I destroy the stack and re-deploy it, I see the same behavior. However, if I wait for OpenMRS to finish starting up and show the error (failure to start)… and then just restart the reference application:

docker-compose restart openmrs

It works. Every time. The log output after restarting OpenMRS is here.

I’m talking about demo.openmrs.org, which is running at the moment because I went onto the server and restarted the docker container like Cintia did earlier. For now, I believe resetting the demo server puts it back in a broken state until someone goes onto the server and restarts the RefApp docker container. And I’m able to recreate the behavior locally using the docker-compose file above.
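As a stopgap, the manual restart could be scripted. This is only a sketch under assumptions: the function name `restart_if_404` and the `RESTART_CMD` variable are made up for illustration, and the default command uses the `openmrs` service name from the local docker-compose file above (on the demo server the service is called `openmrs-referenceapplication`).

```shell
#!/bin/sh
# Hypothetical workaround helper, not an official OpenMRS script.
# RESTART_CMD can be overridden so the logic can be tried without Docker.
RESTART_CMD="${RESTART_CMD:-docker-compose restart openmrs}"

# $1 is the HTTP status code observed once startup has finished.
restart_if_404() {
    if [ "$1" = "404" ]; then
        echo "got 404, restarting"
        # Intentionally unquoted so the command string splits into words.
        $RESTART_CMD
    else
        echo "status $1, leaving the container alone"
    fi
}

# Example: restart_if_404 404
```

This treats the symptom only; it does not explain why the first startup fails.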

dkayiwa · May 28, 2020, 9:05pm

In the log that you shared, I see this: java.sql.SQLException: Incorrect string value: '\xDA\xA9\xD8\xB3\xDB\x8C...' for column 'description'. Are the character set and collation of the database utf8 and utf8_general_ci respectively?

burke · May 28, 2020, 9:14pm

Ah. Good catch. I didn’t add that to my db as I should have. But I know the demo stack does. I’m away from my laptop for the next 30-40 min, but will fix that when I get back.

burke · May 28, 2020, 10:38pm

I copied the MySQL command from the official demo docker-compose into my docker-compose (I also updated it in my earlier post just in case someone copies it). It eliminates the error you found related to UTF-8 (as expected) and looked like it may have fixed the problem.

I started up the stack, waited for it to completely finish and ran

curl -i -H "Accept: application/json" localhost:8080/openmrs/ws/rest/v1/session

and it worked. So, I completely deleted the stack and started it from scratch again:

docker-compose down -v
docker-compose up -d

But this time I kept hitting it with my curl command every few seconds. When OpenMRS finished starting up, I was back to the 404 error. The log for this startup is here. Once again, restarting the web app docker container fixes the problem.

So, now I’m wondering again if the curl statement could be the source of the problem – i.e., hitting the REST API session endpoint during startup leads to failure? Or am I back to a random bug? I’ll keep testing.

Update: Another round of testing and I see the same behavior. If I wait for RefApp to finish starting, hitting /ws/rest/v1/session works fine. Hitting this URL while RefApp is starting results in a 404 once RefApp finishes starting, and OpenMRS is non-functional until the container is restarted.
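For anyone who wants to try reproducing this, the "keep hitting it every few seconds" step can be sketched as a small loop. The function name `poll_session` and its arguments are assumptions for illustration; the default URL matches the curl command above and therefore only works from inside the RefApp container (or wherever port 8080 is reachable).

```shell
#!/bin/sh
# Hypothetical reproduction helper: poll the session endpoint COUNT times,
# sleeping INTERVAL seconds between attempts, printing one status code per line.
# Usage: poll_session COUNT INTERVAL_SECONDS [BASE_URL]
poll_session() {
    count="$1"
    interval="$2"
    base_url="${3:-http://localhost:8080/openmrs}"
    i=0
    while [ "$i" -lt "$count" ]; do
        # -w prints only the HTTP status code; 000 means no HTTP response at all.
        curl -s -o /dev/null -w '%{http_code}\n' -H "Accept: application/json" \
            "$base_url/ws/rest/v1/session"
        i=$((i + 1))
        sleep "$interval"
    done
}

# Example: start the stack with "docker-compose up -d", then run
#   poll_session 100 5
```

Running it right after a fresh `docker-compose down -v` and `docker-compose up -d` should correspond to the failing case described above; running it only after startup has finished corresponds to the working case.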

burke · May 29, 2020, 5:59am

I’ve convinced myself that I’m the reason demo.openmrs.org has been failing. In an attempt to make our uptime monitoring of demo (via pingdom) more “honest”, I changed the check from a simple ping (which succeeds whether or not the Reference Application is actually running) to a call to /ws/rest/v1/session, which only responds properly if the Reference Application is truly running. In doing this, I uncovered an ugly little bug that I’ve documented here:

So, I’ve changed the pingdom check to look for the login page instead (avoiding a REST API call that, if called while demo is resetting, breaks the Reference Application).
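A readiness check along the same lines, polling the login page rather than the REST API, might look like the sketch below. The function name `wait_for_login`, its arguments, and the default base URL are assumptions; the `/login.htm` path is taken from the qa-refapp URL earlier in this thread. That polling the login page during startup is harmless is taken from the post above and not verified here, and a 200 may also come from a startup or maintenance page.

```shell
#!/bin/sh
# Hypothetical readiness loop that avoids the REST API entirely.
# Polls the login page until it returns HTTP 200 or the attempts run out.
# Usage: wait_for_login MAX_ATTEMPTS INTERVAL_SECONDS [BASE_URL]
wait_for_login() {
    attempts="$1"
    interval="$2"
    base_url="${3:-http://localhost:8080/openmrs}"
    while [ "$attempts" -gt 0 ]; do
        # -L follows redirects so the status of the final page is reported.
        status="$(curl -s -L -o /dev/null -w '%{http_code}' "$base_url/login.htm")"
        if [ "$status" = "200" ]; then
            echo "login page is up"
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "$interval"
    done
    echo "login page not available"
    return 1
}

# Example: wait_for_login 60 10 && echo "safe to call the REST API now"
```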