idle instance OpenMRS


Application Name: OpenMRS
Version Number: 1.9.11

Question: I have a server with plenty of disk space and RAM. On this server, OpenMRS runs in a Docker container and connects to MySQL, which also runs in a Docker container. Everything works well at first, but after 3-5 hours of use the system slows down and then stops responding, without logging anything. When you try to reach OpenMRS in the browser, the page keeps loading endlessly without any response. While investigating during one of these freezes, we saw that the Tomcat server status application was still running and showed some busy threads. The problem can occur at any time of day. When we restart the OpenMRS container, the system starts working again. Note that we have configured a log rotation policy in our docker-compose yml. Do you have any idea what could make the system unresponsive?


I don’t know your setup, so I’m going to give generic advice for running Java/JVM software in production.

First, I assume you are running docker on linux. If that’s not the case, that’s your problem right there.

Second, you need to check exactly what the JVM heap configuration looks like. If you don’t set it, the default is usually not great. Make sure you know the configured minimum and maximum. Most Java ‘performance’ problems are actually garbage collection problems.

I expect your CPU to be spiking on exactly one core when it ‘freezes’. If you are willing, you can hook up various observability tools to get a bit more insight, but you need to know the basics first.

Also, please copy and paste your docker-compose file so we can take a close look.

Chances are it’s GC (garbage collector). It’s always GC.

For example:

Hello cintiadr, Thank you for your answer. This is the content of my

export JAVA_OPTS="-Djava.awt.headless=true -XX:+UseConcMarkSweepGC -Xmx1024m -XX:PermSize=256m -XX:MaxPermSize=512m -XX:NewSize=256m"

This is also a pastebin of my docker-compose yml file :

See here some graphs of my host provided by cAdvisor

I have no idea what cAdvisor is, but Google says it’s something to check container usage.

Can you confirm to me you are running docker in linux, and not in Windows or OSX, @greenshellit? Also, please get us the output of docker info.

Would you be willing to increase the Xmx to 2g, make the minimum the same as the maximum, and test whether it makes any difference? Note that I’m not saying this is the final solution; it’s just a way to find out whether the problem is in the GC area.
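As a rough sketch of that test, this is what the JAVA_OPTS posted above could look like with the heap fixed at 2 GB. Only -Xms2g and -Xmx2g are changed; everything else is copied from the earlier post. I'm assuming the container is allowed to use that much memory: if the docker-compose file sets a memory limit, it has to leave room for the heap plus PermGen.

```shell
# Sketch: the options from the earlier post, with the heap pinned at 2 GB for the test.
# -Xms is the initial (minimum) heap size, -Xmx the maximum.
# Setting both to the same value stops the JVM from growing and shrinking the heap.
HEAP_OPTS="-Xms2g -Xmx2g"
export JAVA_OPTS="-Djava.awt.headless=true -XX:+UseConcMarkSweepGC ${HEAP_OPTS} -XX:PermSize=256m -XX:MaxPermSize=512m -XX:NewSize=256m"

echo "$JAVA_OPTS"
```

The 2 GB value is only there to see whether the behaviour changes; it is not a sizing recommendation.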

You would also want to collect GC logs. Something like:

-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=50M -Xloggc:/home/user/log/gc.log
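A possible way to wire those flags into the existing configuration is sketched below. I'm assuming a Java 7 or 8 JVM here (the PermSize flags in the posted JAVA_OPTS suggest that), where -XX:+PrintGCDetails and -XX:+PrintGCDateStamps add per-collection detail and wall-clock timestamps to the log. The log path is the placeholder from the line above; inside a container it needs to be a directory that exists, is writable, and ideally sits on a mounted volume so the log survives a restart.

```shell
# Sketch: append GC logging to whatever JAVA_OPTS is already set (Java 7/8 style flags).
# Placeholder path taken from the example above; adjust it to a writable directory.
GC_LOG_FILE="/home/user/log/gc.log"

# What to log: details of each collection, with wall-clock timestamps.
GC_OPTS="-Xloggc:${GC_LOG_FILE} -XX:+PrintGCDetails -XX:+PrintGCDateStamps"

# How to keep the log bounded: rotate across 10 files of 50 MB each.
GC_OPTS="${GC_OPTS} -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=50M"

export JAVA_OPTS="${JAVA_OPTS:-} ${GC_OPTS}"
echo "$JAVA_OPTS"
```

With the date stamps in place, the entries just before and after a freeze can be matched against the time users reported the problem.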

What monitoring do you have on this machine? Do you have anything that exposes how much memory and CPU each process or the machine is using over time? Grafana, Datadog, anything.

Also, can you confirm you are on bare metal, not on a cloud provider (like Digital Ocean or AWS)?

It would be nice to have IO throughput, to exclude the possibility of it being just stuck waiting for disk operations.

Yes, we are on a cloud server that hosts the Docker containers.

Ok. Let me re-ask all questions. Take your time to answer.

  1. Are you running docker on linux?

  2. Can you paste the output of docker info?

  3. Can you change the memory settings to 2 GB minimum and maximum and report back what happens?

  4. Can you add the GC logs and report back what happens around the time (just before and after) when the app ‘freezes’?

  5. What monitoring do you have on this machine? Can you get info about memory and CPU usage over time?

  6. What does your IO look like? Does your cloud provider give you metrics on IO queues or stolen CPU?
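If there is no monitoring in place yet, a small script run on the Docker host during a freeze can answer several of these questions at once. The sketch below is only one way to do it: the function names and the default report file name are made up for illustration, and it assumes a Linux host where vmstat (procps) and iostat (sysstat) may or may not be installed. Missing tools are skipped rather than treated as errors.

```shell
#!/bin/sh
# Sketch: snapshot host and container state while the app is frozen.
# Function names and the default file name are hypothetical.

run_if_present() {
  # $1 = section title; remaining arguments = command to run
  title="$1"; shift
  echo "=== ${title} ==="
  if command -v "$1" >/dev/null 2>&1; then
    "$@" 2>&1 || echo "(command failed: $*)"
  else
    echo "(skipped: $1 not installed)"
  fi
  echo
}

collect_diagnostics() {
  # $1 = output file for the report
  out="${1:-freeze-report.txt}"
  {
    run_if_present "kernel / OS (question 1)" uname -a
    run_if_present "docker info (question 2)" docker info
    run_if_present "per-container CPU and memory (question 5)" docker stats --no-stream
    # vmstat: the 'st' column is CPU time stolen by the hypervisor, 'wa' is time waiting on IO
    run_if_present "CPU, memory, steal and IO wait (questions 5 and 6)" vmstat 1 5
    # iostat -x: per-device queue size and %util
    run_if_present "per-device IO (question 6)" iostat -x 1 3
  } > "$out"
  echo "wrote $out"
}
```

Calling `collect_diagnostics /tmp/freeze-report.txt` while OpenMRS is unresponsive, and once more while it is healthy, gives two reports to compare and paste here.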