JIRA hasn't been very responsive for me in general, but lately (~7 AM UTC) I often get a
> Service Temporarily Unavailable
> The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.
status.openmrs.org did show JIRA was up.
@r0bby, have others also reported this?
and thanks for all your work @r0bby !
jira startup is slowwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww – I got alerted it was down and bounced it. One of the many reasons I dislike it.
Do we have JIRA support? If so, they should help. It could be a CPU, RAM, or disk issue, etc. (even a memory leak)…
At work we have tens of thousands of issues in JIRA and it works fine.
Yeah we have two people, @maany and myself. JIRA is actually the least fun thing to work with on the server-side. I’m not dealing with this stuff this week. I need a break from being responsible for keeping the place from catching fire. Crowd has a cute bug which causes a system load spike during LDAP syncs and I haven’t gotten around to upgrading it.
I meant Atlassian support
That requires me to do actual work – I’m not gonna do that right now. There’s no need right now. Startup is slow because Atlassian. The same goes for Confluence, Crowd and Bamboo.
I’m not in the mood to do it. I decided to take the week off from being responsible for any of the infrastructure. So for now, not my problem. Honestly – it only happens during Crowd LDAP syncs. Other times it’s okay.
I believe there are others in the community working on the Infrastructure team or at least familiar with the infrastructure that could take a look while @r0bby is out this week, yes? @maany, can you point Infrastructure folks to this thread to see if there is something we could do?
It’s basically @maany and myself right now. @ryan is around to help us if we get stuck, but he’s not as active as he used to be due to his day job. I’ll try to prioritize this next week. JIRA is still usable. @pascal has offered to help out as well. Just gotta on-board him, which I’d rather not do this week.
The problem is that ID dashboard shares the server with JIRA/Crowd and ID dashboard has a memory leak (no use in fixing it – gonna deploy the new version very soon) – I just watch memory utilization and bounce ID dashboard. It becomes a battle between JIRA/ID dashboard at times.
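For illustration, the manual "watch memory utilization and bounce ID dashboard" routine could be automated with a tiny watchdog script run from cron. This is just a sketch: the `id-dashboard` service name and the 512 MB threshold are assumptions, not the actual OpenMRS setup.

```shell
#!/bin/sh
# Hypothetical watchdog: restart the ID dashboard service when available
# memory drops below a threshold. Service name and threshold are assumed.
THRESHOLD_MB=512

# MemAvailable is reported in kB in /proc/meminfo; convert to MB.
avail=$(awk '/MemAvailable/ {print int($2/1024)}' /proc/meminfo)

if [ "$avail" -lt "$THRESHOLD_MB" ]; then
    # Assumed systemd unit name for the ID dashboard.
    systemctl restart id-dashboard
fi
```

Run from root's crontab (e.g. every 5 minutes) it would at least spare a human the 3 AM bounce, though fixing the leak in the new version is obviously the real answer.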
When you return from vacation, let’s take a look at what happens for on-boarding. Maybe we can make it easier for someone new to the infrastructure to be able to come on board and take on certain “tasks” or pieces without a lot of human managing. @pascal, if you do the on-boarding with @r0bby, can we work on documenting what is involved in that and what areas we might be able to improve upon for someone new to “self-learn” so to speak? This will make it easier for @r0bby and others to hand-off tasks and get a better life balance - especially for vacation!
There are things a human needs to do such as:
- granting access to the servers (yes, we tightly control this)
- setting him up for system monitoring (aka being on-call)
This is a very human-centric part of OpenMRS – we’ve automated a lot but on-boarding new people will involve a human-being doing it.
I don’t think the slowness of JIRA is a matter of automating, but of tuning. Obviously there’s a problem: it’s slow not only on startup but in normal use, and this affects all the developers. When it crashes, a sysadmin needs to check it, restart it, etc. Investing a few hours trying to address the problem could be a huge benefit, not only for devs but also for sysadmins. It could be as simple as adding more memory to the JVM (RAM is cheap).
@lluismf, We have what we have. There’s no adding RAM. Like I said, I’ll look into it next week.
@r0bby, (for when you’re back from break)
I think what @janflowers is getting at is that we need to expand the number of people who can help address infrastructure-related issues. And “help with infrastructure issues” shouldn’t map 1:1 with being on call and handling tickets.
For example @lluismf could hopefully be convinced to dedicate a chunk of time to exploring JIRA performance/tuning issues, in coordination with a regular infrastructure team member.
This should move us in a direction where the on-call infrastructure person and initial reader of helpdesk tickets doesn’t automatically have to do everything themself, but can hand off pieces of work to people who are interested but not as regularly-engaged.
@maany is currently the point-person for JIRA issues as devtools manager.
@lluismf, I’d be willing to grant you access to the server, I just need an ssh key for you and the username you use. That’s actually really easy for me to do. DO NOT post it publicly. PM it to me.
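For anyone following along, generating a dedicated key pair for this kind of access grant looks roughly like the below; the filename and comment are just examples. Only the public (`.pub`) half is what you PM to the admin — the private half never leaves your machine.

```shell
# Generate a dedicated ed25519 key pair for infrastructure access.
# (Filename and comment are illustrative, not a required convention.)
ssh-keygen -t ed25519 -C "yourname@openmrs-infra" -f ~/.ssh/openmrs_infra -N ""

# This is the PUBLIC key to send (privately!) along with your username:
cat ~/.ssh/openmrs_infra.pub
```

Using a separate key per service makes it easy to revoke access later without touching your other keys.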
For the moment you can give me administration privileges for JIRA and Confluence. Some diagnostics can be done from the administration options of the webapp (check database latency, activate logging …).
That can be arranged – just don’t screw things up, because then I have to do actual work. Log out and log back in.
It works! The system administrator page took about 5 minutes to display (I guess it was recompiling all the JSPs).
Don’t worry, I’m just taking a look at the options - not modifying anything.
This is the current memory status:
And there are just 36 user sessions. Maybe when there’s a spike in the number of concurrent users things get worse and JIRA starts swapping to disk or throws an out-of-memory error. Will check it again in the morning.
Also according to Jira Sizing Guide | Atlassian Support | Atlassian Documentation
and assuming a small-scale installation, the recommendation is >1 GB. I’d give it a maximum of 2 GB and see what happens.
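If it helps, a sketch of where that change would go, assuming a standalone JIRA install with the stock `bin/setenv.sh` (the install path and current values here are assumptions — check the actual file before touching anything):

```shell
# In <jira-install-dir>/bin/setenv.sh, raise the JVM heap bounds,
# then restart JIRA for them to take effect:
JVM_MINIMUM_MEMORY="1024m"   # assumed current value is lower (stock default)
JVM_MAXIMUM_MEMORY="2048m"   # the 2 GB ceiling suggested above
```

Worth keeping in mind that the heap has to fit alongside Crowd and the ID dashboard on the same box, so 2 GB may need revisiting against total RAM.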
About the slowness at startup … I guess it’s because of the high number of plugins and add-ons installed. But I have no idea which ones are used and which aren’t!
Yeah – I’m not even sure anymore myself what we use vs what we don’t.
The server currently has 8 GB of RAM and 1 GB of swap, shared between JIRA, Crowd, and the ID Dashboard. As we speak, JIRA is causing the system load to spike.