That requires me to do actual work – I’m not gonna do that right now. There’s no need right now. Startup is slow because Atlassian. The same stands for Confluence, Crowd and Bamboo.
I’m not in the mood to do it. I decided to take the week off from being responsible for any of the infrastructure. So for now, not my problem. Honestly – it only happens during Crowd LDAP syncs. Other times it’s okay.
I believe there are others in the community working on the Infrastructure team or at least familiar with the infrastructure that could take a look while @r0bby is out this week, yes? @maany, can you point Infrastructure folks to this thread to see if there is something we could do?
It’s basically @maany and myself right now. @ryan is around to help us if we get stuck but not as active as he used to be due to his day job. I’ll try to prioritize this next week. JIRA is still useable. @pascal has offered to help out as well. Just gotta on-board him, which I’d rather not do this week.
The problem is that ID dashboard shares the server with JIRA/Crowd and ID dashboard has a memory leak (no use in fixing it – gonna deploy the new version very soon) – I just watch memory utilization and bounce ID dashboard. It becomes a battle between JIRA/ID dashboard at times.
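The watch-and-bounce routine described above could be sketched roughly like this in Python. It is only an illustration, not what is actually running on the server: the 90% threshold and the function names are assumptions, and it relies on the `MemAvailable` field of Linux's `/proc/meminfo`. The restart step itself is left out, since the thread doesn't say how ID Dashboard is managed.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        # Lines look like "MemTotal:  8000000 kB"; skip anything malformed.
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def used_fraction(meminfo):
    """Fraction of RAM in use, based on MemTotal and MemAvailable."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return (total - available) / total


def should_restart(meminfo, threshold=0.90):
    """True when memory use reaches the (assumed) 90% threshold."""
    return used_fraction(meminfo) >= threshold
```

On the server, one could feed it `open("/proc/meminfo").read()` from a cron job and bounce ID Dashboard whenever `should_restart` returns true.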
When you return from vacation, let’s take a look at what happens for on-boarding. Maybe we can make it easier for someone new to the infrastructure to come on board and take on certain “tasks” or pieces without a lot of human managing. @pascal, if you do the on-boarding with @r0bby, can we work on documenting what is involved in that and what areas we might be able to improve upon for someone new to “self-learn” so to speak? This will make it easier for @r0bby and others to hand off tasks and get a better life balance - especially for vacation!
There are things a human needs to do such as:
- granting access to the servers (yes, we tightly control this)
- setting him up for system monitoring (aka being on-call)
This is a very human-centric part of OpenMRS – we’ve automated a lot but on-boarding new people will involve a human being doing it.
I don’t think the slowness of JIRA is a matter of automating, but of tuning. There’s obviously a problem: it’s slow not only on startup but in normal use, and this affects all the developers. When it crashes, a sysadmin has to check it, restart it, etc. Investing a few hours in addressing the problem could be a huge benefit, not only for devs but also for sysadmins. It could be as simple as giving the JVM more memory (RAM is cheap).
@lluismf, We have what we have. There’s no adding RAM. Like I said, I’ll look into it next week.
@r0bby, (for when you’re back from break)
I think what @janflowers is getting at is that we need to expand the number of people who can help address infrastructure-related issues. And “help with infrastructure issues” shouldn’t map 1:1 with being on call and handling tickets.
For example @lluismf could hopefully be convinced to dedicate a chunk of time to exploring JIRA performance/tuning issues, in coordination with a regular infrastructure team member.
This should move us in a direction where the on-call infrastructure person and initial reader of helpdesk tickets doesn’t automatically have to do everything themselves, but can hand off pieces of work to people who are interested but not as regularly engaged.
@maany is currently the point-person for JIRA issues as devtools manager.
@lluismf, I’d be willing to grant you access to the server, I just need an ssh key for you and the username you use. That’s actually really easy for me to do. DO NOT post it publicly. PM it to me.
For the moment you can give me administration privileges for JIRA and Confluence. Some diagnostics can be done from the administration options of the webapp (check database latency, activate logging …).
That can be arranged – just don’t screw things up because then I have to do actual work. Log out and log back in.
It works! The system administrator page took about 5 minutes before displaying (I guess it was recompiling all the JSPs). Don’t worry, I’m just taking a look at the options - not modifying anything.
This is the current memory status:
And there are just 36 user sessions. Maybe when there’s a spike in the number of concurrent users things get worse and JIRA starts swapping to disk or throws an out-of-memory error. Will check it again in the morning.
Also, according to the Jira Sizing Guide (Atlassian Documentation) and assuming a small-scale installation, the recommendation is >1 GB. I’d give it a maximum of 2 GB and see what happens.
About the slowness at start … I guess it’s because of the large number of plugins and add-ons installed. But I have no idea which ones are used and which aren’t!
Yeah – I’m not even sure anymore myself what we use vs what we don’t.
The server currently has 8 GB of RAM and 1 GB of swap. That is shared among JIRA, Crowd, and ID Dashboard. As we speak JIRA is causing the system load to spike.
If it helps, I would like to be part of this. I have experience managing Jira and Confluence. I’m not an expert, but I can try to help as much as possible like the others.
CPU spike or memory spike?
CPU. Memory spikes are almost always attributed to ID dashboard.
Now the heap has just 11% free memory; at any moment the garbage collector is going to kick in and CPU will spike. I definitely think it’s a memory problem (maybe a memory leak, maybe not).
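For anyone who wants to track that free-heap figure over time, here is a minimal Python sketch of the arithmetic. The function names and the 15% warning threshold are assumptions for illustration, and the numbers in the usage line are made up, not readings from the JIRA admin page.

```python
def heap_free_percent(used_mb, total_mb):
    """Percentage of the JVM heap that is still free."""
    if total_mb <= 0:
        raise ValueError("total_mb must be positive")
    return 100.0 * (total_mb - used_mb) / total_mb


def heap_is_tight(used_mb, total_mb, min_free_percent=15.0):
    """True when free heap drops below the (assumed) warning threshold."""
    return heap_free_percent(used_mb, total_mb) < min_free_percent


# Hypothetical reading: 890 MB used out of a 1000 MB heap.
tight = heap_is_tight(890, 1000)
```

If free heap stays this low even right after garbage collection, that points toward an undersized heap or a leak rather than a momentary spike.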
You’re more than welcome to have at it if you like