High CPU/load avg

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
cbroschard
Posts: 15
Joined: Wed Apr 17, 2013 10:54 am

High CPU/load avg

Post by cbroschard »

Good morning,

We keep running into an extreme load issue on our server: the load average, especially on bootup after nagios starts, stays quite high for a good amount of time before finally settling down. We also keep having issues where the checks stop running until we restart the nagios service. I'm not sure of the root cause and was hoping for some pointers on what to check.

Thanks,

Chris Broschard
dchurch
Posts: 858
Joined: Wed Oct 07, 2020 12:46 pm
Location: Yo mama

Re: High CPU/load avg

Post by dchurch »

Which process specifically is the culprit? Inspect the running processes to get a list.
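One way to get that list, as a minimal sketch: it assumes a Linux host with the procps version of `ps` (the `--sort` option is not available in every `ps`), and the helper name `top_cpu` is made up for illustration.

```shell
# Print the N processes using the most CPU (default 10), plus the header row.
# Assumes procps "ps" on Linux; "top_cpu" is a hypothetical helper name.
top_cpu() {
    ps -eo pid,ppid,user,pcpu,pmem,etime,args --sort=-pcpu \
        | head -n "$(( ${1:-10} + 1 ))"
}

# Show the 15 busiest processes, then the current load averages.
top_cpu 15
uptime
```

Running this a few times while the load is high should show whether the nagios daemon, the check plugins, or something else entirely sits at the top.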

If you PM me a system profile I can diagnose further. Get one by going to Admin (top menu) => System Profile (in the left menu), then clicking the blue button.

If you're unable to generate the profile through the web interface, please try generating it from the command line by running these commands as root:

rm -rf /usr/local/nagiosxi/var/components/profile*
/usr/local/nagiosxi/scripts/components/getprofile.sh SUPPORT
Then send me the resulting /usr/local/nagiosxi/var/components/profile.zip file.
If the profile script fails, please include the ENTIRE output.

Things you can try in the meantime:
- Run the database repair
If you didn't get an 8% raise over the course of the pandemic, you took a pay cut.

Discussion of wages is protected speech under the National Labor Relations Act, and no employer can tell you you can't disclose your pay with your fellow employees.
cbroschard
Posts: 15
Joined: Wed Apr 17, 2013 10:54 am

Re: High CPU/load avg

Post by cbroschard »

I just PM'ed you the profile.zip file as requested. The load is mostly caused by the check processes, but also by nagios itself. We have the DB offloaded to another server, and that one isn't seeing much load; only the front-end web server is.
dchurch
Posts: 858
Joined: Wed Oct 07, 2020 12:46 pm
Location: Yo mama

Re: High CPU/load avg

Post by dchurch »

Looks like some host checks are timing out. Even simple ones like check_snmp are sometimes taking more than 30 seconds. This could be caused by some sort of resource starvation, e.g. network connectivity being spotty, CPU being taken up at boot time, high disk usage while the machine is booting up.

Try disabling the nagios service, then rebooting and inspecting CPU usage as the machine starts up.
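A rough sketch of that test follows. The disable and reboot commands are left as comments because they assume a systemd-based host with a unit named `nagios`, which may not match this server (older installs may use `chkconfig` instead). The `sample_load` helper is hypothetical; it just reads `/proc/loadavg` at a fixed interval.

```shell
# Assumption: systemd host, service unit named "nagios". Run by hand as root:
#   systemctl disable nagios
#   reboot
# Re-enable afterwards with: systemctl enable nagios

# Print a timestamp and the 1/5/15-minute load averages COUNT times,
# waiting INTERVAL seconds between samples. Linux only (/proc/loadavg).
sample_load() {
    count=${1:-5}
    interval=${2:-10}
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s %s\n' "$(date +%T)" "$(cut -d' ' -f1-3 /proc/loadavg)"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Short demonstration: three samples, one second apart.
sample_load 3 1
```

Right after the reboot, something like `sample_load 30 10` covers the first five minutes. If the load is already high with nagios disabled, the starvation is coming from elsewhere on the box.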

Some other things I noticed that couldn't hurt:
- There are some Vim swap files in /usr/local/nagios/etc/static (*.sw[mnop]) - those can and probably should be removed.
- There are some Emacs auto-save files in /usr/local/nagios/etc (#*#) - those should also be removed
(these two make me think someone's been editing the config files by hand instead of generating them through Nagios XI)
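To locate the leftover editor files mentioned above, a small sketch using `find`. It only lists matches and deletes nothing; the function name `find_editor_leftovers` is made up for illustration, and it searches the whole of /usr/local/nagios/etc, which includes the static subdirectory.

```shell
# List Vim swap files (*.sw[mnop]) and Emacs auto-save files (#*#)
# under the given directory. Nothing is deleted.
find_editor_leftovers() {
    find "$1" -type f \( -name '*.sw[mnop]' -o -name '#*#' \) -print
}

# Only search if the directory exists on this machine.
if [ -d /usr/local/nagios/etc ]; then
    find_editor_leftovers /usr/local/nagios/etc
fi
```

Before removing anything it lists, it is worth checking that nobody currently has those config files open in an editor, since a swap file can belong to a live session.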