Good morning,
We keep having an extreme load issue on our server: the load average, especially on bootup after nagios starts, stays rather high for a good amount of time before finally settling down. We also keep having issues where the checks stop until we restart the nagios service. I'm not sure of the root cause and was hoping for some pointers on what to check.
Thanks,
Chris Broschard
High CPU/load avg
Re: High CPU/load avg
What process specifically is the culprit? Inspect the running processes to get a list.
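One way to get that list from a shell is sketched below. It assumes a Linux box with the procps version of ps (the --sort option is not available everywhere), and top_cpu_procs is just a helper name made up for this example. Keep in mind that the CPU column from ps is averaged over each process's lifetime, so it is a rough guide rather than a live reading.

```shell
# List the processes using the most CPU, highest first.
# Usage: top_cpu_procs [count]   (count defaults to 10)
# Assumes procps ps on Linux; top_cpu_procs is a hypothetical helper name.
top_cpu_procs() {
    count="${1:-10}"
    # -e: every process; -o: choose columns; --sort=-pcpu: descending CPU %.
    # head keeps the header line plus the first $count processes.
    ps -eo pid,ppid,pcpu,pmem,etime,args --sort=-pcpu | head -n "$((count + 1))"
}
```

Running `uptime` followed by `top_cpu_procs 15` shortly after boot should show the load averages next to whichever processes are using the most CPU.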
If you PM me a system profile I can diagnose further. Get one by going to Admin (top menu) => System Profile (in the left menu), then clicking the blue button.
If you're unable to generate the profile through the web interface, please try generating it from the command line by running these commands as root:
Code: Select all
rm -rf /usr/local/nagiosxi/var/components/profile*
/usr/local/nagiosxi/scripts/components/getprofile.sh SUPPORT
Then send me the resulting /usr/local/nagiosxi/var/components/profile.zip file.
If the profile script fails, please include the ENTIRE output.
Things you can try in the meantime:
- Run the database repair
If you didn't get an 8% raise over the course of the pandemic, you took a pay cut.
Discussion of wages is protected speech under the National Labor Relations Act, and no employer can tell you you can't disclose your pay with your fellow employees.
cbroschard
Re: High CPU/load avg
I just PM'ed you the profile.zip file as requested. The load is mostly caused by the check processes, but also by nagios itself. We have the DB offloaded to another server and that isn't seeing much load; only the front-end web server is.
Re: High CPU/load avg
Looks like some host checks are timing out. Even simple ones like check_snmp are sometimes taking more than 30 seconds. This could be caused by some sort of resource starvation, e.g. network connectivity being spotty, CPU being taken up at boot time, high disk usage while the machine is booting up.
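To see whether a single check is slow on its own, you can time one plugin run by hand. This is only a sketch: time_check is a helper name made up for this example, and it reports whole seconds only.

```shell
# Run a command, discard its output, and report exit code and elapsed seconds.
# Usage: time_check command [args...]
# time_check is a hypothetical helper name, not part of Nagios.
time_check() {
    start="$(date +%s)"
    "$@" >/dev/null 2>&1
    status=$?
    end="$(date +%s)"
    echo "exit=$status seconds=$((end - start))"
}
```

For example, `time_check /usr/local/nagios/libexec/check_snmp -H 192.0.2.10 -C public -o .1.3.6.1.2.1.1.3.0` would time a sysUpTime query. The plugin path, host address, and community string there are assumptions; substitute the values from one of the checks that is timing out. If the run is quick when the machine is idle but slow right after boot, that points at resource starvation rather than the device being checked.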
Try disabling the nagios service, then rebooting and inspecting CPU usage as the machine starts up.
Some other things I noticed that couldn't hurt:
- There are some Vim swap files in /usr/local/nagios/etc/static (*.sw[mnop]) - those can and probably should be removed.
- There are some Emacs swaps in /usr/local/nagios/etc (#*#) - also should be removed
(these two make me think someone's been editing the config files directly instead of generating them through Nagios XI)
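If it helps, the leftover editor files can be listed before anything is deleted. The sketch below only prints matches and removes nothing; find_editor_leftovers is a helper name made up for this example, and the patterns are the two mentioned above.

```shell
# Print Vim swap files and Emacs auto-save files under a directory.
# Nothing is deleted; review the list before removing anything.
# Usage: find_editor_leftovers [dir]
# find_editor_leftovers is a hypothetical helper name.
find_editor_leftovers() {
    dir="${1:-/usr/local/nagios/etc}"
    # '*.sw[mnop]' matches Vim swap files; '#*#' matches Emacs auto-save files.
    find "$dir" -type f \( -name '*.sw[mnop]' -o -name '#*#' \) -print
}
```

Calling `find_editor_leftovers` with no argument searches /usr/local/nagios/etc and its subdirectories, which includes the static directory. Make sure nobody has one of those files open in an editor before removing it.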