Extremely high load spikes - rebuild database?

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.

Post by ira »

Hi there,

I've got a licensed version, but I can't seem to post in the official support forum.

I'm having high load spikes.

At the start of each spike I'm getting:

Runtime Warning 2015-05-08 07:01:29 Warning: Host performance data file processing command '/bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/xidpe/1431032481.perfdata.host' timed out after 5 seconds

Output from "cat /usr/local/nagios/etc/nagios.cfg | grep 'broker'":

broker_module=/usr/local/nagios/bin/ndomod.o config_file=/usr/local/nagios/etc/ndomod.cfg
event_broker_options=-1

---



A sanitized system profile:

Nagios XI Version : 2014R2.7
nagios 2.6.32-504.12.2.el6.i686 i686
CentOS release 6.6 (Final)
Gnome is not installed
Apache Information

PHP Version: 5.3.3
Agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36
Server Name: developer
Server Address: 192.168.x.x
Server Port: 80
Date/Time


Nagios XI Data

License ends in: STUORM

nagios (pid 23408) is running...
NPCD running (pid 1514).
ndo2db (pid 1592) is running...
CPU Load 15: 2.42
Total Hosts: 61
Total Services: 343
Function 'get_base_uri' returns: http://developer/nagiosxi/
Function 'get_base_url' returns: http://developer/nagiosxi/
Function 'get_backend_url(internal_call=false)' returns: http://developer/nagiosxi/includes/comp ... rofile.php
Function 'get_backend_url(internal_call=true)' returns: http://localhost/nagiosxi/backend/
Ping Test localhost

Running:
/bin/ping -c 3 localhost 2>&1
PING localhost.localdomain (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost.localdomain (127.0.0.1): icmp_seq=1 ttl=64 time=0.044 ms
64 bytes from localhost.localdomain (127.0.0.1): icmp_seq=2 ttl=64 time=0.035 ms
64 bytes from localhost.localdomain (127.0.0.1): icmp_seq=3 ttl=64 time=0.039 ms

--- localhost.localdomain ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 0.035/0.039/0.044/0.006 ms
Test wget To localhost

WGET From URL: http://localhost/nagiosxi/includes/components/ccm/
Running:
/usr/bin/wget http://localhost/nagiosxi/includes/components/ccm/
--2015-05-08 09:38:27-- http://localhost/nagiosxi/includes/components/ccm/
Resolving localhost... ::1, 127.0.0.1
Connecting to localhost|::1|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: "/usr/local/nagiosxi/tmp/ccm_index.tmp"

0K ........ 22.7M=0s

2015-05-08 09:38:28 (22.7 MB/s) - "/usr/local/nagiosxi/tmp/ccm_index.tmp" saved [8385]

Post by lmiltchev »

Have you checked to see what process has the highest CPU usage?

Code:

top -b -n 1 | head -15

Do you have any errors in the mysqld.log (crashed tables)?

Code:

tail -20 /var/log/mysqld.log

Post by ira »

mysqld.log is showing the following error:

Code:

150509  7:18:51 [Warning] Disk is full writing '/tmp/SThOiQty' (Errcode: 28). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
150509  7:18:51 [Warning] Retry in 60 secs. Message reprinted in 600 secs
150510  7:17:08 [Warning] Disk is full writing './nagios/nagios_systemcommands.MYD' (Errcode: 28). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
150510  7:17:08 [Warning] Retry in 60 secs. Message reprinted in 600 secs
150511  7:17:14 [Warning] Disk is full writing './nagios/nagios_contactnotificationmethods.TMD' (Errcode: 28). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)

But having a look at the disk for /tmp and the database store at "/var/lib/mysql/nagios/":

Code:

root@nagios:~ $ df -P /var/lib/mysql/nagios/ | tail -1 | cut -d' ' -f 1
/dev/mapper/VolGroup-lv_root

Follow up:

Code:

root@nagios:~ $ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root
                       27G   24G  2.5G  91% /
tmpfs                 1.3G     0  1.3G   0% /dev/shm
/dev/sda1             477M  110M  342M  25% /boot

Now there's 2.5G free on "/dev/mapper/VolGroup-lv_root", which seems like plenty of space. And it's not an inode issue:

Code:

root@nagios:~ $ df -i
Filesystem            Inodes  IUsed   IFree IUse% Mounted on
/dev/mapper/VolGroup-lv_root
                     1792752 115783 1676969    7% /
tmpfs                 185171      1  185170    1% /dev/shm
/dev/sda1             128016     64  127952    1% /boot

I'll try to see what is causing the CPU spike when it occurs next.
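
One way to catch the culprit without watching the box by hand is a small check run from cron. The mysqld.log warnings above all fall around 07:17 to 07:18, so the spike may follow a daily schedule. This is only a rough sketch: the threshold, the log path, and the function names are assumptions for illustration, not anything shipped with Nagios XI.

```shell
#!/bin/sh
# Sketch: when the 1-minute load average crosses a threshold,
# append a snapshot of the busiest processes to a log file.
THRESHOLD="${THRESHOLD:-4.0}"            # hypothetical threshold; tune for your server
LOG="${LOG:-/var/tmp/load-spike.log}"    # hypothetical log location

# load_exceeds LOAD LIMIT -> exit status 0 if LOAD > LIMIT
# (awk is used because the shell cannot compare decimals)
load_exceeds() {
    awk -v load="$1" -v limit="$2" 'BEGIN { exit !(load + 0 > limit + 0) }'
}

# Read the current 1-minute load and log a top snapshot if it is too high.
check_once() {
    load=$(cut -d' ' -f1 /proc/loadavg)
    if load_exceeds "$load" "$THRESHOLD"; then
        {
            date
            echo "load: $load"
            top -b -n 1 | head -15
        } >> "$LOG"
    fi
}

# Only perform the check when invoked as: script.sh run
if [ "${1:-}" = "run" ]; then
    check_once
fi
```

Saved under a name of your choosing and run every minute from cron with the "run" argument, it should leave a snapshot in the log the next time the load climbs. Since the root filesystem is already 91% used, keep an eye on the size of that log.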

Post by lmiltchev »

How large is the database? You may need twice as much space as the size of the database. The "Disk is full" message in the log is quite clear. You will need to add more disk space.
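
A quick way to compare the two numbers is sketched below. It reads the rule of thumb conservatively, as "free space should be at least twice the database size", which is an assumption about what is meant. The ".TMD" file in the log is most likely a temporary copy of a table written during a repair or optimize, which would explain why extra room is needed on top of the data itself. The function name is made up for illustration, and the path is the one from this thread.

```shell
#!/bin/sh
# Sketch: compare free space on the database filesystem against
# twice the size of the database directory.
DB_DIR="${DB_DIR:-/var/lib/mysql/nagios}"   # database path from this thread

# has_headroom DB_KB FREE_KB -> exit status 0 if FREE_KB >= 2 * DB_KB
has_headroom() {
    [ "$2" -ge $(( $1 * 2 )) ]
}

if [ -d "$DB_DIR" ]; then
    # du -sk: total size in KB; df -Pk: one line per filesystem, sizes in KB
    db_kb=$(du -sk "$DB_DIR" | awk '{print $1}')
    free_kb=$(df -Pk "$DB_DIR" | awk 'NR==2 {print $4}')
    echo "database: ${db_kb} KB, free: ${free_kb} KB"
    if has_headroom "$db_kb" "$free_kb"; then
        echo "OK: free space is at least twice the database size"
    else
        echo "WARNING: less than twice the database size is free"
    fi
fi
```

With the 2.5G free shown earlier, a database larger than about 1.25G would trip the warning.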