Yesterday we had an issue with our Nagios XI server: the swap file ran out of space. After that we could not recover the database for some reason and had to restore from backup. After some initial hiccups, things seemed fine.
Today, however, the monitoring engine entered the stopped state. No matter what I do or how hard I press the button, it won't start.
Every time I hit the start button, the number of services shown changes to some seemingly random number, usually about 640, even though we monitor about 4000 services.
Can you help us troubleshoot this issue?
Nagios XI Installation Profile
System:
Nagios XI Version : 5.2.5
nagiosxi.unigarant.nl 2.6.32-573.18.1.el6.x86_64 x86_64
CentOS release 6.7 (Final)
Gnome is not installed
Apache Information
PHP Version: 5.3.3
Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:44.0) Gecko/20100101 Firefox/44.0
Server Name: nagiosxi.unigarant.nl
Server Address: 10.2.251.10
Server Port: 80
Date/Time
PHP Timezone: Europe/Amsterdam
PHP Time: Tue, 08 Mar 2016 11:10:04 +0100
System Time: Tue, 08 Mar 2016 11:10:04 +0100
Nagios XI Data
License ends in: TSPSNN
nagios is not running
NPCD running (pid 2066).
ndo2db (pid 52033) is running...
CPU Load 15: 0.19
Total Hosts: 0
Total Services: 0
Function 'get_base_uri' returns: http://nagiosxi.unigarant.nl/nagiosxi/
Function 'get_base_url' returns: http://nagiosxi.unigarant.nl/nagiosxi/
Function 'get_backend_url(internal_call=false)' returns: http://nagiosxi.unigarant.nl/nagiosxi/includes/components/profile/profile.php
Function 'get_backend_url(internal_call=true)' returns: http://localhost/nagiosxi/backend/
Ping Test localhost
Running:
/bin/ping -c 3 localhost 2>&1
PING localhost.localdomain (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost.localdomain (127.0.0.1): icmp_seq=1 ttl=64 time=0.012 ms
64 bytes from localhost.localdomain (127.0.0.1): icmp_seq=2 ttl=64 time=0.018 ms
64 bytes from localhost.localdomain (127.0.0.1): icmp_seq=3 ttl=64 time=0.017 ms
--- localhost.localdomain ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.012/0.015/0.018/0.005 ms
Test wget To localhost
WGET From URL: http://localhost/nagiosxi/includes/components/ccm/
Running:
/usr/bin/wget http://localhost/nagiosxi/includes/components/ccm/
--2016-03-08 11:10:06-- http://localhost/nagiosxi/includes/components/ccm/
Resolving localhost... ::1, 127.0.0.1
Connecting to localhost|::1|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: "/usr/local/nagiosxi/tmp/ccm_index.tmp"
0K ......... 927K=0.01s
2016-03-08 11:10:06 (927 KB/s) - "/usr/local/nagiosxi/tmp/ccm_index.tmp" saved [9836]
Network Settings
1: lo: mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: mtu 1500 qdisc mq state UP qlen 1000
link/ether 00:50:56:94:d1:56 brd ff:ff:ff:ff:ff:ff
inet 10.2.251.10/23 brd 10.2.251.255 scope global eth0
inet6 fe80::250:56ff:fe94:d156/64 scope link
valid_lft forever preferred_lft forever
10.2.250.0/23 dev eth0 proto kernel scope link src 10.2.251.10
169.254.0.0/16 dev eth0 scope link metric 1002
default via 10.2.250.2 dev eth0