Performance Graphs stopped working
Posted: Mon Oct 27, 2014 3:28 pm
Hi,
We are having trouble with our performance graphing. The graphs stopped receiving new data about 6 days ago. They still render and show historical data, but the data ends at around that point.
I did notice this in the process list on the machine:
nagios 7365 84.5 0.0 122624 2232 ? R Oct20 8724:20 /usr/bin/perl /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1413819918.perfdata.service
Is that hung? Should it be killed perhaps?
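One way to sanity-check whether that process is stuck: in `ps aux` output the TIME column is cumulative CPU time in minutes:seconds, and the number in the spool file name looks like a Unix timestamp. The Python sketch below assumes both of those hold. `cpu_seconds` and `spool_file_time` are hypothetical helper names, not part of Nagios or PNP.

```python
import re
from datetime import datetime, timezone

# The ps line from the post, copied verbatim.
PS_LINE = (
    "nagios 7365 84.5 0.0 122624 2232 ? R Oct20 8724:20 "
    "/usr/bin/perl /usr/local/nagios/libexec/process_perfdata.pl -n -b "
    "/usr/local/nagios/var/spool/perfdata//1413819918.perfdata.service"
)


def cpu_seconds(bsd_time: str) -> int:
    """Convert a ps TIME value in MMM:SS form to seconds."""
    minutes, seconds = bsd_time.split(":")
    return int(minutes) * 60 + int(seconds)


def spool_file_time(path: str) -> datetime:
    """Read the leading number of a spool file name as a Unix timestamp (assumption)."""
    match = re.search(r"/(\d+)\.perfdata\.(?:host|service)", path)
    if match is None:
        raise ValueError(f"no timestamp found in {path!r}")
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)


fields = PS_LINE.split()
cpu_days = cpu_seconds(fields[9]) / 86400  # field 9 is the TIME column
created = spool_file_time(fields[-1])      # last field is the spool file path

print(f"CPU time used: {cpu_days:.1f} days")
print(f"Spool file stamped: {created:%Y-%m-%d %H:%M:%S} UTC")
```

If those assumptions hold, the process has used roughly six days of CPU on a spool file stamped 2014-10-20 15:45:18 UTC, which is 10:45 local time (America/Chicago) and lines up with when the graphs stopped updating.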
Here are some more details:
We are running with:
Nagios XI 2014r1.0
CentOS 6.5 64-bit
Manual XI install
Running SSL
ll /usr/local/nagios/share/perfdata/
total 308
drwxrwxr-x 2 nagios nagios 4096 May 27 08:37 10.47.157.6
drwxrwxr-x 2 nagios nagios 4096 May 27 10:02 10.47.157.7
drwxrwxr-x 2 nagios nagios 4096 Jun 19 12:05 10.87.154.69
drwxrwxr-x 2 nagios nagios 4096 Oct 20 10:44 172-31-1-0--10-164-128-0--tunnel
drwxrwxr-x 2 nagios nagios 4096 Jul 15 10:20 172.31.1.14
drwxrwxr-x 2 nagios nagios 4096 May 27 09:50 Awards-Catalog
drwxrwxr-x 2 nagios nagios 4096 Jun 2 14:12 Awards-Catalog-1
drwxrwxr-x 2 nagios nagios 4096 Jun 2 14:14 Awards-Catalog-2
drwxrwxr-x 2 nagios nagios 4096 Jul 7 12:01 kgpprodweb32
drwxrwxr-x 2 nagios nagios 4096 Jul 29 09:22 kgpprodweb47
drwxrwxr-x 2 nagios nagios 4096 Aug 4 13:38 kgprodweb30
drwxrwxr-x 2 nagios nagios 4096 May 29 09:30 kgprodweb31
drwxrwxr-x 2 nagios nagios 4096 Oct 20 10:45 kgprodweb46
drwxrwxr-x 2 nagios nagios 4096 Oct 20 10:45 kgprodweb47
drwxrwxr-x 2 nagios nagios 4096 Oct 20 10:45 kgprodweb48
drwxrwxr-x 2 nagios nagios 4096 Oct 20 10:45 kgprodwww01
drwxrwxr-x 2 nagios nagios 4096 Oct 20 10:45 kgprodwww02
drwxrwxr-x 2 nagios nagios 4096 Sep 3 11:29 VPN_Tunnel
tail -25 /usr/local/nagios/var/perfdata.log
[root@usawepvl011 xidpe]# tail -25 /usr/local/nagios/var/perfdata.log
2014-04-29 15:20:14 [21316] [0] *** TIMEOUT: Timeout after 5 secs. ***
2014-04-29 15:20:14 [21316] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-04-29 15:20:14 [21316] [0] *** TIMEOUT: Please check your npcd.cfg
2014-04-29 15:20:14 [21316] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1398802784.perfdata.host-PID-21316 deleted
2014-04-29 15:20:14 [21316] [0] *** Timeout while processing Host: "lglproddb01" Service: "_HOST_"
2014-04-29 15:20:14 [21316] [0] *** process_perfdata.pl terminated on signal ALRM
2014-04-29 15:20:14 [21315] [0] *** TIMEOUT: Timeout after 5 secs. ***
2014-04-29 15:20:14 [21315] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-04-29 15:20:14 [21315] [0] *** TIMEOUT: Please check your npcd.cfg
2014-04-29 15:20:14 [21315] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1398802784.perfdata.service-PID-21315 deleted
2014-04-29 15:20:14 [21315] [0] *** Timeout while processing Host: "lglprodweb01" Service: "_dev_xvde_Disk_Usage"
2014-04-29 15:20:14 [21315] [0] *** process_perfdata.pl terminated on signal ALRM
2014-04-29 15:23:37 [21957] [0] *** TIMEOUT: Timeout after 5 secs. ***
2014-04-29 15:23:37 [21957] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-04-29 15:23:37 [21957] [0] *** TIMEOUT: Please check your npcd.cfg
2014-04-29 15:23:37 [21957] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1398802934.perfdata.host-PID-21957 deleted
2014-04-29 15:23:37 [21957] [0] *** Timeout while processing Host: "" Service: ""
2014-04-29 15:23:37 [21957] [0] *** process_perfdata.pl terminated on signal ALRM
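The timeout entries above can be tallied per host and service to see which checks are hitting the limit. This is a minimal Python sketch, assuming the log keeps the line format shown above. `count_timeouts` is a hypothetical helper, not something shipped with PNP4Nagios.

```python
import re
from collections import Counter

# Matches the "Timeout while processing" lines from perfdata.log.
TIMEOUT = re.compile(
    r'Timeout while processing Host: "(?P<host>[^"]*)" '
    r'Service: "(?P<service>[^"]*)"'
)


def count_timeouts(lines):
    """Count timeout entries per (host, service) pair."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT.search(line)
        if match:
            counts[(match["host"], match["service"])] += 1
    return counts


# Three lines copied from the log excerpt above.
sample = [
    '2014-04-29 15:20:14 [21316] [0] *** Timeout while processing Host: "lglproddb01" Service: "_HOST_"',
    '2014-04-29 15:20:14 [21315] [0] *** Timeout while processing Host: "lglprodweb01" Service: "_dev_xvde_Disk_Usage"',
    '2014-04-29 15:23:37 [21957] [0] *** Timeout while processing Host: "" Service: ""',
]

timeouts = count_timeouts(sample)
for (host, service), hits in timeouts.items():
    print(f"{host or '(empty)'} / {service or '(empty)'}: {hits}")
```

To run it against the real file, pass an open file handle for /usr/local/nagios/var/perfdata.log instead of `sample`.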
tail -25 /usr/local/nagios/var/npcd.log
[09-03-2014 14:32:12] NPCD: WARN: MAX load reached: load 12.830000/10.000000 at i=1
[09-03-2014 14:32:27] NPCD: WARN: MAX load reached: load 11.890000/10.000000 at i=1
[09-03-2014 14:33:13] NPCD: WARN: MAX load reached: load 10.850000/10.000000 at i=37
[09-09-2014 22:53:12] NPCD: WARN: MAX load reached: load 10.590000/10.000000 at i=0
[09-09-2014 22:53:27] NPCD: WARN: MAX load reached: load 13.550000/10.000000 at i=1
[09-09-2014 22:53:42] NPCD: WARN: MAX load reached: load 14.850000/10.000000 at i=1
[09-09-2014 22:53:57] NPCD: WARN: MAX load reached: load 13.720000/10.000000 at i=1
[09-09-2014 22:54:12] NPCD: WARN: MAX load reached: load 15.220000/10.000000 at i=1
[09-09-2014 22:54:27] NPCD: WARN: MAX load reached: load 16.860000/10.000000 at i=1
[09-09-2014 22:54:42] NPCD: WARN: MAX load reached: load 16.480000/10.000000 at i=1
[09-09-2014 22:54:57] NPCD: WARN: MAX load reached: load 14.660000/10.000000 at i=1
[09-09-2014 22:55:12] NPCD: WARN: MAX load reached: load 16.660000/10.000000 at i=1
[09-09-2014 22:55:27] NPCD: WARN: MAX load reached: load 17.560000/10.000000 at i=1
[09-09-2014 22:55:42] NPCD: WARN: MAX load reached: load 17.410000/10.000000 at i=1
[09-09-2014 22:55:57] NPCD: WARN: MAX load reached: load 15.900000/10.000000 at i=1
[09-09-2014 22:56:12] NPCD: WARN: MAX load reached: load 17.900000/10.000000 at i=1
[09-09-2014 22:56:27] NPCD: WARN: MAX load reached: load 16.240000/10.000000 at i=1
[09-09-2014 22:56:42] NPCD: WARN: MAX load reached: load 16.470000/10.000000 at i=1
[09-09-2014 22:56:57] NPCD: WARN: MAX load reached: load 13.960000/10.000000 at i=1
[09-09-2014 22:57:12] NPCD: WARN: MAX load reached: load 12.690000/10.000000 at i=1
[09-09-2014 22:57:27] NPCD: WARN: MAX load reached: load 11.100000/10.000000 at i=1
[09-27-2014 01:53:30] NPCD: Caught Termination Signal - Hasta la vista... baby
[09-27-2014 02:09:42] NPCD: npcd Daemon (0.4.14) started with PID=1279
[09-27-2014 02:09:42] NPCD: Please have a look at 'npcd -V' to get license information
[09-27-2014 02:09:42] NPCD: HINT: load_threshold is enabled - ('10.000000')
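Each "MAX load reached" warning above records the measured load and the configured load_threshold. The Python sketch below pulls those two numbers out so the peak is easy to spot. It assumes the line format shown above, and `parse_load_warnings` is a hypothetical helper name.

```python
import re

# Matches npcd.log lines of the form:
# [MM-DD-YYYY HH:MM:SS] NPCD: WARN: MAX load reached: load <load>/<threshold> at i=N
LOAD_WARNING = re.compile(
    r"^\[(?P<stamp>[^\]]+)\] NPCD: WARN: MAX load reached: "
    r"load (?P<load>[\d.]+)/(?P<limit>[\d.]+)"
)


def parse_load_warnings(lines):
    """Yield (timestamp string, load, threshold) for each MAX load warning."""
    for line in lines:
        match = LOAD_WARNING.match(line)
        if match:
            yield match["stamp"], float(match["load"]), float(match["limit"])


# Three lines copied from the log excerpt above; the last is not a warning.
sample = [
    "[09-09-2014 22:55:27] NPCD: WARN: MAX load reached: load 17.560000/10.000000 at i=1",
    "[09-09-2014 22:56:12] NPCD: WARN: MAX load reached: load 17.900000/10.000000 at i=1",
    "[09-27-2014 02:09:42] NPCD: npcd Daemon (0.4.14) started with PID=1279",
]

warnings = list(parse_load_warnings(sample))
peak = max(warnings, key=lambda entry: entry[1])
print(f"{len(warnings)} warnings, peak load {peak[1]} at {peak[0]} (threshold {peak[2]})")
```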
System:
Nagios XI Version : 2014R1.0
usawepvl011.ficticious.com 2.6.32-431.11.2.el6.x86_64 x86_64
CentOS release 6.5 (Final)
Gnome is not installed
Apache Information
PHP Version: 5.3.3
Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:32.0) Gecko/20100101 Firefox/32.0
Server Name: nagios.ficticious.com
Server Address: 10.164.130.252
Server Port: 443
Date/Time
PHP Timezone: America/Chicago
PHP Time: Mon, 27 Oct 2014 15:14:37 -0500
System Time: Mon, 27 Oct 2014 15:14:37 -0500
Nagios XI Data
License ends in: MTSVNN
nagios (pid 23514) is running...
NPCD running (pid 1279).
ndo2db (pid 1337) is running...
CPU Load 15: 7.39
Total Hosts: 36
Total Services: 473
Function 'get_base_uri' returns: https://nagios.ficticious.com/nagiosxi/
Function 'get_base_url' returns: https://nagios.ficticious.com/nagiosxi/
Function 'get_backend_url(internal_call=false)' returns: https://nagios.ficticious.com/nagiosxi/includes/components/profile/profile.php
Function 'get_backend_url(internal_call=true)' returns: http://localhost/nagiosxi/backend/
Ping Test localhost
Running:
/bin/ping -c 3 localhost 2>&1
PING localhost.localdomain (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost.localdomain (127.0.0.1): icmp_seq=1 ttl=64 time=0.014 ms
64 bytes from localhost.localdomain (127.0.0.1): icmp_seq=2 ttl=64 time=0.018 ms
64 bytes from localhost.localdomain (127.0.0.1): icmp_seq=3 ttl=64 time=0.018 ms
--- localhost.localdomain ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2006ms
rtt min/avg/max/mdev = 0.014/0.016/0.018/0.005 ms
Test wget To locahost
WGET From URL: http://localhost/nagiosql/index.php
Running:
/usr/bin/wget http://localhost/nagiosql/index.php
--2014-10-27 15:14:40-- http://localhost/nagiosql/index.php
Resolving localhost... 127.0.0.1
Connecting to localhost|127.0.0.1|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5259 (5.1K) [text/html]
Saving to: "/usr/local/nagiosxi/tmp/nagiosql_index.tmp"
0K ..... 100% 345M=0s
2014-10-27 15:14:40 (345 MB/s) - "/usr/local/nagiosxi/tmp/nagiosql_index.tmp" saved [5259/5259]