CentOS 6, 2011R1.7
In my performance graphs, I only see data from several weeks ago. Unlike some others, the graphs do show up. Unfortunately, they are blank.
In nagios/share/perfdata/ I see folders for all my hosts. Within those folders, I see the xml and rrd files.
In /var/lib/mrtg/ I only see mrtg.ok; the perms on that folder are 775 apache:nagios.
Here is some strange output that I see in npcd.log:
[10-19-2011 14:57:09] NPCD: File 'service-perfdata.1319028965-PID-26441' is an already in process PNP file. Leaving it untouched.
[10-19-2011 14:57:09] NPCD: DEBUG: load 1.170000/10.000000
[10-19-2011 14:57:09] NPCD: ThreadCounter 1/5 File is service-perfdata.1319028980-PID-26756
[10-19-2011 14:57:09] NPCD: File 'service-perfdata.1319028980-PID-26756' is an already in process PNP file. Leaving it untouched.
[10-19-2011 14:57:09] NPCD: DEBUG: load 1.170000/10.000000
[10-19-2011 14:57:09] NPCD: ThreadCounter 1/5 File is service-perfdata.1319028995-PID-27101
[10-19-2011 14:57:09] NPCD: File 'service-perfdata.1319028995-PID-27101' is an already in process PNP file. Leaving it untouched.
[10-19-2011 14:57:09] NPCD: DEBUG: load 1.170000/10.000000
[10-19-2011 14:57:09] NPCD: ThreadCounter 1/5 File is service-perfdata.1319029010-PID-27482
[10-19-2011 14:57:09] NPCD: File 'service-perfdata.1319029010-PID-27482' is an already in process PNP file. Leaving it untouched.
[10-19-2011 14:57:09] NPCD: DEBUG: load 1.170000/10.000000
[10-19-2011 14:57:09] NPCD: ThreadCounter 1/5 File is service-perfdata.1319029025
[10-19-2011 14:57:09] NPCD: Regular File: service-perfdata.1319029025
[10-19-2011 14:57:09] NPCD: A thread was started on thread_counter = 1
[10-19-2011 14:57:09] NPCD: Have to wait: Filecounter = 481 - thread_counter = 2
[10-19-2011 14:57:09] NPCD: Processing file service-perfdata.1319029025 with ID -1226388624 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//service-perfdata.1319029025
[10-19-2011 14:57:09] NPCD: Processing file 'service-perfdata.1319029025'
[10-19-2011 14:57:09] NPCD: ERROR: Executed command exits with return code '6'
[10-19-2011 14:57:09] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//host-perfdata.1319029025'
[10-19-2011 14:57:09] NPCD: ERROR: Executed command exits with return code '6'
[10-19-2011 14:57:09] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//service-perfdata.1319029025'
[10-19-2011 14:57:09] NPCD: No more files to process... waiting for 15 seconds
any ideas?
no performance data visible for last couple weeks
Re: no performance data visible for last couple weeks
This might be a permissions issue. Can you run the following procedure and see if it resolves the issue?
There will be some missing files and error output from this script, which is normal.
http://library.nagios.com/library/produ ... -nagios-xi
Re: no performance data visible for last couple weeks
I ran that script before and it broke everything. Took me three days to fix.
For instance, that script sets numerous directories to use the group "users", a group which has no members. Please keep in mind that I am running the VM that you guys provided, which makes it even stranger.
Please let me know why it sets the group to "users".
Re: no performance data visible for last couple weeks
fao,
We've had a bit of a problem with this, but I do believe I have a quick and comprehensive fix for you. First, let's make sure that the problem is what I think it is. In the following example, <rrd file> is some rrd file that you know to exist (preferably one from the /var/lib/mrtg/ directory). Type the following:
/usr/local/nagios/libexec/check_rrdtraf -f '/var/lib/mrtg/<rrd file>' -w 1 -c 2
That should return a completely clean value. If it says ANYTHING about errors, then that is definitely the issue. Thankfully, the fix is simple:
yum install bc
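Before installing anything, it may be worth confirming whether bc is really absent. The following is a minimal sketch for a POSIX shell; have_cmd is a made-up helper name, not part of Nagios XI.

```shell
# Hypothetical helper: succeed if the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd bc; then
    echo "bc is installed"
else
    echo "bc is missing; install it with: yum install bc"
fi
```

If bc turns out to be present already, the check_rrdtraf errors probably have a different cause.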
Nicholas Scott
Former Nagios employee
Re: no performance data visible for last couple weeks
hey mguthrie, you are correct that the package 'bc' was missing
I installed it 2 hours ago then restarted nagiosxi, nagios, and npcd
Unfortunately, I still see nothing in the /var/lib/mrtg folder
[root@nagios ~]# ls -al /var/lib/mrtg/
total 8
drwxrwxr-x. 2 apache nagios 4096 Aug 29 15:25 .
drwxr-xr-x. 24 root root 4096 Oct 19 10:45 ..
-rw-r--r--. 1 apache nagios 0 Oct 21 17:00 mrtg.ok
nothing there unfortunately
i look at npcd.log and I see a lot of this
[10-21-2011 17:12:54] NPCD: Processing file service-perfdata.1319209972 with ID -1250956432 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//service-perfdata.1319209972
[10-21-2011 17:12:54] NPCD: Processing file 'service-perfdata.1319209972'
[10-21-2011 17:12:54] NPCD: Processing file service-perfdata.1319209957 with ID -1240466576 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//service-perfdata.1319209957
[10-21-2011 17:12:54] NPCD: Processing file 'service-perfdata.1319209957'
[10-21-2011 17:12:54] NPCD: ERROR: Executed command exits with return code '6'
[10-21-2011 17:12:54] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//service-perfdata.1319209972'
[10-21-2011 17:12:54] NPCD: ERROR: Executed command exits with return code '6'
[10-21-2011 17:12:54] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//service-perfdata.1319209957'
[10-21-2011 17:12:54] NPCD: No more files to process... waiting for 15 seconds
and in perfdata.log
2011-10-13 15:01:46 [15247] [0] *** ERROR: /usr/local/nagios/var/stats is not writable or does not exist
But it is writable and does exist
[root@nagios2 var]# ls -al stats/
total 12
drwxrwxr-x. 2 apache nagios 4096 Oct 21 17:13 .
drwxrwxr-x. 6 apache nagios 4096 Oct 21 17:14 ..
-rw-rw-rw- 1 nagios nagios 32 Oct 21 17:13 21986833
Any ideas?
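One thing that might be worth ruling out is whether the directory is writable for the account that actually runs process_perfdata.pl, rather than for root. The sketch below assumes that account is nagios, which this thread does not confirm, and check_writable_dir is a made-up helper name.

```shell
# Hypothetical helper: succeed only if the path is a directory that the
# current user is allowed to write to.
check_writable_dir() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Path taken from the perfdata.log error quoted above.
STATS_DIR=/usr/local/nagios/var/stats

if check_writable_dir "$STATS_DIR"; then
    echo "$STATS_DIR is writable for $(id -un)"
else
    echo "$STATS_DIR is missing or not writable for $(id -un)"
fi
```

Run as root, this will nearly always report the directory as writable, so the result only means something when the check is executed as the service account (for example through su or sudo).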
Re: no performance data visible for last couple weeks
Your /var/lib/mrtg/ should have both its owner and its group set to root:
chown root.root /var/lib/mrtg -R
The ownership on the stats directory should be nagios.nagios.
What are the permissions on your nagios/share/spool/perfdata?
Nicholas Scott
Former Nagios employee
Re: no performance data visible for last couple weeks
tks
I do not have a share/spool/perfdata directory. Should I?
The ownership on nagios/var/spool/perfdata is apache.nagios.
Within perfdata/, some files are owned by nagios.users and others by nagios.nagios.
I still don't understand why the "users" group keeps popping up when no one belongs to that group.
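To see exactly which spool files carry the "users" group, something like the following could help. It is only a sketch: list_by_group is a made-up helper name, and the default path is the spool directory that appears in the npcd.log lines earlier in the thread.

```shell
# Hypothetical helper: print every entry under a directory whose group
# matches the given name. Both defaults are assumptions for illustration.
list_by_group() {
    dir=${1:-/usr/local/nagios/var/spool/perfdata}
    grp=${2:-users}
    find "$dir" -group "$grp" -print
}

# Only run against the spool directory if it exists on this machine.
if [ -d /usr/local/nagios/var/spool/perfdata ]; then
    list_by_group /usr/local/nagios/var/spool/perfdata users
fi
```

The creation times of the files it lists may hint at which process or script is creating them with that group.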
Re: no performance data visible for last couple weeks
here is most recent output of npcd.log
[10-24-2011 10:48:32] NPCD: DEBUG: load 0.560000/10.000000
[10-24-2011 10:48:32] NPCD: ThreadCounter 1/5 File is service-perfdata.1319446041-PID-27709
[10-24-2011 10:48:32] NPCD: File 'service-perfdata.1319446041-PID-27709' is an already in process PNP file. Leaving it untouched.
[10-24-2011 10:48:32] NPCD: DEBUG: load 0.560000/10.000000
[10-24-2011 10:48:32] NPCD: ThreadCounter 1/5 File is service-perfdata.1319446056-PID-27708
[10-24-2011 10:48:32] NPCD: File 'service-perfdata.1319446056-PID-27708' is an already in process PNP file. Leaving it untouched.
[10-24-2011 10:48:32] NPCD: DEBUG: load 0.560000/10.000000
[10-24-2011 10:48:32] NPCD: ThreadCounter 1/5 File is service-perfdata.1319446071-PID-28083
[10-24-2011 10:48:32] NPCD: File 'service-perfdata.1319446071-PID-28083' is an already in process PNP file. Leaving it untouched.
[10-24-2011 10:48:32] NPCD: DEBUG: load 0.560000/10.000000
[10-24-2011 10:48:32] NPCD: ThreadCounter 1/5 File is service-perfdata.1319446086-PID-28380
[10-24-2011 10:48:32] NPCD: File 'service-perfdata.1319446086-PID-28380' is an already in process PNP file. Leaving it untouched.
[10-24-2011 10:48:32] NPCD: DEBUG: load 0.560000/10.000000
[10-24-2011 10:48:32] NPCD: ThreadCounter 1/5 File is service-perfdata.1319446101
[10-24-2011 10:48:32] NPCD: Regular File: service-perfdata.1319446101
[10-24-2011 10:48:32] NPCD: A thread was started on thread_counter = 1
[10-24-2011 10:48:32] NPCD: Have to wait: Filecounter = 49387 - thread_counter = 2
[10-24-2011 10:48:32] NPCD: Processing file service-perfdata.1319446101 with ID -1227990160 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//service-perfdata.1319446101
[10-24-2011 10:48:32] NPCD: Processing file 'service-perfdata.1319446101'
[10-24-2011 10:48:32] NPCD: ERROR: Executed command exits with return code '6'
[10-24-2011 10:48:32] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//service-perfdata.1319446101'
[10-24-2011 10:48:32] NPCD: No more files to process... waiting for 15 seconds
my perfdata.log is full of timeout errors
2011-09-30 17:11:15 [26026] [0] *** TIMEOUT: Timeout after 5 secs. ***
2011-09-30 17:11:15 [26026] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2011-09-30 17:11:15 [26026] [0] *** TIMEOUT: Please check your npcd.cfg
2011-09-30 17:11:15 [26026] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//service-perfdata.1317395460-PID-26026 deleted
2011-09-30 17:11:15 [26026] [0] *** Timeout while processing Host: "AFIT-INFRA05" Service: "CPU_Usage"
2011-09-30 17:11:15 [26026] [0] *** process_perfdata.pl terminated on signal ALRM
2011-10-05 05:00:43 [25184] [0] *** TIMEOUT: Timeout after 5 secs. ***
2011-10-05 05:00:43 [25184] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2011-10-05 05:00:43 [25184] [0] *** TIMEOUT: Please check your npcd.cfg
2011-10-05 05:00:43 [25184] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//service-perfdata.1317783623-PID-25184 deleted
2011-10-05 05:00:43 [25184] [0] *** Timeout while processing Host: "HQLQAPIRES1" Service: "__Disk_Usage"
2011-10-05 05:00:43 [25184] [0] *** process_perfdata.pl terminated on signal ALRM
2011-10-05 17:37:18 [3285] [0] *** TIMEOUT: Timeout after 5 secs. ***
2011-10-05 17:37:18 [3285] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2011-10-05 17:37:18 [3285] [0] *** TIMEOUT: Please check your npcd.cfg
2011-10-05 17:37:18 [3285] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//host-perfdata.1317829028-PID-3285 deleted
2011-10-05 17:37:18 [3285] [0] *** Timeout while processing Host: "LPRAPP04" Service: "_HOST_"
2011-10-05 17:37:18 [3285] [0] *** process_perfdata.pl terminated on signal ALRM
2011-10-05 17:37:18 [3287] [0] *** TIMEOUT: Timeout after 5 secs. ***
2011-10-05 17:37:18 [3287] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2011-10-05 17:37:18 [3287] [0] *** TIMEOUT: Please check your npcd.cfg
2011-10-05 17:37:18 [3287] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//host-perfdata.1317829002-PID-3287 deleted
2011-10-05 17:37:18 [3287] [0] *** Timeout while processing Host: "HQLPRTOMC02" Service: "_HOST_"
2011-10-05 17:37:18 [3287] [0] *** process_perfdata.pl terminated on signal ALRM
2011-10-05 22:31:01 [19906] [0] *** TIMEOUT: Timeout after 5 secs. ***
2011-10-05 22:31:01 [19906] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2011-10-05 22:31:01 [19906] [0] *** TIMEOUT: Please check your npcd.cfg
2011-10-05 22:31:03 [19906] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//service-perfdata.1317846647-PID-19906 deleted
2011-10-05 22:31:03 [19906] [0] *** Timeout while processing Host: "HQLPQTOMC02" Service: "CPU_Stats"
2011-10-05 22:31:03 [19906] [0] *** process_perfdata.pl terminated on signal ALRM
2011-10-13 15:01:46 [15247] [0] *** ERROR: /usr/local/nagios/var/stats is not writable or does not exist
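The "Filecounter = 49387" line in npcd.log suggests that a large backlog has built up in the spool directory. A rough way to size it is sketched below; count_spool is a made-up helper name, and treating every file with "-PID-" in its name as "in process" is an assumption based only on the log lines above.

```shell
# Hypothetical helper: count the regular files in a spool directory and how
# many of them carry the "-PID-" marker that npcd reports as already in process.
count_spool() {
    dir=${1:-/usr/local/nagios/var/spool/perfdata}
    total=$(( $(find "$dir" -maxdepth 1 -type f | wc -l) ))
    in_process=$(( $(find "$dir" -maxdepth 1 -type f -name '*-PID-*' | wc -l) ))
    echo "total=$total in_process=$in_process"
}

# Only run against the spool directory if it exists on this machine.
if [ -d /usr/local/nagios/var/spool/perfdata ]; then
    count_spool /usr/local/nagios/var/spool/perfdata
fi
```

Running it a few minutes apart should show whether the backlog is growing or draining.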
Re: no performance data visible for last couple weeks
alright, now I am getting some basic graphs
chmod -R g+x nagios/share/perfdata
seems to have done the trick
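For anyone hitting the same symptom, it may help to list the directories that lack group execute (search) permission before changing anything. This is a sketch only: list_dirs_missing_gx is a made-up helper name, and the absolute default path assumes the /usr/local/nagios prefix seen in the logs above.

```shell
# Hypothetical helper: print directories under the given path that do not
# have the group execute bit (octal 010) set.
list_dirs_missing_gx() {
    dir=${1:-/usr/local/nagios/share/perfdata}
    find "$dir" -type d ! -perm -010 -print
}

# Only run against the perfdata directory if it exists on this machine.
if [ -d /usr/local/nagios/share/perfdata ]; then
    list_dirs_missing_gx /usr/local/nagios/share/perfdata
fi
```

Without that bit on a directory, members of the group cannot enter it, which would be consistent with the graphs returning once it was added.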
Re: no performance data visible for last couple weeks
Ok, thanks for the update!