No performance graph data since August
RHEL 6.4
Manually installed Nagios XI; we upgraded to 2012R2.3 today, and that didn't fix the issue.
We restarted NPCD today, changed the timeout and load threshold (to 20 and 30.0), and restarted again, but the graphs are still not being populated. It looks like they stopped updating in August. We hoped today's upgrade would help, but it did not. We've reviewed the Wiki and can't figure out what to do next.
Re: No performance graph data since August
Run the following commands, and show the output:
Code: Select all
ls /usr/local/nagios/var/spool/checkresults | wc -l
ls /usr/local/nagios/var/spool/xidpe | wc -l
ls /usr/local/nagios/var/spool/perfdata | wc -l
grep "nagiosramdisk" /usr/local/nagios/etc/nagios.cfg
top
Re: No performance graph data since August
lmiltchev wrote: Run the following commands, and show the output:
Code: Select all
ls /usr/local/nagios/var/spool/checkresults | wc -l
92
lmiltchev wrote:
Code: Select all
ls /usr/local/nagios/var/spool/xidpe | wc -l
2
lmiltchev wrote:
Code: Select all
ls /usr/local/nagios/var/spool/perfdata | wc -l
380122
lmiltchev wrote:
Code: Select all
grep "nagiosramdisk" /usr/local/nagios/etc/nagios.cfg
empty
Code: Select all
top - 14:20:21 up 39 min, 2 users, load average: 1.02, 1.41, 1.50
Tasks: 207 total, 1 running, 206 sleeping, 0 stopped, 0 zombie
Cpu(s): 5.8%us, 2.2%sy, 0.0%ni, 88.3%id, 3.7%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3922928k total, 973940k used, 2948988k free, 50784k buffers
Swap: 1015800k total, 0k used, 1015800k free, 251892k cached
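A side note on the counts above: when a spool directory holds hundreds of thousands of files, `ls` sorts every name before `wc` sees it, which can be slow. A minimal sketch of an unsorted count, assuming GNU `find` is available; `count_spool` is a hypothetical helper name, not something shipped with Nagios XI:

```shell
# Count regular files directly inside a spool directory without sorting them.
# count_spool is a hypothetical helper, not part of Nagios XI or PNP.
count_spool() {
    dir=$1
    # -maxdepth 1 limits the count to the directory itself (GNU find option).
    # tr strips the padding some wc implementations add around the number.
    find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' '
}
```

It would be called as `count_spool /usr/local/nagios/var/spool/perfdata`.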
Re: No performance graph data since August
Is npcd running?
Code: Select all
service npcd status
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Re: No performance graph data since August
Code: Select all
[root@nagiosxi ~]# service npcd status
NPCD running (pid 28086).
Re: No performance graph data since August
Let's check the perfdata and npcd logs:
Code: Select all
tail -25 /usr/local/nagios/var/perfdata.log
tail -25 /usr/local/nagios/var/npcd.log
Re: No performance graph data since August
perfdata.log (I've replaced hostnames with "host")
It looks like nothing has been written to that log since we restarted NPCD this afternoon.
Code: Select all
[root@nagiosxi ~]# tail -25 /usr/local/nagios/var/perfdata.log
2013-09-11 12:32:37 [1287] [0] *** process_perfdata.pl terminated on signal ALRM
2013-09-11 12:32:45 [1424] [0] *** TIMEOUT: Timeout after 5 secs. ***
2013-09-11 12:32:45 [1424] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2013-09-11 12:32:45 [1424] [0] *** TIMEOUT: Please check your npcd.cfg
2013-09-11 12:32:45 [1424] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata //1375985007.perfdata.service-PID-1424 deleted
2013-09-11 12:32:45 [1424] [0] *** Timeout while processing Host: "host" Service: "__Disk_Usage"
2013-09-11 12:32:45 [1424] [0] *** process_perfdata.pl terminated on signal ALRM
2013-09-11 12:32:54 [1587] [0] *** TIMEOUT: Timeout after 5 secs. ***
2013-09-11 12:32:54 [1587] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2013-09-11 12:32:54 [1587] [0] *** TIMEOUT: Please check your npcd.cfg
2013-09-11 12:32:54 [1587] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata //1375985067.perfdata.service-PID-1587 deleted
2013-09-11 12:32:54 [1587] [0] *** Timeout while processing Host: "host" Service: "Users"
2013-09-11 12:32:54 [1587] [0] *** process_perfdata.pl terminated on signal ALRM
2013-09-11 12:33:04 [1782] [0] *** TIMEOUT: Timeout after 5 secs. ***
2013-09-11 12:33:04 [1782] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2013-09-11 12:33:04 [1782] [0] *** TIMEOUT: Please check your npcd.cfg
2013-09-11 12:33:04 [1782] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata //1375985247.perfdata.service-PID-1782 deleted
2013-09-11 12:33:04 [1782] [0] *** Timeout while processing Host: "host" Service: "Users"
2013-09-11 12:33:04 [1782] [0] *** process_perfdata.pl terminated on signal ALRM
2013-09-11 12:33:12 [2045] [0] *** TIMEOUT: Timeout after 5 secs. ***
2013-09-11 12:33:12 [2045] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2013-09-11 12:33:12 [2045] [0] *** TIMEOUT: Please check your npcd.cfg
2013-09-11 12:33:12 [2045] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata //1375985547.perfdata.service-PID-2045 deleted
2013-09-11 12:33:12 [2045] [0] *** Timeout while processing Host: "host" Service: "CPU_Stats"
2013-09-11 12:33:12 [2045] [0] *** process_perfdata.pl terminated on signal ALRM
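To see whether the timeouts cluster on particular services, the "Timeout while processing" lines can be tallied. This is a rough sketch that assumes the log keeps the line format shown above; `timeouts_by_service` is a hypothetical helper name:

```shell
# Tally "Timeout while processing" entries in a PNP perfdata.log per service.
# timeouts_by_service is a hypothetical helper, not part of PNP or Nagios XI.
timeouts_by_service() {
    log=$1
    grep 'Timeout while processing' "$log" \
        | sed 's/.*Service: "\([^"]*\)".*/\1/' \
        | sort | uniq -c | sort -rn   # most frequent service first
}
```

It would be called as `timeouts_by_service /usr/local/nagios/var/perfdata.log`. If the counts are spread evenly across services, the timeouts are more likely caused by the size of the backlog than by any one check.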
npcd.log
Code: Select all
[root@nagiosxi ~]# tail -25 /usr/local/nagios/var/npcd.log
[09-11-2013 14:56:08] NPCD: Regular File: 1376107182.perfdata.service
[09-11-2013 14:56:08] NPCD: A thread was started on thread_counter = 2
[09-11-2013 14:56:08] NPCD: Processing file 1376107182.perfdata.service with ID 139949410256640 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1376107182.perfdata.service
[09-11-2013 14:56:08] NPCD: Processing file '1376107182.perfdata.service'
[09-11-2013 14:56:08] NPCD: DEBUG: load 0.700000/30.000000
[09-11-2013 14:56:08] NPCD: ThreadCounter 3/5 File is 1376107197.perfdata.host
[09-11-2013 14:56:08] NPCD: Regular File: 1376107197.perfdata.host
[09-11-2013 14:56:08] NPCD: A thread was started on thread_counter = 3
[09-11-2013 14:56:08] NPCD: DEBUG: load 0.700000/30.000000
[09-11-2013 14:56:08] NPCD: ThreadCounter 4/5 File is 1376107197.perfdata.service
[09-11-2013 14:56:08] NPCD: Regular File: 1376107197.perfdata.service
[09-11-2013 14:56:08] NPCD: A thread was started on thread_counter = 4
[09-11-2013 14:56:08] NPCD: DEBUG: load 0.700000/30.000000
[09-11-2013 14:56:08] NPCD: ThreadCounter 5/5 File is 1376107212.perfdata.host
[09-11-2013 14:56:08] NPCD: Regular File: 1376107212.perfdata.host
[09-11-2013 14:56:08] NPCD: WARN: MAX Thread reached: 1376107212.perfdata.host comes later with ThreadCounter: 5
[09-11-2013 14:56:08] NPCD: DEBUG: Will wait for th['4']
[09-11-2013 14:56:08] NPCD: Processing file 1376107197.perfdata.host with ID 139949399766784 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1376107197.perfdata.host
[09-11-2013 14:56:08] NPCD: Processing file '1376107197.perfdata.host'
[09-11-2013 14:56:08] NPCD: Processing file 1376107197.perfdata.service with ID 139949389276928 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1376107197.perfdata.service
[09-11-2013 14:56:08] NPCD: Processing file '1376107197.perfdata.service'
[09-11-2013 14:56:11] NPCD: DEBUG: Will wait for th['3']
[09-11-2013 14:56:11] NPCD: DEBUG: Will wait for th['2']
[09-11-2013 14:56:11] NPCD: DEBUG: Will wait for th['1']
[09-11-2013 14:56:11] NPCD: DEBUG: Will wait for th['0']
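The spool file names appear to start with a Unix epoch timestamp (an assumption based on the names in this log, not something confirmed in the thread). If that holds, decoding the prefix shows how far behind NPCD is: 1376107182 corresponds to 2013-08-10 03:59:42 UTC, about a month before the log entries above. A small sketch, assuming GNU `date`; `spool_file_time` is a hypothetical helper name:

```shell
# Print the UTC time encoded in a spool file name such as
# 1376107182.perfdata.service. spool_file_time is a hypothetical helper.
spool_file_time() {
    name=$(basename "$1")      # drop any leading directory components
    epoch=${name%%.*}          # keep only the text before the first dot
    # "-d @N" (seconds since the epoch) is a GNU date extension.
    date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S'
}
```

It would be called as `spool_file_time 1376107182.perfdata.service`, or with a full path.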
Re: No performance graph data since August
What is the log level in the npcd.cfg and process_perfdata.cfg files?
Code: Select all
grep -i "log_level =" /usr/local/nagios/etc/pnp/process_perfdata.cfg
grep -i "log_level =" /usr/local/nagios/etc/pnp/npcd.cfg
If it is "0", modify both files by changing the value to "1", and restart npcd:
Code: Select all
service npcd restart
Run the following commands, and show the output:
Code: Select all
grep -i "file_processing_interval" /usr/local/nagios/etc/nagios.cfg
ll /usr/local/nagios/libexec/process_perfdata.pl
Open "/usr/local/nagios/etc/pnp/process_perfdata.cfg" in a text editor, and set:
Code: Select all
TIMEOUT = 15
Save, exit, and restart npcd:
Code: Select all
service npcd restart
Tail both logs, and show the output:
Code: Select all
tail -n 30 /usr/local/nagios/var/npcd.log
tail -n 30 /usr/local/nagios/var/perfdata.log
Re: No performance graph data since August
I changed the log level in process_perfdata.cfg, since it was at 0. I left npcd.cfg at -1 because I don't know whether that works just as well as 1. Let me know if I should still change it.
Code: Select all
[root@nagiosxi ~]# grep -i "log_level =" /usr/local/nagios/etc/pnp/process_perfdata.cfg
LOG_LEVEL = 0
[root@nagiosxi ~]# grep -i "log_level =" /usr/local/nagios/etc/pnp/npcd.cfg
# log_level = <integer value>
log_level = -1
[root@nagiosxi ~]# grep -i "file_processing_interval" /usr/local/nagios/etc/nagios.cfg
service_perfdata_file_processing_interval=15
host_perfdata_file_processing_interval=15
[root@nagiosxi ~]# ll /usr/local/nagios/libexec/process_perfdata.pl
-rwxr-xr-x. 1 nagios nagios 42724 Dec 5 2012 /usr/local/nagios/libexec/process_perfdata.pl
TIMEOUT = 20
[root@nagiosxi ~]# tail -30 /usr/local/nagios/var/npcd.log
[09-12-2013 08:51:59] NPCD: ThreadCounter 1/5 File is 1377138613.perfdata.host
[09-12-2013 08:51:59] NPCD: Regular File: 1377138613.perfdata.host
[09-12-2013 08:51:59] NPCD: Processing file 1377138598.perfdata.service with ID 140527962314496 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1377138598.perfdata.service
[09-12-2013 08:51:59] NPCD: A thread was started on thread_counter = 1
[09-12-2013 08:51:59] NPCD: Processing file '1377138598.perfdata.service'
[09-12-2013 08:51:59] NPCD: DEBUG: load 0.140000/30.000000
[09-12-2013 08:51:59] NPCD: ThreadCounter 2/5 File is 1377138613.perfdata.service
[09-12-2013 08:51:59] NPCD: Regular File: 1377138613.perfdata.service
[09-12-2013 08:51:59] NPCD: A thread was started on thread_counter = 2
[09-12-2013 08:51:59] NPCD: DEBUG: load 0.140000/30.000000
[09-12-2013 08:51:59] NPCD: ThreadCounter 3/5 File is 1377138628.perfdata.host
[09-12-2013 08:51:59] NPCD: Regular File: 1377138628.perfdata.host
[09-12-2013 08:51:59] NPCD: Processing file 1377138613.perfdata.host with ID 140527951824640 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1377138613.perfdata.host
[09-12-2013 08:51:59] NPCD: Processing file 1377138613.perfdata.service with ID 140527941334784 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1377138613.perfdata.service
[09-12-2013 08:51:59] NPCD: Processing file '1377138613.perfdata.host'
[09-12-2013 08:51:59] NPCD: Processing file '1377138613.perfdata.service'
[09-12-2013 08:51:59] NPCD: Processing file 1377138628.perfdata.host with ID 140527930844928 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1377138628.perfdata.host
[09-12-2013 08:51:59] NPCD: A thread was started on thread_counter = 3
[09-12-2013 08:51:59] NPCD: Processing file '1377138628.perfdata.host'
[09-12-2013 08:51:59] NPCD: DEBUG: load 0.140000/30.000000
[09-12-2013 08:51:59] NPCD: ThreadCounter 4/5 File is 1377138628.perfdata.service
[09-12-2013 08:51:59] NPCD: Regular File: 1377138628.perfdata.service
[09-12-2013 08:51:59] NPCD: A thread was started on thread_counter = 4
[09-12-2013 08:51:59] NPCD: DEBUG: load 0.140000/30.000000
[09-12-2013 08:51:59] NPCD: ThreadCounter 5/5 File is 1377138643.perfdata.host
[09-12-2013 08:51:59] NPCD: Regular File: 1377138643.perfdata.host
[09-12-2013 08:51:59] NPCD: WARN: MAX Thread reached: 1377138643.perfdata.host comes later with ThreadCounter: 5
[09-12-2013 08:51:59] NPCD: DEBUG: Will wait for th['4']
[09-12-2013 08:51:59] NPCD: Processing file 1377138628.perfdata.service with ID 140527920355072 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1377138628.perfdata.service
[09-12-2013 08:51:59] NPCD: Processing file '1377138628.perfdata.service'
[root@nagiosxi ~]# tail -30 /usr/local/nagios/var/perfdata.log
2013-09-12 08:52:27 [9906] [1] Found Performance Data for nasbaorgwp.nasba.int / __Disk_Usage (/=19517MB;31075;34959;0;38844)
2013-09-12 08:52:27 [9912] [1] Found Performance Data for cpacentral.nasba.org / HTTP (time=0.062113s;;;0.000000 size=13622B;;;0)
2013-09-12 08:52:27 [9906] [1] Found Performance Data for web_mantis.nasba.int / Web_Page_Content (time=0.224373s;;;0.000000 size=625B;;;0)
2013-09-12 08:52:27 [9912] [1] Found Performance Data for drcpt.nasba.dr / Open_Files (opened_files=704;29655;49425)
2013-09-12 08:52:27 [9906] [1] Found Performance Data for nasbaorgwp.nasba.int / Swap_Usage (swap=972MB;0;0;0;991)
2013-09-12 08:52:27 [9912] [1] Found Performance Data for nasweb.nasba.int / CPU_Stats (user=0.00% system=0.20% iowait=0.00%;85;95 idle=99.80%)
2013-09-12 08:52:27 [9906] [1] Found Performance Data for phoneftp.nsbaonp.int / Ping (rta=49.530ms;3000.000;5000.000;0; pl=0%;80;100;;)
2013-09-12 08:52:27 [9912] [1] Found Performance Data for cpamobility.nasba.int / Load (load1=0.000;15.000;30.000;0; load5=0.000;10.000;20.000;0; load15=0.000;5.000;10.000;0;)
2013-09-12 08:52:27 [9906] [1] Found Performance Data for web02.nasba.qa / Ping (rta=49.690ms;3000.000;5000.000;0; pl=0%;80;100;;)
2013-09-12 08:52:27 [9906] [1] Found Performance Data for Sonicwall_One_Nashville_Place / Ping (rta=49.158ms;3000.000;5000.000;0; pl=0%;80;100;;)
2013-09-12 08:52:27 [9912] [1] Found Performance Data for cpaesdb01.nasba.int / Open_Files (opened_files=608;979716;1632861)
2013-09-12 08:52:27 [9912] [1] Found Performance Data for drkvm03.nasba.dr / Users (users=0;5;10;0)
2013-09-12 08:52:27 [9906] [1] Found Performance Data for db02qa2.nasba.int / Ping (rta=49.162ms;3000.000;5000.000;0; pl=0%;80;100;;)
2013-09-12 08:52:27 [9912] [1] Found Performance Data for drkvm03.nasba.dr / _boot_Disk_Usage (/boot=90MB;387;435;0;484)
2013-09-12 08:52:27 [9906] [1] Found Performance Data for web04.nasba.qa / Ping (rta=49.143ms;3000.000;5000.000;0; pl=0%;80;100;;)
2013-09-12 08:52:27 [9912] [1] 132 lines processed
2013-09-12 08:52:27 [9912] [1] /usr/local/nagios/var/spool/perfdata//1377139063.perfdata.service-PID-9912 deleted
2013-09-12 08:52:27 [9912] [1] PNP exiting (runtime 0.067544s) ...
2013-09-12 08:52:27 [9906] [1] Found Performance Data for nasbaorgwp.nasba.int / Open_Files (opened_files=960;239567;399279)
2013-09-12 08:52:27 [9906] [1] Found Performance Data for db02qa2.nasba.int / Load (load1=0.000;15.000;30.000;0; load5=0.000;10.000;20.000;0; load15=0.000;5.000;10.000;0;)
2013-09-12 08:52:27 [9906] [1] Found Performance Data for drdb02.nasba.dr / Ping (rta=42.422ms;3000.000;5000.000;0; pl=0%;80;100;;)
2013-09-12 08:52:27 [9906] [1] Found Performance Data for ui02.kpmg.int / __Disk_Usage (/=2366MB;40316;45356;0;50396)
2013-09-12 08:52:27 [9906] [1] Found Performance Data for logserver.nasba.int / __Disk_Usage (/=41073MB;106878;120238;0;133598)
2013-09-12 08:52:27 [9906] [1] Found Performance Data for wordpress02.nasba.int / __Disk_Usage (/=3006MB;30268;34052;0;37836)
2013-09-12 08:52:27 [9906] [1] Found Performance Data for prodrel.nasba.int / Load (load1=0.040;15.000;30.000;0; load5=0.030;10.000;20.000;0; load15=0.000;5.000;10.000;0;)
2013-09-12 08:52:27 [9906] [1] Found Performance Data for cpaesweb04.nasba.int / Memory_Usage (total=7870MB free=7601MB used=533MB shared=0 buffers=106MB cached=264MB)
2013-09-12 08:52:27 [9906] [1] Found Performance Data for ces.nasba.int / Memory_Usage (total=996MB free=112MB used=884MB shared=0 buffers=156MB cached=514MB)
2013-09-12 08:52:27 [9906] [1] 85 lines processed
2013-09-12 08:52:27 [9906] [1] /usr/local/nagios/var/spool/perfdata//1377139048.perfdata.service-PID-9906 deleted
2013-09-12 08:52:27 [9906] [1] PNP exiting (runtime 0.090943s) ...
Re: No performance graph data since August
You can change it to "1" if you wish; "-1" will give you too much info. Let's try removing all the files in the "/usr/local/nagios/var/spool/perfdata" directory. You will lose some perfdata, but this is needed to stop the timeouts. Run the following commands:
Code: Select all
cd /usr/local/nagios/var/spool
rm -rf perfdata
mkdir perfdata
chown nagios:nagios perfdata
chmod 755 perfdata
service npcd restart
Wait 15-20 minutes, and check whether the performance graphs have started to show up.
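A possible variant of the cleanup in this post, offered only as a sketch: delete the queued files but leave the directory in place, so its ownership and permissions do not need to be recreated. It assumes GNU `find` and that the queued files all match `*.perfdata.*`, as the names in the logs suggest; `purge_spool` is a hypothetical helper name.

```shell
# Delete queued perfdata files but keep the directory itself, so its owner
# and mode stay untouched. purge_spool is a hypothetical helper name.
purge_spool() {
    dir=$1
    # Refuse to run on an empty argument or a path that is not a directory.
    [ -n "$dir" ] && [ -d "$dir" ] || return 1
    # -delete is a GNU find action; files are removed one at a time, so the
    # shell never has to expand hundreds of thousands of names.
    find "$dir" -maxdepth 1 -type f -name '*.perfdata.*' -delete
}
```

It would be called as `purge_spool /usr/local/nagios/var/spool/perfdata`, followed by `service npcd restart` as in the commands in this post. The same data loss applies: everything still queued in the spool is discarded.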