
HELP! I have lost graph data

Posted: Wed Feb 19, 2014 10:50 am
by benningtonr
I was receiving repeated text messages on Sat. and rebooted the Nagios server. Since then I have had no graph data.

HELP!!!

Re: HELP! I have lost graph data

Posted: Wed Feb 19, 2014 11:28 am
by abrist
Let's start by tailing the perfdata logs:

Code:

tail -25 /usr/local/nagios/var/perfdata.log
tail -25 /usr/local/nagios/var/npcd.log
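
If the raw tail is hard to read, a rough summary can help show when the timeouts happened. The following is only a sketch: `summarize_timeouts` is a hypothetical helper name, and it assumes the log lines look like the ones posted in this thread (date in the first field, and a "TIMEOUT: Timeout after" message once per timed-out run).

```shell
#!/bin/sh
# Summarize process_perfdata.pl timeouts in a perfdata log, one line per day.
# Assumed line format (taken from the output posted in this thread):
#   2014-01-28 05:10:12 [13192] [0] *** TIMEOUT: Timeout after 5 secs. ***
summarize_timeouts() {
    log="$1"
    # Each timed-out run logs exactly one "Timeout after" line, so count
    # those and group them by the date in the first field.
    awk '/TIMEOUT: Timeout after/ { count[$1]++ }
         END { for (d in count) print d, count[d] }' "$log" | sort
}

# Example call with the path used above; skipped if the file is not readable.
LOG=/usr/local/nagios/var/perfdata.log
if [ -r "$LOG" ]; then
    summarize_timeouts "$LOG"
fi
```

Each output line is a date followed by the number of timed-out runs on that date, which makes it easier to see whether the timeouts are recent or old.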

Re: HELP! I have lost graph data

Posted: Wed Feb 19, 2014 11:31 am
by benningtonr
[ronb@nagios libexec]$ tail -25 /usr/local/nagios/var/perfdata.log
2014-01-28 05:10:12 [13191] [0] *** TIMEOUT: Please check your npcd.cfg
2014-01-28 05:10:12 [13191] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1390903786.perfdata.service-PID-13191 deleted
2014-01-28 05:10:12 [13191] [0] *** Timeout while processing Host: "" Service: ""
2014-01-28 05:10:12 [13191] [0] *** process_perfdata.pl terminated on signal ALRM
2014-01-28 05:10:12 [13192] [0] *** TIMEOUT: Timeout after 5 secs. ***
2014-01-28 05:10:12 [13192] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-01-28 05:10:12 [13192] [0] *** TIMEOUT: Please check your npcd.cfg
2014-01-28 05:10:12 [13192] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1390903801.perfdata.host-PID-13192 deleted
2014-01-28 05:10:12 [13192] [0] *** Timeout while processing Host: "" Service: ""
2014-01-28 05:10:12 [13192] [0] *** process_perfdata.pl terminated on signal ALRM
2014-01-28 05:10:12 [13193] [0] *** TIMEOUT: Timeout after 5 secs. ***
2014-01-28 05:10:12 [13189] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1390903786.perfdata.host-PID-13189 deleted
2014-01-28 05:10:12 [13193] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-01-28 05:10:12 [13193] [0] *** TIMEOUT: Please check your npcd.cfg
2014-01-28 05:10:12 [13189] [0] *** Timeout while processing Host: "" Service: ""
2014-01-28 05:10:12 [13193] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1390903801.perfdata.service-PID-13193 deleted
2014-01-28 05:10:12 [13189] [0] *** process_perfdata.pl terminated on signal ALRM
2014-01-28 05:10:12 [13193] [0] *** Timeout while processing Host: "" Service: ""
2014-01-28 05:10:12 [13193] [0] *** process_perfdata.pl terminated on signal ALRM
2014-01-28 05:10:47 [13804] [0] *** TIMEOUT: Timeout after 5 secs. ***
2014-01-28 05:10:47 [13804] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-01-28 05:10:47 [13804] [0] *** TIMEOUT: Please check your npcd.cfg
2014-01-28 05:10:47 [13804] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1390903831.perfdata.service-PID-13804 deleted
2014-01-28 05:10:47 [13804] [0] *** Timeout while processing Host: "cms1.whro.local" Service: "Memory_Usage"
2014-01-28 05:10:47 [13804] [0] *** process_perfdata.pl terminated on signal ALRM


[ronb@nagios libexec]$ tail -25 /usr/local/nagios/var/npcd.log
[12-23-2013 07:15:46] NPCD: ERROR: Executed command exits with return code '7'
[12-23-2013 07:15:46] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1387800927.perfdata.host'
[12-25-2013 04:20:50] NPCD: ERROR: Executed command exits with return code '7'
[12-25-2013 04:20:50] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1387963227.perfdata.service'
[01-15-2014 15:15:49] NPCD: ERROR: Executed command exits with return code '7'
[01-15-2014 15:15:49] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1389816925.perfdata.host'
[01-15-2014 15:15:49] NPCD: ERROR: Executed command exits with return code '7'
[01-15-2014 15:15:49] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1389816925.perfdata.service'
[01-16-2014 15:40:51] NPCD: ERROR: Executed command exits with return code '7'
[01-16-2014 15:40:51] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1389904825.perfdata.service'
[01-20-2014 05:10:53] NPCD: ERROR: Executed command exits with return code '7'
[01-20-2014 05:10:53] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1390212625.perfdata.host'
[01-20-2014 05:10:53] NPCD: ERROR: Executed command exits with return code '7'
[01-20-2014 05:10:53] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1390212625.perfdata.service'
[01-28-2014 05:10:12] NPCD: ERROR: Executed command exits with return code '7'
[01-28-2014 05:10:12] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1390903786.perfdata.service'
[01-28-2014 05:10:12] NPCD: ERROR: Executed command exits with return code '7'
[01-28-2014 05:10:12] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1390903786.perfdata.host'
[01-28-2014 05:10:12] NPCD: ERROR: Executed command exits with return code '7'
[01-28-2014 05:10:12] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1390903801.perfdata.service'
[01-28-2014 05:10:12] NPCD: ERROR: Executed command exits with return code '7'
[01-28-2014 05:10:12] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1390903801.perfdata.host'
[01-28-2014 05:10:47] NPCD: ERROR: Executed command exits with return code '7'
[01-28-2014 05:10:47] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1390903831.perfdata.service'
[02-15-2014 17:17:24] NPCD: Caught Termination Signal - Hasta la vista... baby

Re: HELP! I have lost graph data

Posted: Wed Feb 19, 2014 11:40 am
by abrist
benningtonr wrote:
2014-01-28 05:10:12 [13192] [0] *** TIMEOUT: Timeout after 5 secs. ***
2014-01-28 05:10:12 [13192] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2014-01-28 05:10:12 [13192] [0] *** TIMEOUT: Please check your npcd.cfg
2014-01-28 05:10:12 [13192] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1390903801.perfdata.host-PID-13192 deleted
2014-01-28 05:10:12 [13192] [0] *** Timeout while processing Host: "" Service: ""
You are experiencing timeouts. Increase the threshold by editing:

Code:

/usr/local/nagios/etc/pnp/process_perfdata.cfg
Change:

Code:

TIMEOUT = 5
To:

Code:

TIMEOUT = 25
Save the file and restart npcd:

Code:

service npcd restart
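
If you would rather script the edit than open the file by hand, something like the following might work. It is a minimal sketch, not a tested procedure for your system: `set_timeout` is a hypothetical helper (not part of PNP), and it assumes the setting sits on its own line starting with `TIMEOUT`, as in the snippets above.

```shell
#!/bin/sh
# Change the TIMEOUT value in a process_perfdata.cfg-style file.
# set_timeout is a hypothetical helper for illustration only.
set_timeout() {
    cfg="$1"
    new="$2"
    # Keep a backup copy next to the original before touching it.
    cp -p "$cfg" "$cfg.bak" || return 1
    # Rewrite only the line that starts with TIMEOUT, whatever its old value,
    # and write the result back into the original file.
    sed "s/^TIMEOUT[[:space:]]*=.*/TIMEOUT = $new/" "$cfg.bak" > "$cfg"
    # Print the resulting line so the change can be confirmed by eye.
    grep '^TIMEOUT' "$cfg"
}

# Usage (run as a user allowed to edit the file, then restart npcd as above):
#   set_timeout /usr/local/nagios/etc/pnp/process_perfdata.cfg 25
```

The backup ends up in the same directory with a `.bak` suffix, so the old value can be restored by copying it back.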

Re: HELP! I have lost graph data

Posted: Wed Feb 19, 2014 11:46 am
by benningtonr
Okay, it has been changed, but I have had this up for a couple of years. Why would it all of a sudden need to be changed?

Re: HELP! I have lost graph data

Posted: Wed Feb 19, 2014 11:53 am
by abrist
Have you added any checks lately that may have increased the time needed to receive and process check results? Or maybe some other tasks, such as backups?
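
One rough way to see whether processing is falling behind is to count the files waiting in the spool directory that appears in your logs. This is only a sketch: `spool_backlog` is a hypothetical helper name, and what counts as "too many" files is something you would have to judge for your own system.

```shell
#!/bin/sh
# Count perfdata files currently waiting in an NPCD spool directory.
# spool_backlog is a hypothetical helper for illustration only.
spool_backlog() {
    dir="$1"
    # Count regular files directly inside the directory (no subdirectories).
    # tr strips the padding that some wc implementations add.
    find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Usage, with the spool path shown in the log output earlier in the thread:
#   spool_backlog /usr/local/nagios/var/spool/perfdata
```

Running it a few times a minute or so apart should show whether the number keeps growing or drops back down after each npcd run.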

Re: HELP! I have lost graph data

Posted: Wed Feb 19, 2014 2:00 pm
by benningtonr
No checks, and no backup changes

Re: HELP! I have lost graph data

Posted: Wed Feb 19, 2014 2:01 pm
by benningtonr
Yeah, the graphs are back.

Re: HELP! I have lost graph data

Posted: Wed Feb 19, 2014 3:44 pm
by slansing
Did you do anything specific on your end to bring them back up?