Re: Nagios Core Issue - Gaps in Graphs

Posted: Wed Mar 21, 2018 1:30 pm
by davemcfadden
Hi,
I've attached a zip file with the npcd log and the perfdata files that were copied to /tmp.
Thanks,
Dave.

Re: Nagios Core Issue - Gaps in Graphs

Posted: Thu Mar 22, 2018 12:22 pm
by cdienger
The data looks good from what I can see, but did you see any error messages in perfdata.log? A message like this:

illegal attempt to update using time 0 when last update time is 1521482327 (minimum one second step)

is what I'm hoping to see in the logs, so that it can then be checked against the perfdata files.
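
A minimal Python sketch of how a log could be searched for that message. The function name `find_illegal_updates` and the regular expression are illustrative assumptions based only on the sample line above; they are not part of PNP4Nagios or rrdtool.

```python
import re

# Pattern modelled on the sample rrdtool message quoted above (assumption:
# real log lines may carry extra prefixes, which re.search tolerates).
ILLEGAL_UPDATE = re.compile(
    r"illegal attempt to update using time (\d+) "
    r"when last update time is (\d+)"
)

def find_illegal_updates(lines):
    """Return (attempted_time, last_update_time) pairs found in log lines."""
    hits = []
    for line in lines:
        match = ILLEGAL_UPDATE.search(line)
        if match:
            hits.append((int(match.group(1)), int(match.group(2))))
    return hits

sample = [
    "illegal attempt to update using time 0 when last update time is "
    "1521482327 (minimum one second step)",
]
print(find_illegal_updates(sample))
```

To run it against a real log, pass the open perfdata.log file object instead of `sample`; the attempted timestamps can then be compared with the TIMET values in the copied perfdata files.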

Re: Nagios Core Issue - Gaps in Graphs

Posted: Thu Mar 22, 2018 1:43 pm
by davemcfadden
Hi,
This is very odd. I left the modified process_perfdata.pl in place. I can see the files are still being copied to /tmp, but the perfdata.log file looks like it was rotated overnight and is empty...

Re: Nagios Core Issue - Gaps in Graphs

Posted: Thu Mar 22, 2018 1:53 pm
by davemcfadden
I looked at the previous perfdata.log file. It looks like nothing more was written to it after the new Perl script was put in place.

Re: Nagios Core Issue - Gaps in Graphs

Posted: Thu Mar 22, 2018 1:57 pm
by davemcfadden
Hi,
I guess the new Perl script is hitting an error...

[03-22-2018 13:55:48] NPCD: Processing file service-perfdata.1521744942 with ID 140482463823616 - going to exec /usr/libexec/pnp4nagios/process_perfdata.pl -n --bulk /var/spool/pnp4nagios/service-perfdata.1521744942
[03-22-2018 13:55:48] NPCD: Processing file 'service-perfdata.1521744942'
[03-22-2018 13:55:48] NPCD: ERROR: Executed command exits with return code '5'
[03-22-2018 13:55:48] NPCD: ERROR: Command line was '/usr/libexec/pnp4nagios/process_perfdata.pl -n --bulk /var/spool/pnp4nagios/service-perfdata.1521744942'
[03-22-2018 13:55:50] NPCD: No more files to process... waiting for 15 seconds
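
A rough Python sketch for pulling the failing commands out of an npcd log like the one above. The two patterns are copied from the ERROR lines shown here; the function name `failed_commands` is hypothetical, and the sketch assumes the "Command line was" entry always follows its "return code" entry.

```python
import re

# Patterns modelled on the NPCD ERROR lines shown above.
RETURN_CODE = re.compile(
    r"NPCD: ERROR: Executed command exits with return code '(\d+)'"
)
COMMAND_LINE = re.compile(r"NPCD: ERROR: Command line was '(.+)'")

def failed_commands(lines):
    """Pair each logged return code with the command line logged after it."""
    failures = []
    pending_code = None
    for line in lines:
        code = RETURN_CODE.search(line)
        if code:
            pending_code = int(code.group(1))
            continue
        command = COMMAND_LINE.search(line)
        if command and pending_code is not None:
            failures.append((pending_code, command.group(1)))
            pending_code = None
    return failures

log = [
    "[03-22-2018 13:55:48] NPCD: Processing file 'service-perfdata.1521744942'",
    "[03-22-2018 13:55:48] NPCD: ERROR: Executed command exits with return code '5'",
    "[03-22-2018 13:55:48] NPCD: ERROR: Command line was "
    "'/usr/libexec/pnp4nagios/process_perfdata.pl -n --bulk "
    "/var/spool/pnp4nagios/service-perfdata.1521744942'",
    "[03-22-2018 13:55:50] NPCD: No more files to process... waiting for 15 seconds",
]
for code, command in failed_commands(log):
    print(code, command)
```

Running the printed command by hand as the nagios user is usually the quickest way to see the underlying error message.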

Re: Nagios Core Issue - Gaps in Graphs

Posted: Thu Mar 22, 2018 1:59 pm
by davemcfadden
Sorry for the multiple posts - I should have combined these.
I get this when running the checker now:

root@usmke1hstnagvp01l:[a212356335]: perl verify_pnp_config.pl --mode bulk+npcd --config=/etc/nagios/nagios.cfg --pnpcfg=/etc/pnp4nagios
[INFO]  ========== Starting Environment Checks ============
[INFO]  My version is: verify_pnp_config-0.6.26-R.40
[INFO]  Start Options: verify_pnp_config.pl --mode bulk+npcd --config=/etc/nagios/nagios.cfg --pnpcfg=/etc/pnp4nagios
[INFO]  Reading /etc/nagios/nagios.cfg
[OK  ]  Running product is 'nagios'
[OK  ]  object_cache_file is defined
[OK  ]  object_cache_file=/var/spool/nagios/objects.cache
[INFO]  Reading /var/spool/nagios/objects.cache
[OK  ]  resource_file is defined
[OK  ]  resource_file=/etc/nagios/private/resource.cfg
[INFO]  Reading /etc/nagios/private/resource.cfg
[INFO]  Reading /etc/pnp4nagios/process_perfdata.cfg
[INFO]  Reading /etc/pnp4nagios/pnp4nagios_release
[OK  ]  Found PNP4Nagios version "0.6.25"
[OK  ]  ./configure Options '--build=x86_64-redhat-linux-gnu' '--host=x86_64-redhat-linux-gnu' '--program-prefix=' '--disable-dependency-tracking' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib64' '--libexecdir=/usr/libexec' '--localstatedir=/var' '--sharedstatedir=/var/lib' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--bindir=/usr/sbin' '--libexecdir=/usr/libexec/pnp4nagios' '--sysconfdir=/etc/pnp4nagios' '--localstatedir=/var/log/pnp4nagios' '--datadir=/usr/share/nagios/html/pnp4nagios' '--datarootdir=/usr/share/nagios/html/pnp4nagios' '--with-perfdata-dir=/var/lib/pnp4nagios' '--with-perfdata-spool-dir=/var/spool/pnp4nagios' 'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu' 'CFLAGS=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches   -m64 -mtune=generic' 'LDFLAGS=-Wl,-z,relro '
[OK  ]  Effective User is 'nagios'
[OK  ]  User nagios exists with ID '993'
[OK  ]  Effective group is 'nagios'
[OK  ]  Group nagios exists with ID '990'
[INFO]  ========== Checking Bulk Mode + NPCD Config  ============
[OK  ]  process_performance_data is 1 compared with '/1/'
[OK  ]  service_perfdata_file is defined
[OK  ]  service_perfdata_file=/var/log/pnp4nagios/stats/service-perfdata
[OK  ]  service_perfdata_file_template is defined
[OK  ]  service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tSERVICESTATE::$SERVICESTATE$\tSERVICESTATETYPE::$SERVICESTATETYPE$
[OK  ]  PERFDATA template looks good
[OK  ]  service_perfdata_file_mode is defined
[OK  ]  service_perfdata_file_mode=a
[OK  ]  service_perfdata_file_processing_interval is defined
[OK  ]  service_perfdata_file_processing_interval=15
[OK  ]  service_perfdata_file_processing_command is defined
[OK  ]  service_perfdata_file_processing_command=process-service-perfdata-file
[OK  ]  host_perfdata_file is defined
[OK  ]  host_perfdata_file=/var/log/pnp4nagios/stats/host-perfdata
[OK  ]  host_perfdata_file_template is defined
[OK  ]  host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$
[OK  ]  PERFDATA template looks good
[OK  ]  host_perfdata_file_mode is defined
[OK  ]  host_perfdata_file_mode=a
[OK  ]  host_perfdata_file_processing_interval is defined
[OK  ]  host_perfdata_file_processing_interval=15
[OK  ]  host_perfdata_file_processing_command is defined
[OK  ]  host_perfdata_file_processing_command=process-host-perfdata-file
[INFO]  Nagios config looks good so far
[INFO]  ========== Checking config values ============
[OK  ]  npcd daemon is running
[OK  ]  /etc/pnp4nagios/npcd.cfg is used by npcd and readable
[INFO]  Reading /etc/pnp4nagios/npcd.cfg
[OK  ]  perfdata_spool_dir is defined
[OK  ]  perfdata_spool_dir=/var/spool/pnp4nagios
[CRIT]  10087 files found in /var/spool/pnp4nagios
[HINT]  Something went wrong here!
service_perfdata_file_processing_command at verify_pnp_config.pl line 462.
[OK  ]  Command process-service-perfdata-file is defined
[OK  ]  '/bin/mv /var/log/pnp4nagios/stats/service-perfdata /var/spool/pnp4nagios/service-perfdata.$TIMET$'
[OK  ]  Command looks good
host_perfdata_file_processing_command at verify_pnp_config.pl line 462.
[OK  ]  Command process-host-perfdata-file is defined
[OK  ]  '/bin/mv /var/log/pnp4nagios/stats/host-perfdata /var/spool/pnp4nagios/host-perfdata.$TIMET$'
[OK  ]  Command looks good
[OK  ]  Script /usr/libexec/pnp4nagios/process_perfdata.pl is executable
[INFO]  ========== Starting global checks ============
[OK  ]  status_file is defined
[OK  ]  status_file=/var/spool/nagios/status.dat
[INFO]  host_query =
[INFO]  service_query =
[INFO]  Reading /var/spool/nagios/status.dat
[INFO]  ==== Starting rrdtool checks ====
[OK  ]  RRDTOOL is defined
[OK  ]  RRDTOOL=/usr/bin/rrdtool
[OK  ]  /usr/bin/rrdtool is executable
[OK  ]  RRDtool 1.4.8  Copyright 1997-2013 by Tobias Oetiker <tobi@oetiker.ch>
[OK  ]  USE_RRDs is defined
[OK  ]  USE_RRDs=1
[OK  ]  Perl RRDs modules are loadable
[INFO]  ==== Starting directory checks ====
[OK  ]  RRDPATH is defined
[OK  ]  RRDPATH=/var/lib/pnp4nagios
[OK  ]  Perfdata directory '/var/lib/pnp4nagios' exists
[WARN]  252 hosts/services are not providing performance data
[WARN]  'process_perf_data 1' is set for 166 hosts/services which are not providing performance data!
[WARN]  'process_perf_data 0' is set for 87 of your hosts/services
[OK  ]  'process_perf_data 1' is set for 382 of your hosts/services
[WARN]  Logging is enabled in process_perfdata.cfg. This will reduce the overall performance of PNP4Nagios
[INFO]  ==== System sizing ====
[OK  ]  468 hosts/service objects defined
[INFO]  ==== Check statistics ====
[CRIT]  Warning: 4, Critical: 1
[CRIT]  Checks finished...
Thanks,
Dave.
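
The [CRIT] line in the output above reports 10087 files in /var/spool/pnp4nagios. A minimal Python sketch for watching that backlog, grouped by file prefix (host-perfdata vs. service-perfdata). The function name `spool_backlog` is hypothetical and the grouping rule (everything before the first dot) is an assumption based on the file names in the mv commands above.

```python
import os
from collections import Counter

def spool_backlog(spool_dir):
    """Count regular files in spool_dir, grouped by the name before the first dot."""
    counts = Counter()
    with os.scandir(spool_dir) as entries:
        for entry in entries:
            if entry.is_file():
                # e.g. "service-perfdata.1521744942" -> "service-perfdata"
                counts[entry.name.split(".", 1)[0]] += 1
    return counts
```

Calling `spool_backlog("/var/spool/pnp4nagios")` a few minutes apart should show whether npcd is draining the spool directory or the backlog is still growing.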

Re: Nagios Core Issue - Gaps in Graphs

Posted: Thu Mar 22, 2018 2:26 pm
by davemcfadden
Hi,
OK, the original Perl script had some customized log and config file settings, so I have made the same changes in the new one, and log files are now being generated. I'll upload the logs and updated service files in a few minutes.
Many thanks,
Dave.

Re: Nagios Core Issue - Gaps in Graphs

Posted: Thu Mar 22, 2018 2:39 pm
by davemcfadden
Hi,
Attached are the host and performance data, and the perfdata.log.
Many thanks,
Dave.

Re: Nagios Core Issue - Gaps in Graphs

Posted: Fri Mar 23, 2018 10:31 am
by cdienger
Thanks for the data and for fixing the script to get it! Unfortunately it all looks good, but we've at least eliminated some possibilities.

The errors seen are thrown when the rrdtool command runs and tries to update the runtime rrd files in /var/lib/pnp4nagios/.pnp-internal/. What version of rrdtool is installed? When you said you removed the rrd files and let them be recreated, were you referring to the ones in /var/lib/pnp4nagios/.pnp-internal/ ? If not, try removing these and letting them be recreated.

Re: Nagios Core Issue - Gaps in Graphs

Posted: Fri Mar 23, 2018 2:35 pm
by davemcfadden
Hi,
Thanks.
I had removed the .rrd files for individual servers, but not the ones in that directory.
I have now removed them.
Interestingly enough, this is from the .xml files in that directory:

runtime.xml.303: <TXT>creating '/var/lib/pnp4nagios/.pnp-internal/runtime_create.rrd': Permission denied</TXT>
runtime.xml.304: <TXT>creating '/var/lib/pnp4nagios/.pnp-internal/runtime_create.rrd': Permission denied</TXT>
runtime.xml.3649: <TXT>creating '/var/lib/pnp4nagios/.pnp-internal/runtime_create.rrd': Permission denied</TXT>
runtime.xml.722: <TXT>creating '/var/lib/pnp4nagios/.pnp-internal/runtime_create.rrd': Permission denied</TXT>
runtime.xml.725: <TXT>start time: unparsable time: DATATYPE::HOSTPERFDATA</TXT>
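
The "Permission denied" entries suggest that whichever user ran process_perfdata.pl at the time could not create files in that directory. A small Python sketch to inspect a directory's ownership and writability; the function name `describe_dir` is hypothetical, and the result is only meaningful when run as the same user that runs process_perfdata.pl (nagios, according to the checker output earlier in the thread).

```python
import os
import stat

def describe_dir(path):
    """Report owner IDs, mode bits, and whether the current user can write."""
    info = os.stat(path)
    return {
        "uid": info.st_uid,
        "gid": info.st_gid,
        "mode": stat.filemode(info.st_mode),
        # Creating a file needs both write and search (execute) permission
        # on the directory. Note: root passes this check regardless of mode.
        "writable": os.access(path, os.W_OK | os.X_OK),
    }
```

For example, `describe_dir("/var/lib/pnp4nagios/.pnp-internal")` run as the nagios user would show whether the directory is owned by someone else or lacks write permission.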

So hopefully we are getting closer.

Many thanks,
Dave.