Re: Some Performance graphs not graphing
Posted: Thu Nov 11, 2021 11:30 am
by pbroste
Hello
@hbouma
Thanks for the updates,
Question; do you see a file count on this:
Code: Select all
watch 'ls /usr/local/nagios/var/spool/perfdata/ | wc -l'
The count should rise above zero while perfdata is being spooled, then drop back to zero once all of it has been processed.
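As a rough sketch of the same check in script form, the two helpers below count the files in a spool directory and describe how the backlog moved between two samples. The function names `spool_count` and `spool_trend` are made up for illustration, and the directory is passed in as an argument because the spool location can differ between installs.

```shell
# Count regular files directly inside a perfdata spool directory.
# Usage: spool_count /usr/local/nagios/var/spool/perfdata
spool_count() {
    dir=$1
    # find avoids parsing ls output; -maxdepth 1 ignores subdirectories.
    find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Describe how the backlog moved between two samples.
# Usage: spool_trend <earlier_count> <later_count>
spool_trend() {
    if [ "$2" -lt "$1" ]; then
        echo "draining"
    elif [ "$2" -gt "$1" ]; then
        echo "growing"
    else
        echo "unchanged"
    fi
}

# Example (not run automatically):
#   before=$(spool_count /usr/local/nagios/var/spool/perfdata)
#   sleep 30
#   after=$(spool_count /usr/local/nagios/var/spool/perfdata)
#   spool_trend "$before" "$after"
```

A result of "unchanged" at zero for a long stretch can also mean the spool lives somewhere else, so it is worth confirming the path before drawing conclusions.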
With the 'npcd' service stopped, the number of files in '/usr/local/nagios/var/spool/perfdata/' should increase. If it does, start the 'npcd' service again.
The file count in '/usr/local/nagios/var/spool/perfdata/' should now decrease as the perfdata is processed. If the count is not decreasing, we should bump up the following values:
Code: Select all
vi /usr/local/nagios/etc/pnp/npcd.cfg
- npcd.cfg: the threshold value is changed from xx to 28, as it was breaching the threshold at 10.
  load_threshold = xx.x to load_threshold = 28.0
- process_perfdata.cfg: the TIMEOUT value is changed from x to 40.
  TIMEOUT = x to TIMEOUT = 40
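If editing by hand in vi is inconvenient, the two changes above could be scripted along these lines. This is only a sketch: `set_cfg_value` is a hypothetical helper, it assumes the files use plain `KEY = VALUE` lines, and the path to process_perfdata.cfg shown in the usage comment is an assumption (the post gives only the file name).

```shell
# Set "KEY = VALUE" in a config file that uses KEY = VALUE lines.
# A copy of the file as it was before this call is kept as <file>.bak.
# Usage: set_cfg_value <file> <key> <value>
set_cfg_value() {
    file=$1; key=$2; value=$3
    # Refuse to touch a file that does not already define the key.
    grep -q "^[[:space:]]*$key[[:space:]]*=" "$file" || {
        echo "$key not found in $file" >&2
        return 1
    }
    cp -p "$file" "$file.bak"
    # Commented-out lines (starting with #) are left alone.
    # Note: values containing "|" or "&" would need escaping for sed.
    sed "s|^[[:space:]]*$key[[:space:]]*=.*|$key = $value|" "$file.bak" > "$file"
}

# Example (not run automatically; second path is an assumption):
#   set_cfg_value /usr/local/nagios/etc/pnp/npcd.cfg load_threshold 28.0
#   set_cfg_value /usr/local/nagios/etc/pnp/process_perfdata.cfg TIMEOUT 40
```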
Restart the npcd service, give it some time, and watch the counts and logs.
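The exact restart command is not shown in the post. A minimal sketch, assuming the service is registered under the name npcd and is managed either by systemd or by the older `service` wrapper, might look like this. `npcd_ctl` and the `DRY_RUN` variable are hypothetical names used only for illustration.

```shell
# Run stop/start/restart against the npcd service.
# With DRY_RUN=1 the command is printed instead of executed.
# Usage: npcd_ctl restart
npcd_ctl() {
    action=$1
    # Prefer systemctl when present; fall back to the SysV-style wrapper.
    if command -v systemctl >/dev/null 2>&1; then
        cmd="systemctl $action npcd"
    else
        cmd="service npcd $action"
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}

# Example (not run automatically):
#   DRY_RUN=1 npcd_ctl restart   # show what would be run
#   npcd_ctl restart             # actually restart (needs root)
```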
We would also like a copy of 'process_perfdata.pl' and a directory listing so we can check ownership and permissions.
Code: Select all
tar -czvf /tmp/processperf.tar.gz /usr/local/nagios/libexec/process_perfdata.pl
Code: Select all
whereis rrdtool >/tmp/info.txt
ls -al /usr/local >>/tmp/info.txt
ls -alR /usr/local/nagios >>/tmp/info.txt
ls -alR /usr/local/nagiosxi >>/tmp/info.txt
tail -100 /usr/local/nagios/var/npcd.log >>/tmp/info.txt
Thanks,
Perry
Re: Some Performance graphs not graphing
Posted: Thu Nov 11, 2021 1:21 pm
by hbouma
The file count in /usr/local/nagios/var/spool/perfdata was 0 for several minutes' worth of the while statement. I did verify that the time was incrementing during the while statement. The value continued to stay at 0 even after stopping npcd.
Re: Some Performance graphs not graphing
Posted: Fri Nov 12, 2021 11:54 am
by pbroste
Hello
@hbouma
Looking through the logs, it appears that this environment is also using 'nagiosramdisk'. The spool is located at:
Code: Select all
'/var/nagiosramdisk/spool/perfdata/'
Please verify,
Perry
Re: Some Performance graphs not graphing
Posted: Tue Nov 16, 2021 7:40 am
by hbouma
Perfdata is showing up in that folder.
Code: Select all
07:39 AM SERVER root [/var/nagiosramdisk/spool/perfdata]
$ ll
total 516K
drwxrwxr-x 2 nagios nagios 240 Nov 16 07:39 .
drwxrwxr-x 5 nagios nagios 100 Nov 13 19:37 ..
-rw-r--r-- 1 nagios nagios 0 Nov 16 07:38 1637066312.perfdata.host
-rw-r--r-- 1 nagios nagios 83K Nov 16 07:38 1637066312.perfdata.service
-rw-r--r-- 1 nagios nagios 91K Nov 16 07:38 1637066326.perfdata.service
-rw-r--r-- 1 nagios nagios 333 Nov 16 07:38 1637066327.perfdata.host
-rw-r--r-- 1 nagios nagios 137K Nov 16 07:39 1637066341.perfdata.service
-rw-r--r-- 1 nagios nagios 1023 Nov 16 07:38 1637066342.perfdata.host
-rw-r--r-- 1 nagios nagios 59K Nov 16 07:39 1637066356.perfdata.host
-rw-r--r-- 1 nagios nagios 56K Nov 16 07:39 1637066357.perfdata.service
-rw-r--r-- 1 nagios nagios 22K Nov 16 07:39 1637066371.perfdata.service
-rw-r--r-- 1 nagios nagios 51K Nov 16 07:39 1637066372.perfdata.host
07:39 AM SERVER root [/var/nagiosramdisk/spool/perfdata]
$ ll
total 516K
drwxrwxr-x 2 nagios nagios 240 Nov 16 07:39 .
drwxrwxr-x 5 nagios nagios 100 Nov 13 19:37 ..
-rw-r--r-- 1 nagios nagios 0 Nov 16 07:38 1637066312.perfdata.host
-rw-r--r-- 1 nagios nagios 83K Nov 16 07:38 1637066312.perfdata.service
-rw-r--r-- 1 nagios nagios 91K Nov 16 07:38 1637066326.perfdata.service
-rw-r--r-- 1 nagios nagios 333 Nov 16 07:38 1637066327.perfdata.host
-rw-r--r-- 1 nagios nagios 137K Nov 16 07:39 1637066341.perfdata.service
-rw-r--r-- 1 nagios nagios 1023 Nov 16 07:38 1637066342.perfdata.host
-rw-r--r-- 1 nagios nagios 59K Nov 16 07:39 1637066356.perfdata.host
-rw-r--r-- 1 nagios nagios 56K Nov 16 07:39 1637066357.perfdata.service
-rw-r--r-- 1 nagios nagios 22K Nov 16 07:39 1637066371.perfdata.service
-rw-r--r-- 1 nagios nagios 51K Nov 16 07:39 1637066372.perfdata.host
07:39 AM SERVER root [/var/nagiosramdisk/spool/perfdata]
$ ll
total 512K
drwxrwxr-x 2 nagios nagios 200 Nov 16 07:39 .
drwxrwxr-x 5 nagios nagios 100 Nov 13 19:37 ..
-rw-r--r-- 1 nagios nagios 83K Nov 16 07:38 1637066312.perfdata.service-PID-15973
-rw-r--r-- 1 nagios nagios 91K Nov 16 07:38 1637066326.perfdata.service-PID-15975
-rw-r--r-- 1 nagios nagios 137K Nov 16 07:39 1637066341.perfdata.service-PID-15976
-rw-r--r-- 1 nagios nagios 1023 Nov 16 07:38 1637066342.perfdata.host
-rw-r--r-- 1 nagios nagios 59K Nov 16 07:39 1637066356.perfdata.host
-rw-r--r-- 1 nagios nagios 56K Nov 16 07:39 1637066357.perfdata.service
-rw-r--r-- 1 nagios nagios 22K Nov 16 07:39 1637066371.perfdata.service
-rw-r--r-- 1 nagios nagios 51K Nov 16 07:39 1637066372.perfdata.host
Re: Some Performance graphs not graphing
Posted: Tue Nov 16, 2021 5:36 pm
by pbroste
Hello
@hbouma
Thanks for following up. It appears that there is a host perfdata file showing up with zero bytes, so we know that one is not creating any graphs. The others look good, and you stated that they are rotating through the move while the npcd service is running.
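To spot empty spool files like that one without reading through a full listing, something along these lines may help. It is a sketch only: `empty_perfdata` is a made-up helper name, and the `*.perfdata.*` pattern is taken from the file names in the listing above.

```shell
# List zero-byte perfdata files directly inside a spool directory,
# one path per line.
# Usage: empty_perfdata /var/nagiosramdisk/spool/perfdata
empty_perfdata() {
    # -size 0 matches only files that contain no data at all.
    find "$1" -maxdepth 1 -type f -name '*.perfdata.*' -size 0 | sort
}
```

An empty result means every spooled perfdata file currently has content; it says nothing about whether that content is valid.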
Please temporarily stop the npcd service and run the Perl script against one of the perfdata files. Scroll through the output and let me know if you see anything that breaks the running script.
From the spooled perfdata directory, run:
Code: Select all
perl -d:Trace /usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata/xxxxxxxxxxxxxx.perfdata.service
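The `-d:Trace` switch depends on the Devel::Trace module, which is not part of every Perl install. A quick way to check for it before running the trace, sketched with a hypothetical `has_perl_module` helper:

```shell
# Succeed if the named Perl module can be loaded, fail otherwise.
# Usage: has_perl_module Devel::Trace
has_perl_module() {
    # -M loads the module; -e 1 is a no-op program.
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# Example (not run automatically):
#   has_perl_module Devel::Trace || echo "Devel::Trace is not installed"
```

If the module is missing, it would need to be installed (for example from CPAN) before the trace command can run.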
Then start the npcd service again.
Thanks,
Perry
Re: Some Performance graphs not graphing
Posted: Wed Nov 17, 2021 1:53 pm
by hbouma
Unfortunately, we do not have the Trace.pm on our perl install.
Can't locate Devel/Trace.pm in @INC (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .).
BEGIN failed--compilation aborted.
Re: Some Performance graphs not graphing
Posted: Thu Nov 18, 2021 11:15 am
by pbroste
Hello
@hbouma
Thanks for following up; I mentioned this case to our team during our stand-up this morning to get feedback and the following are suggestions that we were talking about.
Find out where 'rrdtool' is installed, then check this file:
Code: Select all
/usr/local/nagios/etc/pnp/npcd.cfg
and change this:
RRDTOOL = /bin/rrdtool
To this:
RRDTOOL = /usr/bin/rrdtool
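One way to confirm whether the configured path actually points at an executable is sketched below. The helper names `cfg_rrdtool_path` and `check_rrdtool_path` are hypothetical, and the config file is passed as an argument so the same check works for whichever file holds the RRDTOOL setting on a given install.

```shell
# Print the RRDTOOL value from a config file (first uncommented match).
cfg_rrdtool_path() {
    sed -n 's/^[[:space:]]*RRDTOOL[[:space:]]*=[[:space:]]*//p' "$1" \
        | sed 's/[[:space:]]*$//' | head -n 1
}

# Succeed only if the configured path exists and is executable.
# Usage: check_rrdtool_path /usr/local/nagios/etc/pnp/npcd.cfg
check_rrdtool_path() {
    path=$(cfg_rrdtool_path "$1")
    if [ -n "$path" ] && [ -x "$path" ]; then
        echo "OK: $path"
    else
        echo "BAD: '$path' is missing or not executable" >&2
        return 1
    fi
}

# Example (not run automatically):
#   command -v rrdtool      # where rrdtool really lives
#   check_rrdtool_path /usr/local/nagios/etc/pnp/npcd.cfg
```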
Next, please install debug on RRD, then restart npcd.
After that, run through the PNP setup:
Code: Select all
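cd /tmp
# Clear out any previous extraction before unpacking a fresh copy
rm -rf /tmp/nagiosxi
wget https://assets.nagios.com/downloads/nagiosxi/5/xi-5.8.6.tar.gz
# Extract the downloaded archive (tar cannot read from a URL)
tar zxf xi-5.8.6.tar.gz
cd nagiosxi
./init.sh
# Already inside nagiosxi, so the path is relative to it
cd subcomponents/pnp
./install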
Then re-implement ramdisk manually:
https://assets.nagios.com/downloads/nag ... giosXI.pdf
Thanks,
Perry
Re: Some Performance graphs not graphing
Posted: Thu Nov 18, 2021 2:43 pm
by hbouma
Odd, the RRDCached service is installed, but the /usr/local/nagios/etc/pnp/npcd.cfg file does not reference it.
Code: Select all
systemctl status rrdcached.service
● rrdcached.service - LSB: start and stop rrdtool caching daemon
Loaded: loaded (/etc/rc.d/init.d/rrdcached; bad; vendor preset: disabled)
Active: active (running) since Sat 2021-11-13 19:38:06 EST; 4 days ago
Docs: man:systemd-sysv-generator(8)
CGroup: /system.slice/rrdcached.service
└─3175 /usr/bin/rrdcached -p /var/rrdtool/rrdcached/rrdcached.pid -s nagios -m 0660 -l unix:/var/rrdtool/rrdcached/rrdcached.sock -F -w 900 -z 90 -j /tmp/ -b /var/rrdtool/rrdcached -P FLUSH,PEND...
Nov 18 14:38:06 SERVER rrdcached[3175]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/SERVER3/Memory_Usage.rrd) failed with status -1. (/usr/local/nagios/s...m 1637263678)
Nov 18 14:38:06 SERVER rrdcached[3175]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/SERVER3/Paging_File_Usage.rrd) failed with status -1. (/usr/local/nag...m 1637263689)
Nov 18 14:38:06 SERVER rrdcached[3175]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/SERVER2/Paging_File_Usage.rrd) failed with status -1. (/usr/local/nag...m 1637263703)
Nov 18 14:38:06 SERVER rrdcached[3175]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/SERVER1/Memory_Usage.rrd) failed with status -1. (/usr/local/nagios/s...m 1637263702)
Nov 18 14:38:06 SERVER rrdcached[3175]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/SERVER1/Paging_File_Usage.rrd) failed with status -1. (/usr/local/nag...m 1637263655)
Nov 18 14:38:06 SERVER rrdcached[3175]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/SERVER1/Paging_File_Usage.rrd) failed with status -1. (/usr/local/nag...m 1637263657)
Nov 18 14:38:06 SERVER rrdcached[3175]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/SERVER1/Paging_File_Usage.rrd) failed with status -1. (/usr/local/nag...m 1637263747)
Nov 18 14:38:06 SERVER rrdcached[3175]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/SERVER1/Memory_Usage.rrd) failed with status -1. (/usr/local/nagios/s...m 1637263709)
Nov 18 14:38:06 SERVER rrdcached[3175]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/SERVER1/Disk_Usage_on_Y__.rrd) failed with status -1. (/usr/local/nag...m 1637263726)
Nov 18 14:38:06 SERVER rrdcached[3175]: queue_thread_main: rrd_update_r (/usr/local/nagios/share/perfdata/SERVER1/Memory_Usage.rrd) failed with status -1. (/usr/local/nagios/s...m 1637263724)
Hint: Some lines were ellipsized, use -l to show in full.
Re: Some Performance graphs not graphing
Posted: Fri Nov 19, 2021 12:44 pm
by pbroste
Hello
@hbouma
Thanks for responding with the systemd status.
Check your system time. If NTP pushes the time back by a couple of seconds, you may see this issue. Additionally, if you have more than 1 check configured that writes to the same RRD, you may experience this error as well. Please check to make sure that system date/time/timezone is sync'ed across all:
Code: Select all
date
ls -l /etc/localtime
php -r 'echo date("D M j G:i:s T Y")."\n";'
grep "date.timezone =" /etc/php.ini
grep date.timezone /etc/php.ini
mysql -h 127.0.0.1 -uroot -pnagiosxi -e 'SELECT NOW(); SELECT @@GLOBAL.time_zone, @@SESSION.time_zone;'
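For the second possibility mentioned above (more than one check writing to the same RRD), it may help to tally the rrd_update_r failures per RRD file and see whether a few files dominate. The sketch below reads log lines on standard input and assumes they look like the rrdcached messages shown earlier in the thread; `rrd_failure_counts` is a hypothetical helper name. A high count is only a hint about where to look, not proof of a duplicate check.

```shell
# Read rrdcached log lines on stdin and print "<count> <rrd path>",
# most frequent first, for every rrd_update_r failure.
rrd_failure_counts() {
    sed -n 's/.*rrd_update_r (\(.*\.rrd\)) failed with status.*/\1/p' \
        | sort | uniq -c | sort -rn | sed 's/^ *//'
}

# Example (not run automatically):
#   journalctl -u rrdcached --since today | rrd_failure_counts | head
```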
Please send the following:
Code: Select all
tar -czvf /tmp/npcdcnf.tar.gz /usr/local/nagios/etc/pnp/npcd.cfg /var/rrdtool/
Thanks,
Perry
Re: Some Performance graphs not graphing
Posted: Fri Nov 19, 2021 2:01 pm
by hbouma
I will send you the tar file in a PM.
Code: Select all
$ date
Fri Nov 19 13:58:15 EST 2021
01:58 PM SERVER root [~]
$ ls -l /etc/localtime
lrwxrwxrwx. 1 root root 38 Mar 6 2017 /etc/localtime -> ../usr/share/zoneinfo/America/New_York
01:58 PM SERVER root [~]
$ php -r 'echo date("D M j G:i:s T Y")."\n";'
Fri Nov 19 13:58:15 EST 2021
01:58 PM SERVER root [~]
$ grep "date.timezone =" /etc/php.ini
date.timezone = America/New_York
01:58 PM SERVER root [~]
$ grep date.timezone /etc/php.ini
; http://php.net/date.timezone
date.timezone = America/New_York
01:58 PM SERVER root [~]
$ mysql -h OFFLOADED_DB_IP -uroot -p'SPECIAL_PASSWORD' -e 'SELECT NOW(); SELECT @@GLOBAL.time_zone, @@SESSION.time_zone;'
+---------------------+
| NOW() |
+---------------------+
| 2021-11-19 13:58:15 |
+---------------------+
+--------------------+---------------------+
| @@GLOBAL.time_zone | @@SESSION.time_zone |
+--------------------+---------------------+
| SYSTEM | SYSTEM |
+--------------------+---------------------+