NagiosXI graph problems
Re: NagiosXI graph problems
Hi tgriep,
I checked and all files have nagios:nagios permissions in both locations.
Re: NagiosXI graph problems
Can you run the following as root and attach the resulting /tmp/info.txt file to your reply?
Code: Select all
ps -ef --cols=300 >/tmp/info.txt
df -h >>/tmp/info.txt
df -i >>/tmp/info.txt
ls -lR /var/nagiosramdisk/spool >>/tmp/info.txt
ls -lR /usr/local/nagios/share/perfdata >>/tmp/info.txt
Are the errors still getting added to those log files?
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: NagiosXI graph problems
Hi tgriep,
I attached the file.
Yes, I still get errors in the log files:
Code: Select all
[ctretelea@ip-10-60-0-29] 08:50 $ tail -f /usr/local/nagios/var/perfdata.log
2018-05-15 08:08:53 [16190] [0] RRDs::update /usr/local/nagios/share/perfdata/ACE-ntacerep/_HOST_.rrd 1526386111:173.002:0:232.010:137.438
2018-05-15 08:08:53 [16190] [0] RRDs::update ERROR /usr/local/nagios/share/perfdata/ACE-ntacerep/_HOST_.rrd: illegal attempt to update using time 1526386111 when last update time is 1526386128 (minimum one second step)
2018-05-15 08:10:54 [19542] [0] RRDs::update /usr/local/nagios/share/perfdata/ACE-ntacerep/_HOST_.rrd 1526386227:149.578:0:178.561:124.684
2018-05-15 08:10:54 [19542] [0] RRDs::update ERROR /usr/local/nagios/share/perfdata/ACE-ntacerep/_HOST_.rrd: illegal attempt to update using time 1526386227 when last update time is 1526386244 (minimum one second step)
2018-05-15 08:11:24 [20463] [0] RRDs::update /usr/local/nagios/share/perfdata/petfood-OAKWMSSQLPRD01/_HOST_.rrd 1526386257:341.859:0:582.493:203.694
2018-05-15 08:11:24 [20463] [0] RRDs::update ERROR /usr/local/nagios/share/perfdata/petfood-OAKWMSSQLPRD01/_HOST_.rrd: illegal attempt to update using time 1526386257 when last update time is 1526386273 (minimum one second step)
2018-05-15 08:12:55 [22980] [0] RRDs::update /usr/local/nagios/share/perfdata/ACE-ntacerep/_HOST_.rrd 1526386342:167.215:0:188.599:138.562
2018-05-15 08:12:55 [22980] [0] RRDs::update ERROR /usr/local/nagios/share/perfdata/ACE-ntacerep/_HOST_.rrd: illegal attempt to update using time 1526386342 when last update time is 1526386359 (minimum one second step)
2018-05-15 08:14:55 [26352] [0] RRDs::update /usr/local/nagios/share/perfdata/ntkncdb/_HOST_.rrd 1526386466:164.024:0:180.802:150.197
2018-05-15 08:14:55 [26352] [0] RRDs::update ERROR /usr/local/nagios/share/perfdata/ntkncdb/_HOST_.rrd: illegal attempt to update using time 1526386466 when last update time is 1526386483 (minimum one second step)
Code: Select all
[ctretelea@ip-10-60-0-29] 08:50 $ tail -f /usr/local/nagios/var/npcd.log
[05-15-2018 08:46:56] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 08:46:56] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526388409.perfdata.service'
[05-15-2018 08:47:27] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 08:47:27] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526388439.perfdata.service'
[05-15-2018 08:47:57] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 08:47:57] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526388469.perfdata.service'
[05-15-2018 08:49:58] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 08:49:58] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526388589.perfdata.host'
[05-15-2018 08:50:28] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 08:50:28] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526388619.perfdata.service'
Re: NagiosXI graph problems
Let's clear out all of the old data in this folder.
Code: Select all
/var/nagiosramdisk/spool/perfdata
Stop the processes by running
Code: Select all
service nagios stop
service npcd stop
service crond stop
Delete all of the files in this folder
Code: Select all
/var/nagiosramdisk/spool/perfdata
Then start the processes.
Code: Select all
service crond start
service npcd start
service nagios start
But first, it looks like someone created a symlink to the /usr/local/nagios/share/perfdata folder.
Make sure the permissions for the link are set up correctly.
lrwxrwxrwx 1 nagios nagios 20 Feb 19 10:39 /usr/local/nagios/share/perfdata -> /apps/perf/perfdata/
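The stop, clear, start sequence described in this post could be scripted roughly as follows. This is only a sketch: the `run` and `clear_spool` helper names and the `DRY_RUN` switch are made up for illustration, and by default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch only: run, clear_spool and DRY_RUN are hypothetical names.
SPOOL=/var/nagiosramdisk/spool/perfdata

# Print the command unless DRY_RUN=0 is set, in which case execute it.
run() {
  if [ "${DRY_RUN:-1}" != 0 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

clear_spool() {
  # Stop in the order given in the post.
  for svc in nagios npcd crond; do
    run service "$svc" stop
  done
  # Delete the spooled perfdata files, keeping the directory itself.
  run find "$SPOOL" -type f -exec rm -f {} +
  # Start again in the reverse order.
  for svc in crond npcd nagios; do
    run service "$svc" start
  done
}

clear_spool
```

Run it once as is to review the printed commands, then as root with `DRY_RUN=0` to actually execute them.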
Re: NagiosXI graph problems
The symlink to the /usr/local/nagios/share/perfdata folder has the right permissions:
Code: Select all
[root@ip-10-60-0-29] 12:57 # ls -l /apps/perf
total 28
drwx------ 2 nagios nagios 16384 Feb 19 12:03 lost+found
drwxrwxr-x 321 nagios nagios 12288 May 14 16:41 perfdata
I cleaned the folder:
Code: Select all
[root@ip-10-60-0-29] 12:58 # ls -l /var/nagiosramdisk/spool/perfdata
total 516
-rw-r--r-- 1 nagios nagios 8957 May 15 12:54 1526403263.perfdata.host-PID-19857
-rw-r--r-- 1 nagios nagios 101072 May 15 12:54 1526403263.perfdata.service-PID-19859
-rw-r--r-- 1 nagios nagios 102743 May 15 12:55 1526403323.perfdata.service-PID-21507
-rw-r--r-- 1 nagios nagios 94952 May 15 12:55 1526403338.perfdata.service-PID-21950
-rw-r--r-- 1 nagios nagios 90396 May 15 12:55 1526403353.perfdata.service-PID-22540
-rw-r--r-- 1 nagios nagios 8894 May 15 12:59 1526403563.perfdata.host-PID-27908
-rw-r--r-- 1 nagios nagios 100449 May 15 12:59 1526403563.perfdata.service-PID-27909
But I still see the errors afterwards:
Code: Select all
[05-15-2018 12:53:15] NPCD: npcd Daemon (0.6.25) started with PID=16980
[05-15-2018 12:53:15] NPCD: Please have a look at 'npcd -V' to get license information
[05-15-2018 12:53:15] NPCD: HINT: load_threshold is enabled - ('10.000000')
[05-15-2018 12:54:46] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 12:54:46] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526403263.perfdata.host'
[05-15-2018 12:54:46] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 12:54:46] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526403263.perfdata.service'
[05-15-2018 12:55:34] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 12:55:34] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526403323.perfdata.service'
[05-15-2018 12:55:49] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 12:55:49] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526403338.perfdata.service'
[05-15-2018 12:56:04] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 12:56:04] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526403353.perfdata.service'
[05-15-2018 12:59:35] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 12:59:35] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526403563.perfdata.host'
Re: NagiosXI graph problems
Can you post one of the files from the /var/nagiosramdisk/spool/perfdata folder so I can view the contents of it?
They look larger than normal and I am guessing that could be the issue.
Then enable debugging for NPCD by editing this file
Code: Select all
/usr/local/nagios/etc/pnp/npcd.cfg
Change this from
Code: Select all
log_level = 0
to
Code: Select all
log_level = -1
Save the file and restart npcd by running
Code: Select all
service npcd restart
Give it a few minutes and check the npcd.log file and post the errors so we can view them.
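If you would rather script the log_level change than edit the file by hand, something like the sketch below should work. The `set_log_level` helper is a made-up name, and the demo runs against a scratch file rather than the real npcd.cfg; it also assumes `mktemp` is available on the system.

```shell
#!/bin/sh
# Sketch only: set_log_level is a hypothetical helper name.
# Rewrites the log_level line of the given config file in place.
set_log_level() {
  cfg=$1
  level=$2
  tmp=$(mktemp) || return 1
  # Write the edited copy to a temp file, then copy the contents back so the
  # original file keeps its owner and permissions.
  sed "s/^log_level[[:space:]]*=.*/log_level = $level/" "$cfg" > "$tmp" &&
    cat "$tmp" > "$cfg"
  rc=$?
  rm -f "$tmp"
  return $rc
}

# Demo on a scratch file that mimics the relevant line of npcd.cfg.
demo=$(mktemp)
printf '%s\n' '# scratch copy for testing' 'log_level = 0' > "$demo"
set_log_level "$demo" -1
grep '^log_level' "$demo"
rm -f "$demo"
```

Against the real file this would be `set_log_level /usr/local/nagios/etc/pnp/npcd.cfg -1` as root, followed by `service npcd restart`. Set it back to 0 the same way once you have the debug output.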
Re: NagiosXI graph problems
Hi tgriep,
I attached a file that contains the log files and one file from /var/nagiosramdisk/spool/perfdata/.
I also enabled debugging, and I see the same error as before:
Code: Select all
[05-16-2018 10:55:59] NPCD: DEBUG: load 0.430000/10.000000
[05-16-2018 10:55:59] NPCD: ThreadCounter 0/5 File is 1526482404.perfdata.service-PID-11644
[05-16-2018 10:55:59] NPCD: File '1526482404.perfdata.service-PID-11644' is an already in process PNP file. Leaving it untouched.
[05-16-2018 10:55:59] NPCD: DEBUG: load 0.430000/10.000000
If you need more info, please ask.
Thanks.
Re: NagiosXI graph problems
If you look at this example message
Code: Select all
2018-05-16 10:46:53 [32162] [0] RRDs::update ERROR /usr/local/nagios/share/perfdata/HL-APP1/_HOST_.rrd: illegal attempt to update using time 1526481962 when last update time is 1526481980 (minimum one second step)
It shows that the system is trying to update the rrd file 18 seconds in the past, so it will not update the rrd file.
And if you convert the Epoch times:
1526481962 converts to Wednesday May 16, 2018 10:46:02 (am) in time zone America/New York (EDT)
1526481980 converts to Wednesday May 16, 2018 10:46:20 (am) in time zone America/New York (EDT)
The time step between updates to the rrd files should be around 60 seconds by default, and an update should never be in the past.
I was hoping that this was caused by the duplicate npcd processes from a few days ago and that stopping them would fix it, but make sure there is only one running now.
Another thing to look at is the system's time and the file system time.
If the files in this folder
Code: Select all
/apps/perf/perfdata/
are not updating correctly, that would cause the errors.
How the file system updates file times is set by the mount options in the /etc/fstab file, so check that file for the settings, especially an option called noatime.
Some details about noatime can be found here.
http://en.tldp.org/LDP/solrhe/Securing- ... sec73.html
Also, you may want to enable NTP on the server to make sure the system time is correct and not drifting.
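To see how far behind each rejected update is across the whole log, a rough sketch like the one below could help. The `rrd_lag` and `epoch_to_utc` helpers are made-up names, and the sample line is the one quoted in this post. The date helper tries GNU `date` syntax first and falls back to the BSD form.

```shell
#!/bin/sh
# Sketch only: rrd_lag and epoch_to_utc are hypothetical helper names.

# Read perfdata.log lines on stdin. For each rejected update, print the RRD
# path and how many seconds the attempted time is behind the last update.
rrd_lag() {
  sed -n 's/.*RRDs::update ERROR \(.*\): illegal attempt to update using time \([0-9][0-9]*\) when last update time is \([0-9][0-9]*\).*/\1 \2 \3/p' |
    while read -r rrd attempted last; do
      echo "$rrd $((last - attempted))"
    done
}

# Convert an epoch value to a UTC timestamp (GNU date first, then BSD date).
epoch_to_utc() {
  date -u -d "@$1" '+%Y-%m-%d %H:%M:%S' 2>/dev/null ||
    date -u -r "$1" '+%Y-%m-%d %H:%M:%S'
}

# Sample error line from the log quoted in this thread.
sample='2018-05-16 10:46:53 [32162] [0] RRDs::update ERROR /usr/local/nagios/share/perfdata/HL-APP1/_HOST_.rrd: illegal attempt to update using time 1526481962 when last update time is 1526481980 (minimum one second step)'

printf '%s\n' "$sample" | rrd_lag
epoch_to_utc 1526481962
epoch_to_utc 1526481980
```

For the sample line this should report a lag of 18 seconds, and the two timestamps come out as 14:46:02 and 14:46:20 UTC, which is 10:46:02 and 10:46:20 EDT. To scan the real log, run `rrd_lag < /usr/local/nagios/var/perfdata.log`.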
Re: NagiosXI graph problems
Hi, my fstab is:
Code: Select all
#
# /etc/fstab
# Created by anaconda on Mon Feb 22 17:08:22 2016
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=ef6ba050-6cdc-416a-9380-c14304d0d206 / xfs defaults 0 0
/home/swapfile1 swap swap defaults 0 0
/dev/mapper/data_vg1-mysql_lv1 /var/lib/mysql ext4 defaults 0 0
/dev/mapper/data_vg1-perf_lv1 /apps/perf ext4 defaults 0 0
and the time on the system is up to date.
Is that ok?
Re: NagiosXI graph problems
The fstab entries look OK.
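For reference, the mount options can be checked mechanically with something like the sketch below. The `check_noatime` helper is a made-up name; it reads fstab-style lines on stdin and reports, per mount point, whether noatime appears in the options column. The two sample lines are taken from the fstab posted in this thread.

```shell
#!/bin/sh
# Sketch only: check_noatime is a hypothetical helper name.
# Reads fstab-style lines on stdin and prints: mountpoint options noatime=yes|no
check_noatime() {
  awk '$1 !~ /^#/ && NF >= 4 {
    n = split($4, opts, ",")
    flag = "no"
    for (i = 1; i <= n; i++) {
      if (opts[i] == "noatime") flag = "yes"
    }
    print $2, $4, "noatime=" flag
  }'
}

# Two entries from the fstab posted in this thread.
printf '%s\n' \
  'UUID=ef6ba050-6cdc-416a-9380-c14304d0d206 / xfs defaults 0 0' \
  '/dev/mapper/data_vg1-perf_lv1 /apps/perf ext4 defaults 0 0' |
  check_noatime
```

On the server itself this would be `check_noatime < /etc/fstab`, or `check_noatime < /proc/mounts` to see the options actually in effect.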
Can you run the following as root and post the output?
Code: Select all
grep "date.timezone" /etc/php.ini
ls -l /etc/localtime
php -r 'echo date("D M j G:i:s T Y")."\n";'
date
echo "SELECT NOW();" | mysql -u root -pnagiosxi
php -r "echo date('r').PHP_EOL;"
strings /etc/localtime | tail -1
cat /proc/mounts
Also, post these files as well.
Code: Select all
/usr/local/nagios/share/perfdata/HL-APP1/_HOST_.rrd
/usr/local/nagios/share/perfdata/HL-APP1/_HOST_.xml