NagiosXI graph problems

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
ctretelea
Posts: 59
Joined: Fri Feb 17, 2017 5:43 pm

Re: NagiosXI graph problems

Post by ctretelea »

Hi tgriep,
I checked and all files have nagios:nagios permissions in both locations.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Post by tgriep »

Can you run the following as root and attach the resulting /tmp/info.txt file to your reply?

Code: Select all

ps -ef --cols=300 >/tmp/info.txt
df -h >>/tmp/info.txt
df -i >>/tmp/info.txt
ls -lR /var/nagiosramdisk/spool >>/tmp/info.txt
ls -lR /usr/local/nagios/share/perfdata >>/tmp/info.txt
Are the errors still getting added to those log files?
Be sure to check out our Knowledgebase for helpful articles and solutions!

Post by ctretelea »

Hi tgriep,
I attached the file.

Yes, I still get errors in the log files:

Code: Select all

[ctretelea@ip-10-60-0-29] 08:50 $ tail -f /usr/local/nagios/var/perfdata.log
2018-05-15 08:08:53 [16190] [0] RRDs::update /usr/local/nagios/share/perfdata/ACE-ntacerep/_HOST_.rrd 1526386111:173.002:0:232.010:137.438
2018-05-15 08:08:53 [16190] [0] RRDs::update ERROR /usr/local/nagios/share/perfdata/ACE-ntacerep/_HOST_.rrd: illegal attempt to update using time 1526386111 when last update time is 1526386128 (minimum one second step)
2018-05-15 08:10:54 [19542] [0] RRDs::update /usr/local/nagios/share/perfdata/ACE-ntacerep/_HOST_.rrd 1526386227:149.578:0:178.561:124.684
2018-05-15 08:10:54 [19542] [0] RRDs::update ERROR /usr/local/nagios/share/perfdata/ACE-ntacerep/_HOST_.rrd: illegal attempt to update using time 1526386227 when last update time is 1526386244 (minimum one second step)
2018-05-15 08:11:24 [20463] [0] RRDs::update /usr/local/nagios/share/perfdata/petfood-OAKWMSSQLPRD01/_HOST_.rrd 1526386257:341.859:0:582.493:203.694
2018-05-15 08:11:24 [20463] [0] RRDs::update ERROR /usr/local/nagios/share/perfdata/petfood-OAKWMSSQLPRD01/_HOST_.rrd: illegal attempt to update using time 1526386257 when last update time is 1526386273 (minimum one second step)
2018-05-15 08:12:55 [22980] [0] RRDs::update /usr/local/nagios/share/perfdata/ACE-ntacerep/_HOST_.rrd 1526386342:167.215:0:188.599:138.562
2018-05-15 08:12:55 [22980] [0] RRDs::update ERROR /usr/local/nagios/share/perfdata/ACE-ntacerep/_HOST_.rrd: illegal attempt to update using time 1526386342 when last update time is 1526386359 (minimum one second step)
2018-05-15 08:14:55 [26352] [0] RRDs::update /usr/local/nagios/share/perfdata/ntkncdb/_HOST_.rrd 1526386466:164.024:0:180.802:150.197
2018-05-15 08:14:55 [26352] [0] RRDs::update ERROR /usr/local/nagios/share/perfdata/ntkncdb/_HOST_.rrd: illegal attempt to update using time 1526386466 when last update time is 1526386483 (minimum one second step)

Code: Select all

[ctretelea@ip-10-60-0-29] 08:50 $ tail -f /usr/local/nagios/var/npcd.log
[05-15-2018 08:46:56] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 08:46:56] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526388409.perfdata.service'
[05-15-2018 08:47:27] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 08:47:27] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526388439.perfdata.service'
[05-15-2018 08:47:57] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 08:47:57] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526388469.perfdata.service'
[05-15-2018 08:49:58] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 08:49:58] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526388589.perfdata.host'
[05-15-2018 08:50:28] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 08:50:28] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526388619.perfdata.service'

Post by tgriep »

Let's clear out all of the old data in this folder.

Code: Select all

/var/nagiosramdisk/spool/perfdata
Stop the processes by running

Code: Select all

service nagios stop
service npcd stop
service crond stop
Delete all of the files in this folder

Code: Select all

/var/nagiosramdisk/spool/perfdata
Then start the processes.

Code: Select all

service crond start
service npcd start
service nagios start
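
The delete step in the middle of that sequence could be sketched as below. This is only an illustration: `clear_spool` is a made-up helper name, not something shipped with Nagios XI, and it assumes a `find` that supports `-maxdepth` and `-delete` (GNU and BSD `find` both do).

```shell
# Hypothetical helper: remove every regular file directly inside the given
# spool directory, leaving the directory itself in place.
clear_spool() {
    [ -d "$1" ] || { echo "not a directory: $1" >&2; return 1; }
    find "$1" -maxdepth 1 -type f -delete
}

# Usage on the server from this thread (run as root, with nagios, npcd and
# crond stopped first, then started again afterwards):
#   clear_spool /var/nagiosramdisk/spool/perfdata
```

Running it while npcd is still active would race with files that are being processed, which is why the stop commands come first.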

But first, it looks like someone created a symlink for the /usr/local/nagios/share/perfdata folder:
lrwxrwxrwx 1 nagios nagios 20 Feb 19 10:39 /usr/local/nagios/share/perfdata -> /apps/perf/perfdata/
Make sure the permissions on the link and its target are set up correctly.
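
A quick way to look at what the link resolves to, as a rough sketch. The function name `show_link_target` is invented for this example; `readlink -f` and `ls -ld` are standard GNU coreutils commands.

```shell
# Hypothetical helper: resolve a symlink and list the owner and mode of the
# directory it finally points to.
show_link_target() {
    target=$(readlink -f "$1") || return 1
    ls -ld "$target"
}

# Usage on the server from this thread (run as root):
#   show_link_target /usr/local/nagios/share/perfdata
```

The owner and group in the output should both be nagios, and the mode should allow the nagios user to write.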

Post by ctretelea »

The symlink target for the /usr/local/nagios/share/perfdata folder has the right permissions:

Code: Select all

[root@ip-10-60-0-29] 12:57 # ls -l /apps/perf
total 28
drwx------   2 nagios nagios 16384 Feb 19 12:03 lost+found
drwxrwxr-x 321 nagios nagios 12288 May 14 16:41 perfdata
I cleaned the folder:

Code: Select all

[root@ip-10-60-0-29] 12:58 # ls -l /var/nagiosramdisk/spool/perfdata
total 516
-rw-r--r-- 1 nagios nagios   8957 May 15 12:54 1526403263.perfdata.host-PID-19857
-rw-r--r-- 1 nagios nagios 101072 May 15 12:54 1526403263.perfdata.service-PID-19859
-rw-r--r-- 1 nagios nagios 102743 May 15 12:55 1526403323.perfdata.service-PID-21507
-rw-r--r-- 1 nagios nagios  94952 May 15 12:55 1526403338.perfdata.service-PID-21950
-rw-r--r-- 1 nagios nagios  90396 May 15 12:55 1526403353.perfdata.service-PID-22540
-rw-r--r-- 1 nagios nagios   8894 May 15 12:59 1526403563.perfdata.host-PID-27908
-rw-r--r-- 1 nagios nagios 100449 May 15 12:59 1526403563.perfdata.service-PID-27909
But I still see the errors afterwards:

Code: Select all

[05-15-2018 12:53:15] NPCD: npcd Daemon (0.6.25) started with PID=16980
[05-15-2018 12:53:15] NPCD: Please have a look at 'npcd -V' to get license information
[05-15-2018 12:53:15] NPCD: HINT: load_threshold is enabled - ('10.000000')
[05-15-2018 12:54:46] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 12:54:46] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526403263.perfdata.host'
[05-15-2018 12:54:46] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 12:54:46] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526403263.perfdata.service'
[05-15-2018 12:55:34] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 12:55:34] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526403323.perfdata.service'
[05-15-2018 12:55:49] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 12:55:49] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526403338.perfdata.service'
[05-15-2018 12:56:04] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 12:56:04] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526403353.perfdata.service'
[05-15-2018 12:59:35] NPCD: ERROR: Executed command exits with return code '13'
[05-15-2018 12:59:35] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1526403563.perfdata.host'

Post by tgriep »

Can you post one of the files from the /var/nagiosramdisk/spool/perfdata folder so I can view the contents of it?

They look larger than normal and I am guessing that could be the issue.

Then enable debugging for NPCD by editing this file:

Code: Select all

/usr/local/nagios/etc/pnp/npcd.cfg
Change this from

Code: Select all

log_level = 0
to

Code: Select all

log_level = -1
Save the file and restart npcd by running

Code: Select all

service npcd restart
Give it a few minutes and check the npcd.log file and post the errors so we can view them.
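
If you would rather not edit the file by hand, the change can be sketched with `sed`. The helper name `enable_npcd_debug` is made up for this example, and it assumes the setting is written on a line of its own as `log_level = 0`.

```shell
# Hypothetical helper: switch NPCD's log_level from 0 to -1 in the given
# config file, keeping a .bak copy of the original next to it.
enable_npcd_debug() {
    sed -i.bak 's/^log_level[[:space:]]*=[[:space:]]*0[[:space:]]*$/log_level = -1/' "$1"
}

# Usage on the server from this thread (run as root):
#   enable_npcd_debug /usr/local/nagios/etc/pnp/npcd.cfg
#   service npcd restart
```

The `.bak` copy makes it easy to put the original setting back once the debug output has been collected.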

Post by ctretelea »

Hi tgriep,
I attached a file that contains the log files and one file from /var/nagiosramdisk/spool/perfdata/.
I also enabled debugging, and I see the same error as before:

Code: Select all

[05-16-2018 10:55:59] NPCD: DEBUG: load 0.430000/10.000000
[05-16-2018 10:55:59] NPCD: ThreadCounter 0/5 File is 1526482404.perfdata.service-PID-11644
[05-16-2018 10:55:59] NPCD: File '1526482404.perfdata.service-PID-11644' is an already in process PNP file. Leaving it untouched.
[05-16-2018 10:55:59] NPCD: DEBUG: load 0.430000/10.000000
If you need more info, please ask.
Thanks.

Post by tgriep »

If you look at this example message

Code: Select all

2018-05-16 10:46:53 [32162] [0] RRDs::update ERROR /usr/local/nagios/share/perfdata/HL-APP1/_HOST_.rrd: illegal attempt to update using time 1526481962 when last update time is 1526481980 (minimum one second step)
and convert the epoch times:
1526481962 converts to Wednesday May 16, 2018 10:46:02 (am) in time zone America/New York (EDT)
1526481980 converts to Wednesday May 16, 2018 10:46:20 (am) in time zone America/New York (EDT)
This shows that the system is trying to write an update timestamped 18 seconds earlier than the last update already stored in the rrd file, so the update is rejected.
The step between updates to the rrd files should be around 60 seconds by default, but an update should never be timestamped in the past.
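
The same arithmetic can be done in the shell. This is a small sketch with made-up helper names (`rrd_lag`, `epoch_to_utc`); it assumes GNU `date` for the `-d @EPOCH` syntax and RRD paths without spaces, and it prints UTC rather than EDT.

```shell
# Hypothetical helper: read perfdata.log lines on stdin and, for each
# "illegal attempt to update" error, print the RRD file and the number of
# seconds by which the attempted update lags the last stored update.
rrd_lag() {
    sed -n 's/.*RRDs::update ERROR \(.*\): illegal attempt to update using time \([0-9]*\) when last update time is \([0-9]*\).*/\1 \2 \3/p' |
    while read -r file attempted last; do
        echo "$file $((last - attempted))"
    done
}

# Hypothetical helper: convert an epoch value to a readable UTC time
# (GNU date syntax).
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

# Usage:
#   rrd_lag < /usr/local/nagios/var/perfdata.log
#   epoch_to_utc 1526481962
```

For the example message above, `rrd_lag` reports a lag of 18 seconds for HL-APP1/_HOST_.rrd.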

I was hoping that this was caused by the duplicate npcd processes from a few days ago and that stopping them would fix it. Make sure there is only one npcd process running now.
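
One way to count them, as a sketch. `count_npcd` is a hypothetical helper that reads one process name per line on stdin, which is the format `ps -e -o comm=` produces on Linux.

```shell
# Hypothetical helper: count input lines that are exactly "npcd".
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function's exit status clean in that case.
count_npcd() {
    grep -c -x 'npcd' || true
}

# Usage:
#   ps -e -o comm= | count_npcd
```

Anything other than 1 means npcd is either not running or running more than once.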

Another thing to look at is the system time and the file system time.
If the files in this folder

Code: Select all

/apps/perf/perfdata/
are not updating correctly, that would cause the errors.

How the file system records file times is controlled by the mount options set in the /etc/fstab file, so check that file for the settings, especially an option called noatime.
Some details about noatime can be found here.
http://en.tldp.org/LDP/solrhe/Securing- ... sec73.html
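
Besides /etc/fstab, the options actually in effect can be read from /proc/mounts. A minimal sketch, with helper names (`mount_options`, `has_noatime`) invented for this example:

```shell
# Hypothetical helper: print the mount options recorded for one mount point.
# Input is in /proc/mounts format: device, mount point, type, options, ...
# An alternative file can be passed as the second argument.
mount_options() {
    awk -v mp="$1" '$2 == mp { print $4 }' "${2:-/proc/mounts}"
}

# Hypothetical helper: succeed if the mount point is mounted with noatime.
has_noatime() {
    mount_options "$1" "${2:-/proc/mounts}" | tr ',' '\n' | grep -q -x 'noatime'
}

# Usage:
#   mount_options /apps/perf
#   has_noatime /apps/perf && echo "noatime is set"
```

Use the mount point itself (/apps/perf), not the perfdata subfolder, since that is what appears in the second column.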

Also, you may want to enable NTP on the server to make sure the system time is correct and not drifting.

Post by ctretelea »

Hi, my fstab is:

Code: Select all

#
# /etc/fstab
# Created by anaconda on Mon Feb 22 17:08:22 2016
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=ef6ba050-6cdc-416a-9380-c14304d0d206 /                       xfs     defaults        0 0
/home/swapfile1 swap swap defaults 0 0
/dev/mapper/data_vg1-mysql_lv1  /var/lib/mysql  ext4    defaults        0 0
/dev/mapper/data_vg1-perf_lv1 /apps/perf                ext4    defaults        0 0
and the time on the system is up to date.

Is that ok?

Post by tgriep »

The fstab entries look OK.

Can you run the following as root and post the output?

Code: Select all

grep "date.timezone" /etc/php.ini
ls -l /etc/localtime
php -r 'echo date("D M j G:i:s T Y")."\n";'
date
echo "SELECT NOW();" | mysql -u root -pnagiosxi
php -r "echo date('r').PHP_EOL;"
strings /etc/localtime | tail -1
cat /proc/mounts
Please post these files as well.

Code: Select all

/usr/local/nagios/share/perfdata/HL-APP1/_HOST_.rrd
/usr/local/nagios/share/perfdata/HL-APP1/_HOST_.xml