NagiosXI graph problems

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
ctretelea
Posts: 59
Joined: Fri Feb 17, 2017 5:43 pm

Re: NagiosXI graph problems

Post by ctretelea »

Hi tgriep,
here are the results:

Code:

[root@ip-10-60-0-29] 16:53 # grep "date.timezone" /etc/php.ini
; http://php.net/date.timezone
date.timezone = US/Eastern
[root@ip-10-60-0-29] 16:53 # ls -l /etc/localtime
lrwxrwxrwx. 1 root root 30 Feb 17  2017 /etc/localtime -> /usr/share/zoneinfo/US/Eastern
[root@ip-10-60-0-29] 16:53 # php -r 'echo date("D M j G:i:s T Y")."\n";'
Wed May 16 16:53:24 EDT 2018
[root@ip-10-60-0-29] 16:53 # date
Wed May 16 16:53:30 EDT 2018
[root@ip-10-60-0-29] 16:53 # echo "SELECT NOW();" | mysql -u root -pnagiosxi
NOW()
2018-05-16 16:53:36
[root@ip-10-60-0-29] 16:53 # php -r "echo date('r').PHP_EOL;"
Wed, 16 May 2018 16:53:43 -0400
[root@ip-10-60-0-29] 16:53 # strings /etc/localtime | tail -1
EST5EDT,M3.2.0,M11.1.0
[root@ip-10-60-0-29] 16:53 # cat /proc/mounts
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
devtmpfs /dev devtmpfs rw,nosuid,size=8111820k,nr_inodes=2027955,mode=755 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,nodev,mode=755 0 0
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup rw,nosuid,nodev,noexec,relatime,net_prio,net_cls 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
configfs /sys/kernel/config configfs rw,relatime 0 0
/dev/xvda1 / xfs rw,relatime,attr2,inode64,noquota 0 0
rpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=28,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=14341 0 0
debugfs /sys/kernel/debug debugfs rw,relatime 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,relatime 0 0
mqueue /dev/mqueue mqueue rw,relatime 0 0
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
s3fs /store/backups/nagiosxi1 fuse.s3fs rw,noatime,user_id=0,group_id=0,allow_other 0 0
/dev/mapper/data_vg1-mysql_lv1 /var/lib/mysql ext4 rw,relatime,data=ordered 0 0
/dev/mapper/data_vg1-perf_lv1 /apps/perf ext4 rw,relatime,data=ordered 0 0
tmpfs /run/user/1004 tmpfs rw,nosuid,nodev,relatime,size=1626648k,mode=700,uid=1004,gid=1004 0 0
tmpfs /var/nagiosramdisk tmpfs rw,relatime,size=2097152k 0 0
tmpfs /run/user/1001 tmpfs rw,nosuid,nodev,relatime,size=1626648k,mode=700,uid=1001,gid=1001 0 0
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: NagiosXI graph problems

Post by tgriep »

A couple of things could cause this.
One is that the system time is out of sync or jittering, so the system is writing data into the past. I recommend enabling NTP on the server.

The other is that a duplicated process is trying to write the same data to the RRD files at different times.
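
As a rough way to check the first cause, the system clock can be compared with the last update time stored in an RRD file, since RRDtool refuses updates whose timestamp is not later than the last one recorded. This is only a sketch: the `RRD_FILE` path and the `would_be_rejected` helper are placeholders for illustration, not part of Nagios XI.

```shell
#!/bin/sh
# Succeed (exit 0) if an update stamped $1 would be refused by an RRD whose
# last update was at $2. Both values are Unix epoch seconds.
would_be_rejected() {
    [ "$1" -le "$2" ]
}

# Hypothetical RRD path; substitute one of your own perfdata files.
RRD_FILE="${RRD_FILE:-/path/to/host/service.rrd}"

# Only run the live comparison when rrdtool and the file are present.
if command -v rrdtool >/dev/null 2>&1 && [ -f "$RRD_FILE" ]; then
    last=$(rrdtool last "$RRD_FILE")   # epoch of the most recent update
    now=$(date +%s)
    if would_be_rejected "$now" "$last"; then
        echo "Clock ($now) is not ahead of the last RRD update ($last)"
    else
        echo "Clock is $((now - last)) seconds ahead of the last RRD update"
    fi
fi
```

If the clock turns out not to be ahead of the last update, time sync is the more likely culprit; otherwise the duplicate-process explanation is worth checking first.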

Let's stop all of the Nagios processes on the system and start them up again to be sure there are no duplicated performance data processes.
Run the following as root.

Code:

service npcd stop
service nagios stop
service ndo2db stop
service mysqld restart
rm -rf /usr/local/nagios/var/rw/nagios.cmd
rm -rf /usr/local/nagios/var/nagios.lock
rm -rf /usr/local/nagios/var/ndo.sock
rm -rf /usr/local/nagios/var/ndo2db.lock
rm -rf /usr/local/nagiosxi/var/reconfigure_nagios.lock
rm -rf /var/lib/mrtg/mrtg_l
for i in `ipcs -q | grep nagios |awk '{print $2}'`; do ipcrm -q $i; done
pkill -9 -u nagios
service httpd restart
service ndo2db start
service nagios start
service npcd start
service crond restart
Let the system run for 10 minutes and see if it starts to process the performance data.
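
After the restart, it may help to confirm that no duplicate daemon is left running. The following is a minimal sketch, assuming a single `npcd` process is expected on this system; `count_procs` is a hypothetical helper name, and other daemons may legitimately run several processes.

```shell
#!/bin/sh
# Count how many lines of a process listing exactly match a command name.
# Reads "ps -eo comm=" style input (one command name per line) on stdin.
#   $1: exact command name to count, e.g. npcd
count_procs() {
    awk -v name="$1" '$1 == name { n++ } END { print n + 0 }'
}

# Assumption: exactly one npcd process should exist after the restart.
count=$(ps -eo comm= | count_procs npcd)
if [ "$count" -gt 1 ]; then
    echo "WARNING: npcd is running $count times"
else
    echo "npcd process count: $count"
fi
```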

If it still fails, post the following file here.

Code:

/var/nagiosramdisk/status.dat
Be sure to check out our Knowledgebase for helpful articles and solutions!
ctretelea
Posts: 59
Joined: Fri Feb 17, 2017 5:43 pm

Re: NagiosXI graph problems

Post by ctretelea »

Hi tgriep,
I have the same issues. Please see the attached log files in temp.zip and the status.dat file in status.zip.

Thanks.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: NagiosXI graph problems

Post by tgriep »

Is there a chance we can move this to a ticket so we can get access to the live system to further troubleshoot this?

If so, you can use the following link to create a ticket.
https://support.nagios.com/tickets/

If you need help on creating the ticket, you can refer to this KB article.
https://support.nagios.com/kb/article/c ... r-769.html

One last question: do you have another Nagios server sending its check results to this server?
ctretelea
Posts: 59
Joined: Fri Feb 17, 2017 5:43 pm

Re: NagiosXI graph problems

Post by ctretelea »

Yes, we have Nagios Core running externally that sends passive check results to the Nagios XI server.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: NagiosXI graph problems

Post by tgriep »

If that core system is running the same checks as the XI server with the same configurations, that could be causing duplications in the performance data.
Try stopping it for a while to see if that stops the errors and allows the system to graph the data.
ctretelea
Posts: 59
Joined: Fri Feb 17, 2017 5:43 pm

Re: NagiosXI graph problems

Post by ctretelea »

Hi tgriep,
I stopped the Core systems from sending, and that didn't help; I still see errors in the log file.
Also, what is interesting is that the files being left untouched are the same files that were processed with an error.
The error in the npcd.log file:

Code:

NPCD: ERROR: Executed command exits with return code '13'
[05-23-2018 12:51:02] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1527094246.perfdata.service'
The file created after that is:

Code:

[root@ip-10-60-0-29] 12:58 # ls -l 1527094246.*
-rw-r--r-- 1 nagios nagios 74536 May 23 12:50 1527094246.perfdata.service-PID-14485
After that, we see these "untouched" messages in the log file:

Code:

[05-23-2018 12:51:02] 
[root@ip-10-60-0-29] 13:00 # grep 1527094246.perfdata.service-PID-14485 /usr/local/nagios/var/npcd.log
[05-23-2018 12:51:19] NPCD: ThreadCounter 0/5 File is 1527094246.perfdata.service-PID-14485
[05-23-2018 12:51:19] NPCD: File '1527094246.perfdata.service-PID-14485' is an already in process PNP file. Leaving it untouched.
[05-23-2018 12:51:34] NPCD: ThreadCounter 0/5 File is 1527094246.perfdata.service-PID-14485
[05-23-2018 12:51:34] NPCD: File '1527094246.perfdata.service-PID-14485' is an already in process PNP file. Leaving it untouched.
[05-23-2018 12:51:52] NPCD: ThreadCounter 0/5 File is 1527094246.perfdata.service-PID-14485
[05-23-2018 12:51:52] NPCD: File '1527094246.perfdata.service-PID-14485' is an already in process PNP file. Leaving it untouched.
[05-23-2018 12:52:09] NPCD: ThreadCounter 0/5 File is 1527094246.perfdata.service-PID-14485
[05-23-2018 12:52:09] NPCD: File '1527094246.perfdata.service-PID-14485' is an already in process PNP file. Leaving it untouched.
[05-23-2018 12:52:26] NPCD: ThreadCounter 0/5 File is 1527094246.perfdata.service-PID-14485
[05-23-2018 12:52:26] NPCD: File '1527094246.perfdata.service-PID-14485' is an already in process PNP file. Leaving it untouched.
[05-23-2018 12:52:43] NPCD: ThreadCounter 0/5 File is 1527094246.perfdata.service-PID-14485
[05-23-2018 12:52:43] NPCD: File '1527094246.perfdata.service-PID-14485' is an already in process PNP file. Leaving it untouched.
[05-23-2018 12:53:00] NPCD: ThreadCounter 0/5 File is 1527094246.perfdata.service-PID-14485
[05-23-2018 12:53:00] NPCD: File '1527094246.perfdata.service-PID-14485' is an already in process PNP file. Leaving it untouched.
[05-23-2018 12:53:17] NPCD: ThreadCounter 0/5 File is 1527094246.perfdata.service-PID-14485
[05-23-2018 12:53:17] NPCD: File '1527094246.perfdata.service-PID-14485' is an already in process PNP file. Leaving it untouched.
....
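
To see how many spool files are in this state, the leftover files with a `-PID-` suffix can be listed by age. This is a sketch only: the spool path is taken from the npcd command line in the log above, `list_stuck_perfdata` and the 5-minute threshold are arbitrary choices for illustration, and `-maxdepth` and `-mmin` are extensions found in GNU and BSD `find` rather than POSIX options.

```shell
#!/bin/sh
# List perfdata spool files that carry a "-PID-" suffix and have not been
# modified for more than a given number of minutes.
#   $1: spool directory
#   $2: age threshold in minutes
list_stuck_perfdata() {
    find "$1" -maxdepth 1 -type f -name '*-PID-*' -mmin +"$2"
}

# Spool path as it appears in the npcd log; override via SPOOL_DIR if needed.
SPOOL_DIR="${SPOOL_DIR:-/var/nagiosramdisk/spool/perfdata}"

# Only list files when the spool directory actually exists on this host.
if [ -d "$SPOOL_DIR" ]; then
    list_stuck_perfdata "$SPOOL_DIR" 5
fi
```

The script only lists the files. What to do with them is best decided together with support.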

tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: NagiosXI graph problems

Post by tmcdonald »

I see you have opened a ticket in our ticketing system as well, so I will be closing this thread. We will continue in the ticketing system.
Former Nagios employee
Locked