Monitoring Process Issue
Posted: Tue Aug 26, 2014 9:10 am
by jwessels
Hi Support,
I have a problem with the process monitor: it seems to be running behind schedule.
I am also receiving this warning for all services in the eventlog:
Warning: The check of service 'nagiosxi-64 VM Status' on host 'tsamarvca.tharisa.com' looks like it was orphaned (results never came back; last_check=1409025921; next_check=1409026579). I'm scheduling an immediate check of the service...
Also, after applying the configuration when adding or editing hosts/services, it reports that the active host and service checks and notifications are disabled. This corrects itself after an hour.
This issue started after the disk filled up with backups.
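For reference, the last_check and next_check values in the warning above are Unix epoch seconds. Below is a minimal sketch for decoding them, assuming GNU date (as shipped with CentOS). The helper names epoch_to_utc and check_gap are made up for illustration and are not part of Nagios.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Nagios.

# Print an epoch timestamp as a UTC date (GNU date syntax).
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

# Print the number of seconds between two epoch timestamps.
check_gap() {
    echo $(( $2 - $1 ))
}

epoch_to_utc 1409025921          # last_check
epoch_to_utc 1409026579          # next_check
check_gap 1409025921 1409026579  # seconds between them
```

With the values from the warning this gives 04:05:21 and 04:16:19 UTC on 26 August 2014 (add two hours for SAST), 658 seconds apart. Comparing those times against the output of date shows how far behind the scheduler is.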
Re: Monitoring Process Issue
Posted: Tue Aug 26, 2014 1:01 pm
by lmiltchev
Do you have any database errors?
What is the output of the following command?
Code: Select all
grep embedded /usr/local/nagios/etc/nagios.cfg
Re: Monitoring Process Issue
Posted: Wed Aug 27, 2014 1:26 am
by jwessels
Hi
Here is the output
[root@tsamarnagios ~]# tail -25 /var/log/mysqld.log
140825 13:59:15 [Note] /usr/libexec/mysqld: Shutdown complete
140825 13:59:15 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
140825 13:59:27 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
140825 13:59:28 InnoDB: Initializing buffer pool, size = 8.0M
140825 13:59:28 InnoDB: Completed initialization of buffer pool
140825 13:59:28 InnoDB: Started; log sequence number 0 44243
140825 13:59:28 [Note] Event Scheduler: Loaded 0 events
140825 13:59:28 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.73' socket: '/var/lib/mysql/mysql.sock' port: 3306 Source distribution
140825 14:19:09 [Note] /usr/libexec/mysqld: Normal shutdown
140825 14:19:09 [Note] Event Scheduler: Purging the queue. 0 events
140825 14:19:11 InnoDB: Starting shutdown...
140825 14:19:15 InnoDB: Shutdown completed; log sequence number 0 44243
140825 14:19:15 [Note] /usr/libexec/mysqld: Shutdown complete
140825 14:19:15 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
140825 14:19:26 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
140825 14:19:26 InnoDB: Initializing buffer pool, size = 8.0M
140825 14:19:26 InnoDB: Completed initialization of buffer pool
140825 14:19:26 InnoDB: Started; log sequence number 0 44243
140825 14:19:26 [Note] Event Scheduler: Loaded 0 events
140825 14:19:26 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.1.73' socket: '/var/lib/mysql/mysql.sock' port: 3306 Source distribution
You have new mail in /var/spool/mail/root
[root@tsamarnagios ~]# grep embedded /usr/local/nagios/etc/nagios.cfg
enable_embedded_perl=0
use_embedded_perl_implicitly=0
Re: Monitoring Process Issue
Posted: Wed Aug 27, 2014 11:07 am
by tmcdonald
Is the timing off or is your system clock off?
Code: Select all
grep "date.timezone" /etc/php.ini
ls -l /etc/localtime
php -r 'echo date("D M j G:i:s T Y")."\n";'
date
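If it helps, the first two checks can be compared automatically. This is a rough sketch, assuming /etc/localtime is a symlink into a zoneinfo directory and that php.ini has an uncommented date.timezone line. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for comparing the PHP and system time zones.

# Print the zone name a localtime symlink points at,
# e.g. /usr/share/zoneinfo/Africa/Johannesburg -> Africa/Johannesburg
zone_from_link() {
    target=$(readlink "$1") || return 1
    echo "${target#*/zoneinfo/}"
}

# Print the value of the first uncommented date.timezone line in a php.ini.
zone_from_php_ini() {
    sed -n 's/^[[:space:]]*date\.timezone[[:space:]]*=[[:space:]]*//p' "$1" |
        tr -d '"' | head -n 1
}

# Only run the comparison when both files are present.
if [ -L /etc/localtime ] && [ -r /etc/php.ini ]; then
    sys_zone=$(zone_from_link /etc/localtime)
    php_zone=$(zone_from_php_ini /etc/php.ini)
    if [ "$sys_zone" = "$php_zone" ]; then
        echo "OK: both use $sys_zone"
    else
        echo "MISMATCH: system=$sys_zone php=$php_zone"
    fi
fi
```

A match only shows that the two settings agree. The php and date commands above are still needed to confirm that the clock itself is right.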
Re: Monitoring Process Issue
Posted: Thu Aug 28, 2014 4:29 am
by jwessels
Code: Select all
[root@tsamarnagios ~]# grep "date.timezone" /etc/php.ini
; http://www.php.net/manual/en/datetime.configuration.php#ini.date.timezone
date.timezone = Africa/Johannesburg
[root@tsamarnagios ~]# ls -l /etc/localtime
lrwxrwxrwx 1 root root 39 May 22 21:42 /etc/localtime -> /usr/share/zoneinfo/Africa/Johannesburg
[root@tsamarnagios ~]# php -r 'echo date("D M j G:i:s T Y")."\n";'
Thu Aug 28 8:28:34 SAST 2014
The local time looked wrong, so I used the following commands to set it:
Code: Select all
mv /etc/localtime /etc/localtime.bak
ln -s /usr/share/zoneinfo/Africa/Johannesburg /etc/localtime
lrwxrwxrwx 1 root root 39 Aug 28 08:35 /etc/localtime -> /usr/share/zoneinfo/Africa/Johannesburg
That cleared the warning about the orphaned checks, but the eventlog still shows that the checks are now an hour behind.
Capture.PNG
Re: Monitoring Process Issue
Posted: Thu Aug 28, 2014 10:20 am
by lmiltchev
Have you tried restarting the server after making the changes? You can also try running the following:
Code: Select all
service nagios stop
rm -f /usr/local/nagios/var/retention.dat
service nagios start
Note: The status of all of your checks will go to "Pending".
Re: Monitoring Process Issue
Posted: Tue Sep 02, 2014 12:27 am
by jwessels
Hi,
I ran the commands and removed the retention file, but this had no effect: the times of the events in the eventlog are still 8+ hours behind.
Restarting the server has the same effect as restarting the nagios service: active checks and notifications are disabled.
After restarting the appliance or the nagios service, or after adding hosts/services, I have to start the monitoring engine or the processing manually to enable the checks, or wait 30+ minutes for it to start automatically.
Here is the status of the monitoring process.
monitor.PNG
Re: Monitoring Process Issue
Posted: Tue Sep 02, 2014 3:52 pm
by sreinhardt
What version of Nagios XI are you presently running? Are you using any other NEB modules, such as mod_gearman or livestatus? Just to clarify, you did a full server reboot and the time did not correct itself? We may need to clear the retention.dat file so that checks are not scheduled so far in the future, since the schedule is likely partially, if not fully, off because of the previous time issues. Simply moving the file as shown below and restarting the nagios daemon will recreate it.
Code: Select all
mv /usr/local/nagios/var/retention.dat /usr/local/nagios/var/retention.dat.old
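A slightly more defensive variant of the move above, as a sketch only: it keeps a timestamped copy so that repeated attempts do not overwrite an earlier backup. The function name rotate_retention is hypothetical. The nagios service should be stopped before it runs and started again afterwards.

```shell
#!/bin/sh
# Hypothetical helper: move a retention file aside under a timestamped name
# and print the new path. Run it only while the nagios service is stopped.
rotate_retention() {
    file=$1
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    backup="$file.$(date +%Y%m%d%H%M%S).old"
    mv -- "$file" "$backup" && echo "$backup"
}

# Intended usage (commented out so the sketch has no side effects):
#   service nagios stop
#   rotate_retention /usr/local/nagios/var/retention.dat
#   service nagios start
```

As with the plain mv, all checks will show as "Pending" until they have run again.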
Re: Monitoring Process Issue
Posted: Tue Sep 09, 2014 1:27 am
by jwessels
Hi
No custom configurations or modules, except for the install of the vmware perl sdk and yum updates.
VMware 64bit appliance
XI 2014R1.4
CentOS release 6.5 (Final)
cpe:/o:centos:linux:6:GA
Full server reboot(s) or nagios service restart(s) do not correct the time.
Renaming or removing retention.dat does reset the checks to the correct time, but it doesn't keep up and ends up behind, and the warnings about orphaned checks appear again.
I do get the following when restarting the nagios service: Warning - nagios did not exit in a timely manner
After restarting the nagios service, the monitoring engine stops, or the process state is stopped, and I have to start it manually.
Re: Monitoring Process Issue
Posted: Tue Sep 09, 2014 9:06 am
by lmiltchev
If Nagios is not exiting in a timely manner, you can try following the steps outlined on our FAQ wiki page here:
http://support.nagios.com/wiki/index.ph ... ely_manner
Let us know if this helped.