Event Manager (eventman) stale

WVUhealth
Posts: 78
Joined: Tue Apr 24, 2012 1:50 pm

Event Manager (eventman) stale

Post by WVUhealth »

My Nagios XI is showing red on the Event Manager process.
It shows it last ran 4 days 20 hours ago.
I am still getting valid alerts.

/usr/bin/php /usr/local/nagios/libexec/check_nagiosxiserver.php --address=localhost --url=https://localhost/nagiosxi/ --username=nag --ticket="s7" --mode=jobs
Event Manager (eventman) stale (416759 seconds old), Event Manager (eventman) stale (416759 seconds old)

I have 4-day-old events showing up in /usr/local/nagiosxi/var/eventman.log
I have run the eventman line from /etc/cron.d/nagiosxi manually: nagios /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log

The few things still showing up in eventman.log are bandwidth checks on a system, and those are showing green. I validated by other means that they are within thresholds.


I tried the database maintenance I found on the support site: stopped nagios, ndo2db, and mysqld, restarted mysqld, ran /usr/local/nagiosxi/cron/dbmaint.php, then started the other two. Still the same results.
eventman is still showing stale.


Last updates on the box were
Sep 15 11:41:20 Updated: libudev-147-2.73.el6_8.2.x86_64
Sep 15 11:41:21 Updated: libgudev1-147-2.73.el6_8.2.x86_64
Sep 15 11:41:23 Updated: ipa-python-3.0.0-50.el6_8.2.x86_64
Sep 15 11:41:23 Updated: libarchive-2.8.3-7.el6_8.x86_64
Sep 15 11:41:26 Updated: udev-147-2.73.el6_8.2.x86_64
Sep 15 16:34:10 Installed: php-tidy-5.3.3-1.el6.rf.x86_64
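
The jobs check above reports the age of the last eventman run in raw seconds. As a quick sanity check against the "4 days 20 hours" shown in the interface, the value can be converted by hand. This is a minimal sketch; humanize_seconds is a hypothetical helper name, not part of Nagios XI.

```shell
# Convert an age in seconds (as reported by the jobs check) to days/hours/minutes.
# humanize_seconds is a hypothetical helper, not a Nagios XI command.
humanize_seconds() {
    secs=$1
    days=$((secs / 86400))              # whole days
    hours=$(((secs % 86400) / 3600))    # leftover whole hours
    mins=$(((secs % 3600) / 60))        # leftover whole minutes
    echo "${days}d ${hours}h ${mins}m"
}

humanize_seconds 416759   # value from the check output; prints "4d 19h 45m"
```

That puts the last successful run on the evening of Sep 15 if the check was taken around the time of the Sep 20 cron entries, which is the same day as the package changes listed above.
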
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Event Manager (eventman) stale

Post by rkennedy »

Usually this is because of cron. What is the output of service crond status and chage -l nagios?

Also, there are a few more culprits noted on this page which may help - https://support.nagios.com/kb/article.php?id=69
Former Nagios Employee
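
Alongside the crond and chage checks, it may help to confirm that the log file itself is being rewritten. The cron line quoted in the first post redirects eventman.php output with ">", so the file should be rewritten on every run and its modification time should stay recent. The sketch below assumes GNU stat (as shipped with CentOS/RHEL); file_age_seconds and is_stale are hypothetical helper names.

```shell
# file_age_seconds FILE: print the number of seconds since FILE was last modified.
# Hypothetical helper; relies on GNU stat (-c %Y prints mtime as epoch seconds).
file_age_seconds() {
    now=$(date +%s)
    mtime=$(stat -c %Y "$1") || return 1
    echo $((now - mtime))
}

# is_stale FILE MAX_AGE: exit 0 if FILE was modified more than MAX_AGE seconds ago,
# exit 1 if it is fresh, exit 2 if the file could not be inspected.
is_stale() {
    age=$(file_age_seconds "$1") || return 2
    [ "$age" -gt "$2" ]
}
```

For example, "is_stale /usr/local/nagiosxi/var/eventman.log 300 && echo stale" would report a log that has not been written in the last five minutes. A fresh log combined with a stale Event Manager status would point away from cron itself and toward what eventman.php is doing when it runs.
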
WVUhealth
Posts: 78
Joined: Tue Apr 24, 2012 1:50 pm

Re: Event Manager (eventman) stale

Post by WVUhealth »

I did check whether cron was running, and it was. I even re-HUPed it. Sorry I did not post that.
Sep 20 14:52:01 CROND[15095]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Sep 20 14:52:01 CROND[15097]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)
Sep 20 14:52:01 CROND[15098]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)
Sep 20 14:52:01 CROND[15099]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
Sep 20 14:52:01 CROND[15100]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1)
Sep 20 14:52:01 CROND[15103]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Sep 20 14:52:01 CROND[15105]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Sep 20 14:52:01 CROND[15101]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Sep 20 14:53:01 CROND[30775]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/reportengine.php > /usr/local/nagiosxi/var/reportengine.log 2>&1)
Sep 20 14:53:01 CROND[30776]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php > /usr/local/nagiosxi/var/cleaner.log 2>&1)
Sep 20 14:53:01 CROND[30777]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php > /usr/local/nagiosxi/var/eventman.log 2>&1)
Sep 20 14:53:01 CROND[30778]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/nom.php > /usr/local/nagiosxi/var/nom.log 2>&1)
Sep 20 14:53:01 CROND[30780]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php > /usr/local/nagiosxi/var/perfdataproc.log 2>&1)
Sep 20 14:53:01 CROND[30781]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php > /usr/local/nagiosxi/var/sysstat.log 2>&1)
Sep 20 14:53:01 CROND[30782]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php > /usr/local/nagiosxi/var/cmdsubsys.log 2>&1)
Sep 20 14:53:01 CROND[30779]: (nagios) CMD (/usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php > /usr/local/nagiosxi/var/feedproc.log 2>&1)


service crond status
crond (pid 1333) is running...

chage -l nagios
Last password change : Feb 21, 2013
Password expires : never
Password inactive : never
Account expires : never
Minimum number of days between password change : 0
Maximum number of days between password change : 99999
Number of days of warning before password expires : 7
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Event Manager (eventman) stale

Post by rkennedy »

Anything logging in eventman.log?

cat /usr/local/nagiosxi/var/eventman.log
Former Nagios Employee
WVUhealth
Posts: 78
Joined: Tue Apr 24, 2012 1:50 pm

Re: Event Manager (eventman) stale

Post by WVUhealth »

The output is from the events that happened on the 15th..


cat /usr/local/nagiosxi/var/eventman.log

PROCESS EVENT: ID=3988418, SOURCE=2, TYPE=2, TIME=2016-09-15 18:00:46
*** GLOBAL HANDLER...
Array
(
    [event_id] => 3988418
    [event_source] => 2
    [event_type] => 2
    [event_time] => 2016-09-15 18:00:46
    [event_meta] => Array
        (
            [notification-type] => service
            [contact] => tgreaser
            [contactemail] => tgreaser@localhost
            [type] => PROBLEM
            [escalated] => 0
            [author] => 
            [comments] => 
            [host] => server-switch-2342
            [hostaddress] => server-switch-2342
            [hostalias] => server-switch-2342
            [hostdisplayname] => server-switch-2342
            [service] => XIV-iSCSI1.2 Bandwidth
            [hoststate] => UP
            [hoststateid] => 0
            [servicestate] => CRITICAL
            [servicestateid] => 2
            [lastservicestate] => CRITICAL
            [lastservicestateid] => 2
            [servicestatetype] => HARD
            [currentattempt] => 5
            [maxattempts] => 5
            [serviceeventid] => 2391585
            [serviceproblemid] => 1125664
            [serviceoutput] => CRITICAL - Current BW in: 823.74Mbps Out: 2.84Mbps
            [longserviceoutput] => 
            [datetime] => Thu Sept 15 18:00:46 EDT 2016
        )

    [logging_enabled] => 1
)
*** GLOBAL HANDLER (snmptrapsender)...
Array
(
    [event_id] => 3988418
    [event_source] => 2
    [event_type] => 2
    [event_time] => 2016-09-15 18:00:46
    [event_meta] => Array
        (
            [notification-type] => service
            [contact] => tgreaser
            [contactemail] => tgreaser@localhost
            [type] => PROBLEM
            [escalated] => 0
            [author] => 
            [comments] => 
            [host] => server-switch-2342
            [hostaddress] => server-switch-2342
            [hostalias] => server-switch-2342
            [hostdisplayname] => server-switch-2342
            [service] => XIV-iSCSI1.2 Bandwidth
            [hoststate] => UP
            [hoststateid] => 0
            [servicestate] => CRITICAL
            [servicestateid] => 2
            [lastservicestate] => CRITICAL
            [lastservicestateid] => 2
            [servicestatetype] => HARD
            [currentattempt] => 5
            [maxattempts] => 5
            [serviceeventid] => 2391585
            [serviceproblemid] => 1125664
            [serviceoutput] => CRITICAL - Current BW in: 823.74Mbps Out: 2.84Mbps
            [longserviceoutput] => 
            [datetime] => Thu Sept 15 18:00:46 EDT 2016
        )

    [logging_enabled] => 1
)
Got XI user id for contact 'tgreaser': 101
An email notification will be sent...

Email Notification Data:

Array
(
    [from] => Nagios XI <nagios@localhost>
    [to] => tgreaser@localhost
    [subject] => PROBLEM Service Alert - server-switch-2342/XIV-iSCSI1.2 Bandwidth is CRITICAL
    [high_priority] => 0
    [message] => ***** Nagios XI Alert *****

Nagios has detected a problem with this service.

Notification Type: PROBLEM

Service: XIV-iSCSI1.2 Bandwidth
Host: server-switch-2342
Address: server-switch-2342
State: CRITICAL
Info:
CRITICAL - Current BW in: 823.74Mbps Out: 2.84Mbps
Date/Time: 09/21/2016 10:09:03

Respond: https://localhost/nagiosxi/rr.php?uid=101-11616-0304a913eb6904eb879458a212c5258c
Nagios URL: https://localhost/nagiosxi/

)
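
In the dump above, the event is timestamped 2016-09-15 18:00:46 while the email body is dated 09/21/2016, which suggests the same old event is still being picked up on later runs. A small sketch for pulling the most recent event time out of the log follows. It assumes the "PROCESS EVENT: ... TIME=" line format shown above; last_event_time is a hypothetical helper name.

```shell
# last_event_time LOGFILE: print the TIME= value of the last "PROCESS EVENT" line.
# Assumes the line format seen in eventman.log above, with TIME= as the final field.
last_event_time() {
    grep 'PROCESS EVENT:' "$1" | tail -n 1 | sed 's/.*TIME=//'
}
```

Running "last_event_time /usr/local/nagiosxi/var/eventman.log" after a few cron cycles shows whether the newest processed event ever advances past the 15th.
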
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Event Manager (eventman) stale

Post by rkennedy »

Could you PM over a profile for us to look at? This might have a bit more information in your logs where we'll be able to find something. (Admin -> System Profile -> Download Profile)
Former Nagios Employee
WVUhealth
Posts: 78
Joined: Tue Apr 24, 2012 1:50 pm

Re: Event Manager (eventman) stale

Post by WVUhealth »

Any headway with the support profile?

I had an outage yesterday that others were not notified of via email. (I'm glued to my anag app, so I knew of the issues.)
WVUhealth
Posts: 78
Joined: Tue Apr 24, 2012 1:50 pm

Re: Event Manager (eventman) stale

Post by WVUhealth »

OK.
I see that another sysadmin did a forced update of some PHP packages around the time eventman stopped working.
So it looks like I should roll back to the -47 release?

-rw-r--r-- 1 root root 2.2M Sep 15 16:33 php-cli-5.3.3-48.el6_8.x86_64.rpm
-rw-r--r-- 1 root root 530K Sep 15 16:33 php-common-5.3.3-48.el6_8.x86_64.rpm
-rw-r--r-- 1 root root 112K Sep 15 16:33 php-gd-5.3.3-48.el6_8.x86_64.rpm
-rw-r--r-- 1 root root 44K Sep 15 16:33 php-ldap-5.3.3-48.el6_8.x86_64.rpm
-rw-r--r-- 1 root root 87K Sep 15 16:33 php-mysql-5.3.3-48.el6_8.x86_64.rpm
-rw-r--r-- 1 root root 56K Sep 15 16:33 php-odbc-5.3.3-48.el6_8.x86_64.rpm
-rw-r--r-- 1 root root 80K Sep 15 16:33 php-pdo-5.3.3-48.el6_8.x86_64.rpm
-rw-r--r-- 1 root root 76K Sep 15 16:33 php-pgsql-5.3.3-48.el6_8.x86_64.rpm
-rw-r--r-- 1 root root 36K Sep 15 16:33 php-snmp-5.3.3-48.el6_8.x86_64.rpm
-rw-r--r-- 1 root root 146K Sep 15 16:33 php-soap-5.3.3-48.el6_8.x86_64.rpm
-rw-r--r-- 1 root root 109K Sep 15 16:33 php-xml-5.3.3-48.el6_8.x86_64.rpm
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Event Manager (eventman) stale

Post by rkennedy »

WVUhealth wrote: Any headway with the support profile?
I had an outage yesterday that others were not notified of via email.

I didn't receive a profile via PM, could you try resending it over to me?

WVUhealth wrote: I see that another sysadmin did a forced update of some PHP packages around the time eventman stopped working.
So it looks like I should roll back to the -47 release?

I don't think -48 is the issue, as that's what my system is running -

[root@localhost scripts]# rpm -qa | grep php
php-pear-HTML-Template-IT-1.3.0-2.el5.noarch
php-pdo-5.3.3-48.el6_8.x86_64
php-mysql-5.3.3-48.el6_8.x86_64
php-ldap-5.3.3-48.el6_8.x86_64
php-php-gettext-1.0.11-12.el6.noarch
php-pecl-ssh2-0.11.0-7.el6.x86_64
php-mcrypt-5.3.3-4.el6.x86_64
php-common-5.3.3-48.el6_8.x86_64
php-bcmath-5.3.3-48.el6_8.x86_64
php-process-5.3.3-48.el6_8.x86_64
php-tidy-5.3.3-48.el6_8.x86_64
php-tcpdf-dejavu-sans-fonts-6.2.11-1.el6.noarch
php-cli-5.3.3-48.el6_8.x86_64
php-pgsql-5.3.3-48.el6_8.x86_64
php-gd-5.3.3-48.el6_8.x86_64
php-mbstring-5.3.3-48.el6_8.x86_64
php-xml-5.3.3-48.el6_8.x86_64
phpMyAdmin-4.0.10.17-2.el6.noarch
php-pear-1.9.4-5.el6.noarch
php-5.3.3-48.el6_8.x86_64
php-snmp-5.3.3-48.el6_8.x86_64
php-tcpdf-6.2.11-1.el6.noarch
php-mssql-5.3.3-4.el6.x86_64
Unless he did it in a different way, which wouldn't work with Nagios. Are you able to ask him what all he did at the time?
Former Nagios Employee
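
One way to see whether a forced update left the PHP packages in a mixed state is to compare each package's version-release string against a reference package. The sketch below is an assumption about how one might do that, not an official Nagios procedure; mismatched_releases is a hypothetical helper that reads "NAME VERSION-RELEASE" lines on stdin. A mismatch is not necessarily a fault: packages from other repositories (php-mcrypt and php-mssql in the list above, for instance) legitimately carry different release tags.

```shell
# mismatched_releases REF: read "NAME VERSION-RELEASE" lines on stdin and print
# every package whose VERSION-RELEASE differs from that of package REF.
# Exits non-zero if REF itself is not present in the input.
mismatched_releases() {
    awk -v ref="$1" '
        { name[NR] = $1; rel[NR] = $2; if ($1 == ref) want = $2 }
        END {
            if (want == "") exit 1      # reference package not found
            for (i = 1; i <= NR; i++)
                if (rel[i] != want) print name[i], rel[i]
        }'
}
```

On an RPM-based system the input could come from "rpm -qa --qf '%{NAME} %{VERSION}-%{RELEASE}\n' 'php*' | mismatched_releases php-common", which lists only the packages that are out of step with php-common.
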
WVUhealth
Posts: 78
Joined: Tue Apr 24, 2012 1:50 pm

Re: Event Manager (eventman) stale

Post by WVUhealth »

I was able to verify that the rpm -Uvh was done on Sept 15th.

I downloaded the previous revision and installed it. As you said, it had no effect, so I just upgraded PHP back to the current revision.
The reason I thought it was the issue is that it happened around the time eventman stopped getting data to send notifications.
Last edited by WVUhealth on Sun Sep 25, 2016 8:07 am, edited 1 time in total.