Nagios XI Event Handlers skipping few events
-
- Posts: 9
- Joined: Wed Apr 25, 2018 8:06 pm
Nagios XI Event Handlers skipping few events
Hi,
We have 1000+ hosts and 9000+ services configured in Nagios XI. We have configured event handlers (a shell script) under Global Event Handlers. When host/service events are triggered (as part of checks), the event handler shell script is called as configured. The script composes a message and posts it to RabbitMQ. An internal application reads from this queue and processes the message; later, the notification is sent to an internal dashboard tool.
But a few notifications are missed, meaning no message arrives in RabbitMQ, and our dashboards show the wrong status for the host/service. When we check in Nagios XI, the host/service status is correct and NO flapping is detected. No scheduled downtime is configured for any host/service.
We have checked eventman.log and could not find any trace of the missed host/service notifications.
Could anyone help me with this?
Version: Nagios XI 5.4.10
-
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: Nagios XI Event Handlers skipping few events
Do you see the events you are looking for in the "State History" report?
-
- Posts: 9
- Joined: Wed Apr 25, 2018 8:06 pm
Re: Nagios XI Event Handlers skipping few events
This scenario is happening more frequently. I could find an entry under Report/State History (state changes from OK to non-OK and vice versa), but not under Report/Event Log.
Re: Nagios XI Event Handlers skipping few events
Please PM me a copy of your profile; you can download it from Admin > System Profile > Download Profile.
Please also send the timestamp (of the change event you saw in XI), hostname, and service name of one event that you know was not sent, so we can analyze the configuration.
Thank you
-
- Posts: 9
- Joined: Wed Apr 25, 2018 8:06 pm
Re: Nagios XI Event Handlers skipping few events
As you requested, I have sent a PM along with Nagios System Profile.
Re: Nagios XI Event Handlers skipping few events
Received, thank you.
Please PM me the associated event handler scripts and the full commands configured in Admin > Manage Components > Global Event Handlers as well.
See if this alleviates it:
In your /usr/local/nagios/etc/nagios.cfg change this:
Code: Select all
event_handler_timeout=30
To this:
Code: Select all
event_handler_timeout=60
I'm wondering if some may be hitting the timeout.
Then restart the nagios service:
Code: Select all
systemctl restart nagios
You can also set log_event_handlers=1 in your nagios.cfg to hopefully see what's occurring. Watch the size of your /usr/local/nagios/var/nagios.log file; I don't want you to run out of space if you are limited. You must restart the nagios service after any nagios.cfg changes.
Thank you
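With log_event_handlers=1 enabled, something like the following could help check whether handlers ran (or timed out) for a particular host. This is only a rough sketch: `find_handler_lines` is a hypothetical helper name, and it assumes that event handler entries and timeout warnings in nagios.log contain the words "event handler" (in any case) along with the host name, so look at a few lines of your own log before relying on it.

```shell
#!/bin/bash
# Sketch: list event-handler-related lines for one host from a Nagios log.
# Assumption: such lines contain "event handler" (any case) and the host name.
# find_handler_lines is a hypothetical helper name, not part of Nagios XI.
find_handler_lines() {
    local log_file="$1" host="$2"
    # -i: case-insensitive match; -F: treat the host name as a fixed string
    grep -i 'event handler' "$log_file" | grep -F -- "$host"
}

# Example (log path taken from the post above, host name is a placeholder):
# find_handler_lines /usr/local/nagios/var/nagios.log myhost | grep -ci 'timed out'
```

If a state change appears in State History but no matching line shows up here, the handler was probably never started for that event; a "timed out" line would instead point at the script itself running too long.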
-
- Posts: 9
- Joined: Wed Apr 25, 2018 8:06 pm
Re: Nagios XI Event Handlers skipping few events
We have changed *event_handler_timeout* to 60 as you suggested and restarted Nagios.
Regarding the *log_event_handlers* configuration change, we are evaluating the system configuration.
I have sent the requested information over PM.
Re: Nagios XI Event Handlers skipping few events
One possible cause is that the time zone of the server is not set consistently; that sometimes causes the issues you are having.
PHP Timezone: America/Los_Angeles
PHP Time: Mon, 06 Apr 2020 05:40:53 -0700
System Time: Mon, 06 Apr 2020 08:40:53 -0400
Follow this article to fix that:
https://support.nagios.com/kb/article/n ... e-152.html
See if that helps.
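A quick way to spot this kind of mismatch from the command line might look like the sketch below. `offsets_match` is a hypothetical helper name, and note the assumption that the PHP CLI reads the same time zone setting as the PHP used by the web interface, which is not guaranteed on every install.

```shell
#!/bin/bash
# Sketch: compare the UTC offset PHP reports with the one the system reports.
# offsets_match is a hypothetical helper name.
offsets_match() {
    # $1 = PHP offset, $2 = system offset, both in the form -0400
    if [ "$1" = "$2" ]; then
        echo "match"
    else
        echo "mismatch: php=$1 system=$2"
    fi
}

# Only query PHP when the CLI is installed.
# Assumption: the CLI uses the same date.timezone as the web server's PHP.
if command -v php >/dev/null 2>&1; then
    offsets_match "$(php -r 'echo date("O");')" "$(date +%z)"
fi
```

With the values quoted above (-0700 for PHP, -0400 for the system), this would report a mismatch.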
Be sure to check out our Knowledgebase for helpful articles and solutions!
-
- Posts: 9
- Joined: Wed Apr 25, 2018 8:06 pm
Re: Nagios XI Event Handlers skipping few events
We have changed the Nagios XI time zone to "US/Eastern" as per https://support.nagios.com/kb/article/n ... e-152.html.
The issue occurs less frequently, but we still see it.
We will keep observing the behavior and will update if there is any change.
Re: Nagios XI Event Handlers skipping few events
Your script is logging to this file.
Code: Select all
/usr/local/nagios/libexec/nagiosalerts/send-alert-request.log
Do you see the entries there?
One question: you are running that script when a state changes and also when a notification happens.
Are you missing entries for a state change, a notification, or both?
Also, run the following as root on the Nagios server and post the output.
Code: Select all
echo 'select * from xi_options;' | mysql -u root -pnagiosxi nagiosxi |grep globaleventhandler_component_options
This will show all of the options for the eventhandler command.
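To tell apart an event the handler never saw from one where the handler ran but the post to the queue failed, one option is to have the script write a line as its very first step, before composing or sending anything. The following is a sketch under assumptions: `log_invocation` and the example log path are hypothetical, and the arguments are simply whatever your global event handler command passes to the script.

```shell
#!/bin/bash
# Sketch: record every event handler invocation before doing any other work.
# log_invocation and the example log path are hypothetical names.
log_invocation() {
    local log_file="$1"
    shift
    # One line per call: timestamp followed by the handler arguments.
    printf '%s %s\n' "$(date '+%Y-%m-%dT%H:%M:%S%z')" "$*" >> "$log_file"
}

# Example, as the first line of the handler script:
# log_invocation /tmp/handler-invocations.log "$@"
```

Comparing this file against State History for a known missed event should show whether the script was called at all for that state change.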