Multiple events created for the same alert.
I get multiple events in the eventman.log for the same error, resulting in multiple notifications (see attachment).
In the console I see only one event.
Current XI version is: 5.6.14
-
benjaminsmith
- Posts: 5324
- Joined: Wed Aug 22, 2018 4:39 pm
- Location: saint paul
Re: Multiple events created for the same alert.
Hi,
This is most likely caused by multiple Nagios processes running on the server. To remedy this, let's try killing all the Nagios processes and restarting all the XI services. The following commands will work on CentOS 7 and may need to be adjusted for other operating systems.
Code:
# Stop cron and the Nagios XI services
systemctl stop crond
systemctl stop npcd
systemctl stop nagios
systemctl stop ndo2db
# Kill any leftover processes owned by the nagios user
pkill -9 -u nagios
# Remove stale message queues that mention nagios
for i in $(ipcs -q | grep nagios | awk '{print $2}'); do ipcrm -q "$i"; done
# Clear stale lock files
rm -rf /usr/local/nagiosxi/var/dbmaint.lock
rm -rf /usr/local/nagiosxi/var/event_handler.lock
rm -rf /usr/local/nagiosxi/scripts/reconfigure_nagios.lock
# Bring the database and services back up, cron last
systemctl restart mariadb
systemctl start ndo2db
systemctl start nagios
systemctl start npcd
systemctl start crond
Hopefully that will take care of the problem, but if the issue persists, please send us your system profile and we'll take a closer look at the logs. Thanks, Benjamin
To send us your system profile:
Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Save the profile.zip file and share it in a private message or upload it to the post/ticket, and then reply to this post to bring it up in the queue.
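Before killing anything, it can help to confirm that more than one Nagios daemon really is running. The sketch below is only an illustration: it takes the text printed by `ps -eo pid,ppid,args` and lists processes that look like separate master daemons. It assumes the daemon is started with a path ending in `nagios.cfg` and that a daemonized master has PID 1 as its parent, which may not hold on every system. The function name `find_nagios_masters` and the sample PIDs are hypothetical.

```python
def find_nagios_masters(ps_output, config_hint="nagios.cfg"):
    """Return PIDs that look like independent Nagios master daemons.

    ps_output is the text printed by `ps -eo pid,ppid,args`.
    Assumption: a master daemon mentions nagios.cfg in its arguments and
    has been reparented to PID 1; workers and forked children have the
    master as their parent and are therefore ignored.
    """
    masters = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, ppid, args = parts
        # The header row ("PID PPID COMMAND") fails this check and is skipped.
        if not (pid.isdigit() and ppid.isdigit()):
            continue
        if config_hint in args and int(ppid) == 1:
            masters.append(int(pid))
    return masters


# Hypothetical sample: two masters (1201 and 4410) plus children of 1201.
SAMPLE_PS = """\
  PID  PPID COMMAND
 1201     1 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
 1203  1201 /usr/local/nagios/bin/nagios --worker /usr/local/nagios/var/rw/nagios.qh
 1207  1201 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
 4410     1 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
"""

MASTERS = find_nagios_masters(SAMPLE_PS)
print(MASTERS)
```

If the returned list has more than one PID, a second daemon is probably still running and the stop/kill/start sequence above is worth trying.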
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: Multiple events created for the same alert.
I've uploaded the system profile since the quick fix didn't work.
Moderator's Note: The profile has been shared with the support team but has been removed from the public forum.
-
benjaminsmith
Re: Multiple events created for the same alert.
Hi,
Thanks for the profile. There are several crashed database tables; please run the following command.
Code:
mysqlcheck -r -f -uroot -pnagiosxi --all-databases --use_frm
Then check the database logs to see if the tables are corrected, or send over a fresh system profile. If the database is successfully repaired, let me know whether the issue is corrected as well. Thanks, Benjamin
References
Repairing The Nagios XI Databases
Log Locations and Descriptions
Re: Multiple events created for the same alert.
I did the repair.
===============
REPAIR COMPLETE
===============
=======================
nagios database repair succeeded
nagiosql database repair succeeded
nagiosxi database repair succeeded
Sending a new profile.
But it didn't fix my issue.
-
benjaminsmith
Re: Multiple events created for the same alert.
Hi @Typer100,
The database is looking good now. Looking over the profile, I'm not seeing duplicate alerts in the logs. However, the profile is just the tail output of recent events.
If the issue happens again, please retrieve the full /usr/local/nagios/var/nagios.log along with the mail log (which one depends on whether you are using SMTP or Sendmail) from the server, and let me know the exact name of the service that is sending duplicates. Thanks, Benjamin
Log Locations and Descriptions
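Once the full nagios.log is in hand, one way to look for true duplicates is to count identical notification entries. The sketch below is an illustration only: it assumes the usual `[epoch] SERVICE NOTIFICATION: contact;host;service;state;command;output` line layout, and the function name `duplicate_notifications` and the sample lines (host `web01`, contacts `alice` and `bob`) are hypothetical.

```python
import re
from collections import Counter

# [epoch] SERVICE NOTIFICATION: contact;host;service;state;command;output
NOTIFICATION_RE = re.compile(
    r"^\[(\d+)\] SERVICE NOTIFICATION: "
    r"([^;]*);([^;]*);([^;]*);([^;]*);([^;]*);(.*)$"
)


def duplicate_notifications(lines):
    """Return {(epoch, contact, host, service, state): count} for count > 1."""
    counts = Counter()
    for line in lines:
        match = NOTIFICATION_RE.match(line.strip())
        if not match:
            continue  # not a service notification entry
        epoch, contact, host, service, state, _command, _output = match.groups()
        counts[(int(epoch), contact, host, service, state)] += 1
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical log excerpt: alice is notified twice in the same second.
SAMPLE_LINES = [
    "[1600000000] SERVICE NOTIFICATION: alice;web01;HTTP;CRITICAL;notify-by-email;connection refused",
    "[1600000000] SERVICE NOTIFICATION: alice;web01;HTTP;CRITICAL;notify-by-email;connection refused",
    "[1600000000] SERVICE NOTIFICATION: bob;web01;HTTP;CRITICAL;notify-by-email;connection refused",
    "[1600000005] SERVICE ALERT: web01;HTTP;CRITICAL;HARD;5;connection refused",
]

DUPLICATES = duplicate_notifications(SAMPLE_LINES)
print(DUPLICATES)
```

An empty result suggests that Nagios Core logged each notification only once, which would point the search toward the mail setup rather than Core itself.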
Re: Multiple events created for the same alert.
Pretty much all alerts are sending 10 emails.
Service: Disk Usage on /dbawork
Host: sldgbd0065
Address: 172.26.14.39
State: OK
Info:
OK: Used disk space was 34.90 % (Used: 12923.24 MB, Total_size: 38970.85 MB, Free: 24061.47 MB)
Date/Time: 2020-07-08 11:31:47
-
benjaminsmith
Re: Multiple events created for the same alert.
Hi,
So here is the full configuration for that service in Core "speak". It's going to send notifications to opsgenie and the xi_unix_contact_group every hour if the service is critical, and also on recovery.
Code:
define service {
    host_name sldgbd0065
    service_description Disk Usage on /dbawork
    check_period 24x7
    check_command check_xi_ncpa!-t 'etb00W7XKrj79dpj3xx158nQ0yCG8Ho1' -P 5693 -M 'disk/logical/|dbawork' -u M -w 90 -c 95!!!!!!!
    contacts opsgenie
    contact_groups xi_unix_contact_group
    notification_period xi_timeperiod_24x7
    initial_state o
    importance 0
    check_interval 5.000000
    retry_interval 1.000000
    max_check_attempts 5
    is_volatile 0
    parallelize_check 1
    active_checks_enabled 1
    passive_checks_enabled 1
    obsess 1
    event_handler_enabled 1
    low_flap_threshold 0.000000
    high_flap_threshold 0.000000
    flap_detection_enabled 1
    flap_detection_options a
    freshness_threshold 0
    check_freshness 0
    notification_options r,c
    notifications_enabled 1
    notification_interval 60.000000
    first_notification_delay 0.000000
    stalking_options n
    process_perf_data 1
    retain_status_information 1
    retain_nonstatus_information 1
    _OPSGENIETEAMS Unix
}
Here is the log output for the service:
Code:
[1594183783] SERVICE NOTIFICATION: gaujf010;sldgbd0065;Disk Usage on /dbawork;CRITICAL;xi_service_notification_handler;CRITICAL: Used disk space was 100.00 % (Used: 36967.92 MB, Total_size: 38970.85 MB, Free: 16.78 MB)
[1594183783] SERVICE NOTIFICATION: support.aix;sldgbd0065;Disk Usage on /dbawork;CRITICAL;xi_service_notification_handler;CRITICAL: Used disk space was 100.00 % (Used: 36967.92 MB, Total_size: 38970.85 MB, Free: 16.78 MB)
[1594183783] SERVICE NOTIFICATION: opsgenie;sldgbd0065;Disk Usage on /dbawork;CRITICAL;notify-service-by-opsgenie;CRITICAL: Used disk space was 100.00 % (Used: 36967.92 MB, Total_size: 38970.85 MB, Free: 16.78 MB)
[1594187666] SERVICE NOTIFICATION: gaujf010;sldgbd0065;Disk Usage on /dbawork;CRITICAL;xi_service_notification_handler;CRITICAL: Used disk space was 100.00 % (Used: 36967.92 MB, Total_size: 38970.85 MB, Free: 16.78 MB)
[1594187666] SERVICE NOTIFICATION: support.aix;sldgbd0065;Disk Usage on /dbawork;CRITICAL;xi_service_notification_handler;CRITICAL: Used disk space was 100.00 % (Used: 36967.92 MB, Total_size: 38970.85 MB, Free: 16.78 MB)
[1594187666] SERVICE NOTIFICATION: opsgenie;sldgbd0065;Disk Usage on /dbawork;CRITICAL;notify-service-by-opsgenie;CRITICAL: Used disk space was 100.00 % (Used: 36967.92 MB, Total_size: 38970.85 MB, Free: 16.78 MB)
[1594191549] SERVICE NOTIFICATION: gaujf010;sldgbd0065;Disk Usage on /dbawork;CRITICAL;xi_service_notification_handler;CRITICAL: Used disk space was 100.00 % (Used: 36967.92 MB, Total_size: 38970.85 MB, Free: 16.78 MB)
[1594191549] SERVICE NOTIFICATION: support.aix;sldgbd0065;Disk Usage on /dbawork;CRITICAL;xi_service_notification_handler;CRITICAL: Used disk space was 100.00 % (Used: 36967.92 MB, Total_size: 38970.85 MB, Free: 16.78 MB)
[1594191549] SERVICE NOTIFICATION: opsgenie;sldgbd0065;Disk Usage on /dbawork;CRITICAL;notify-service-by-opsgenie;CRITICAL: Used disk space was 100.00 % (Used: 36967.92 MB, Total_size: 38970.85 MB, Free: 16.78 MB)
[1594195434] SERVICE NOTIFICATION: gaujf010;sldgbd0065;Disk Usage on /dbawork;CRITICAL;xi_service_notification_handler;CRITICAL: Used disk space was 100.00 % (Used: 36967.92 MB, Total_size: 38970.85 MB, Free: 16.78 MB)
[1594195434] SERVICE NOTIFICATION: support.aix;sldgbd0065;Disk Usage on /dbawork;CRITICAL;xi_service_notification_handler;CRITICAL: Used disk space was 100.00 % (Used: 36967.92 MB, Total_size: 38970.85 MB, Free: 16.78 MB)
[1594195434] SERVICE NOTIFICATION: opsgenie;sldgbd0065;Disk Usage on /dbawork;CRITICAL;notify-service-by-opsgenie;CRITICAL: Used disk space was 100.00 % (Used: 36967.92 MB, Total_size: 38970.85 MB, Free: 16.78 MB)
[1594199319] SERVICE NOTIFICATION: gaujf010;sldgbd0065;Disk Usage on /dbawork;CRITICAL;xi_service_notification_handler;CRITICAL: Used disk space was 100.00 % (Used: 36967.92 MB, Total_size: 38970.85 MB, Free: 16.78 MB)
If you look at the timestamps, you'll see that it is notifying the contacts every hour, so this is expected. If you do not want to receive additional notifications, you can set notification_interval to 0 and Nagios will send out only one notification; otherwise, I would increase the interval to a longer time period.
Hope that helps and let me know if you have any questions.
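The spacing between notification rounds can be checked directly from the epoch timestamps in the log. The sketch below is an illustration: it uses the timestamps logged in this thread for contact gaujf010, with the plugin output abbreviated to keep the sample short, and the function name `notification_gaps` is hypothetical.

```python
import re

# Captures epoch, contact, host, and service from a notification entry.
LINE_RE = re.compile(
    r"^\[(\d+)\] SERVICE NOTIFICATION: ([^;]*);([^;]*);([^;]*);"
)


def notification_gaps(lines, contact, host, service):
    """Return gaps in seconds between successive notification rounds.

    Entries with identical timestamps are collapsed, so this measures
    only the spacing between distinct rounds for one contact/service.
    """
    stamps = sorted(
        {
            int(m.group(1))
            for m in map(LINE_RE.match, lines)
            if m and (m.group(2), m.group(3), m.group(4)) == (contact, host, service)
        }
    )
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


# Timestamps from the log above; the output text is abbreviated.
SAMPLE_LOG = [
    "[1594183783] SERVICE NOTIFICATION: gaujf010;sldgbd0065;Disk Usage on /dbawork;CRITICAL;xi_service_notification_handler;CRITICAL",
    "[1594183783] SERVICE NOTIFICATION: opsgenie;sldgbd0065;Disk Usage on /dbawork;CRITICAL;notify-service-by-opsgenie;CRITICAL",
    "[1594187666] SERVICE NOTIFICATION: gaujf010;sldgbd0065;Disk Usage on /dbawork;CRITICAL;xi_service_notification_handler;CRITICAL",
    "[1594191549] SERVICE NOTIFICATION: gaujf010;sldgbd0065;Disk Usage on /dbawork;CRITICAL;xi_service_notification_handler;CRITICAL",
    "[1594195434] SERVICE NOTIFICATION: gaujf010;sldgbd0065;Disk Usage on /dbawork;CRITICAL;xi_service_notification_handler;CRITICAL",
    "[1594199319] SERVICE NOTIFICATION: gaujf010;sldgbd0065;Disk Usage on /dbawork;CRITICAL;xi_service_notification_handler;CRITICAL",
]

GAPS = notification_gaps(SAMPLE_LOG, "gaujf010", "sldgbd0065", "Disk Usage on /dbawork")
print(GAPS)
```

Each gap is a little over 3,600 seconds, which is consistent with a notification_interval of 60 minutes and not with several notifications arriving within seconds of each other.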
Re: Multiple events created for the same alert.
Hi. I wish it were only that, or maybe I just don't get it. You see, support.aix received 10 emails for the alerts within 2-3 seconds.
I've included a screenshot of that inbox. Not exactly the same alert for the same host, but same problem.
-
benjaminsmith
Re: Multiple events created for the same alert.
Hi,
Right now, I'm not seeing multiple service notifications in the Nagios log, so it's likely an issue with the mail setup or the event queue in XI. When you open those emails, do they have the exact same Date/Timestamp (that is, are they exact duplicates)? Also, is this issue affecting all XI user accounts?
Can you try to send a custom notification for this host, and let me know if you receive more than one message?
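To answer the duplicate question for a full inbox, the message bodies can be grouped by the fields that identify a notification. The sketch below is an illustration: it assumes each body has the `Service:`, `Host:`, `State:`, and `Date/Time:` lines seen in the notification quoted earlier in this thread. The names `notification_key` and `duplicate_bodies` are hypothetical, and the second timestamp in the sample is made up.

```python
import re
from collections import Counter

# Matches the identifying fields of a notification body, one per line.
FIELD_RE = re.compile(
    r"^(Service|Host|State|Date/Time):[ \t]*(.*)$", re.MULTILINE
)


def notification_key(body):
    """Return (host, service, state, date/time) parsed from an email body."""
    fields = {name: value.strip() for name, value in FIELD_RE.findall(body)}
    return tuple(
        fields.get(name, "") for name in ("Host", "Service", "State", "Date/Time")
    )


def duplicate_bodies(bodies):
    """Return {key: count} for notifications that arrived more than once."""
    counts = Counter(notification_key(body) for body in bodies)
    return {key: n for key, n in counts.items() if n > 1}


TEMPLATE = (
    "Service: Disk Usage on /dbawork\n"
    "Host: sldgbd0065\n"
    "Address: 172.26.14.39\n"
    "State: OK\n"
    "Info:\n"
    "OK: Used disk space was 34.90 %\n"
    "Date/Time: {when}\n"
)

# Two copies of the same notification plus one later, distinct notification.
SAMPLE_BODIES = [
    TEMPLATE.format(when="2020-07-08 11:31:47"),
    TEMPLATE.format(when="2020-07-08 11:31:47"),
    TEMPLATE.format(when="2020-07-08 12:36:30"),  # made-up timestamp
]

REPEATS = duplicate_bodies(SAMPLE_BODIES)
print(REPEATS)
```

Messages that share a key are copies of one notification, whereas ten messages with ten different Date/Time values would instead indicate ten separate notifications.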