NagiosXI Notification Problem
-
davide.bonicelli
- Posts: 134
- Joined: Thu Feb 13, 2014 5:12 am
NagiosXI Notification Problem
Hi, I've noticed a notification problem with our Nagios XI 5.7.3 for some services and hosts.
For example, I've received only this notification for this service in an UNKNOWN state, and this is from the service state history. This is the service configuration. What could be wrong?
-
benjaminsmith
- Posts: 5324
- Joined: Wed Aug 22, 2018 4:39 pm
- Location: saint paul
Re: NagiosXI Notification Problem
Hi @davide.bonicelli,
This is expected behavior: notifications are sent on hard state changes. The service went into a hard state at 17:26 and the notification was sent a few minutes later. Since the notification interval is set to 480, I would check the nagios.log for any state changes after 18:00 to determine whether another notification should have been sent. The Nagios logs are in the following directory:
Code: Select all
/usr/local/nagios/var/archives
Also, the service appears to be moving quickly between UNKNOWN and OK states. If flap detection is enabled and flapping is detected, notifications would be suppressed.
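As a rough illustration of checking the log for state changes after a given time, here is a minimal Python sketch. It assumes the standard Nagios Core `SERVICE ALERT` line layout (`[epoch] SERVICE ALERT: host;service;state;state type;attempt;output`). The function name, the sample lines, and the host and service names are made up for the example and do not come from this system.

```python
import re
from datetime import datetime, timezone

# Nagios Core logs service state changes as:
#   [epoch] SERVICE ALERT: host;service;STATE;SOFT|HARD;attempt;plugin output
ALERT_RE = re.compile(
    r"^\[(?P<epoch>\d+)\] SERVICE ALERT: "
    r"(?P<host>[^;]*);(?P<service>[^;]*);(?P<state>[^;]*);"
    r"(?P<state_type>[^;]*);(?P<attempt>\d+);(?P<output>.*)$"
)


def state_changes_after(lines, host, service, after_epoch):
    """Return (utc_time, state, state_type, attempt) for alerts newer than after_epoch."""
    changes = []
    for line in lines:
        match = ALERT_RE.match(line.strip())
        if not match:
            continue  # not a service state change line
        if match["host"] != host or match["service"] != service:
            continue
        epoch = int(match["epoch"])
        if epoch <= after_epoch:
            continue
        when = datetime.fromtimestamp(epoch, tz=timezone.utc)
        changes.append((when, match["state"], match["state_type"], int(match["attempt"])))
    return changes


# Made-up sample lines, for illustration only.
sample = [
    "[1000] SERVICE ALERT: example-host;Example Service;UNKNOWN;HARD;5;timed out",
    "[2000] SERVICE ALERT: example-host;Example Service;OK;SOFT;1;all good",
    "[3000] SERVICE ALERT: other-host;Example Service;CRITICAL;HARD;3;down",
]
for when, state, state_type, attempt in state_changes_after(
    sample, "example-host", "Example Service", 1500
):
    print(when.isoformat(), state, state_type, attempt)
```

In practice the lines would come from the archived log files rather than from a list, and the timestamps are printed in UTC, so they need converting to local time before comparing them with the reports.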
--Benjamin
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Be sure to check out our Knowledgebase for helpful articles and solutions!
-
davide.bonicelli
- Posts: 134
- Joined: Thu Feb 13, 2014 5:12 am
Re: NagiosXI Notification Problem
Yeah, it could be flapping, but I think I should still have had a notification.
This is the mail from 30/09/20 to today. There is no recovery mail even though the service is green, and the flapping mails for the user are on.
-
benjaminsmith
- Posts: 5324
- Joined: Wed Aug 22, 2018 4:39 pm
- Location: saint paul
Re: NagiosXI Notification Problem
Hi,
Can you run the State History Report for that same service over the same time period, 9/29 to 10/12? Be sure to select State Type = Both and State = Any in the report options. This will help us compare the notifications report to the service's state history.
As a side note, I noticed that some of the checks are timing out after 45 to 60 seconds, which would suggest network issues. You can adjust the max check attempts and retry interval to help reduce the number of false positives on the service.
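To see why a higher max check attempts value filters out short timeouts, here is a deliberately simplified Python model of soft and hard states for one service. It is a sketch, not Nagios Core's actual logic: it ignores flap detection, host state, notification intervals, and retry timing, and the function name and the sample check results are made up for the example.

```python
def simulate(results, max_check_attempts):
    """Simplified soft/hard state model for a single service.

    results: sequence of check results, "OK" or a problem state such as "UNKNOWN".
    Returns the notifications this model would send, in order.
    """
    notifications = []
    attempt = 0           # consecutive non-OK results seen so far
    hard_problem = False  # True once the problem has reached a HARD state
    for result in results:
        if result == "OK":
            if hard_problem:
                notifications.append("RECOVERY")  # recovery from a HARD problem
            # A recovery from a SOFT problem sends nothing in this model.
            attempt = 0
            hard_problem = False
        else:
            attempt = min(attempt + 1, max_check_attempts)
            if attempt == max_check_attempts and not hard_problem:
                hard_problem = True
                notifications.append("PROBLEM")
    return notifications


# A made-up pattern: single timeouts, each followed by a good check.
flaky = ["UNKNOWN", "OK", "UNKNOWN", "OK", "UNKNOWN", "OK"]
print(simulate(flaky, max_check_attempts=1))
print(simulate(flaky, max_check_attempts=5))

# A sustained outage still reaches a HARD state and notifies.
print(simulate(["UNKNOWN"] * 5 + ["OK"], max_check_attempts=5))
```

In this model, isolated timeouts never reach a hard state when max check attempts is 5, so they produce neither a problem nor a recovery notification. In the Nagios object configuration the corresponding directives are `max_check_attempts` and `retry_interval`.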
--Benjamin
-
davide.bonicelli
- Posts: 134
- Joined: Thu Feb 13, 2014 5:12 am
Re: NagiosXI Notification Problem
Thanks for the reply. I downloaded a CSV because there are almost 400 entries.
We have some internet problems, so there are many timeouts, but the point here is to know when the service comes back green.
-
benjaminsmith
- Posts: 5324
- Joined: Wed Aug 22, 2018 4:39 pm
- Location: saint paul
Re: NagiosXI Notification Problem
Hi,
Thanks for the report. Since the hard UNKNOWN state on 9/30 at 17:26, the service has been bouncing back and forth between soft UNKNOWN and soft OK, and that's why you haven't received the notification.
Code: Select all
2020-09-30 18:01:20,"NAS-BK-ARIOST2","Uptime",1,"UNKNOWN","SOFT",1,5,OK,OK,"UNKNOWN - check_nwc_health timed out after 60 seconds"
2020-09-30 17:35:34,"NAS-BK-ARIOST2","Uptime",1,"OK","SOFT",1,5,UNKNOWN,UNKNOWN,"OK - device is up since 6d 3h 24m 48s"
2020-09-30 17:26:31,"NAS-BK-ARIOST2","Uptime",1,"UNKNOWN","HARD",1,5,OK,OK,"UNKNOWN - check_nwc_health timed out after 60 seconds"
Let's try to determine why it's timing out so frequently. Is this a network issue? Try running a series of ping checks to test the reliability of the network connection.
Code: Select all
ping -c 50 <ip address> > ping_test.txt
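For a report with hundreds of rows, the same reading of the state history (find the last HARD entry, then list what followed it) can be sketched in Python. The three rows are copied from the excerpt in this post. The column names are assumptions, since the CSV excerpt has no header row, and the fourth column's meaning is not stated anywhere in the thread.

```python
import csv
import io

# Rows copied from the report excerpt in this post (newest first).
REPORT = '''\
2020-09-30 18:01:20,"NAS-BK-ARIOST2","Uptime",1,"UNKNOWN","SOFT",1,5,OK,OK,"UNKNOWN - check_nwc_health timed out after 60 seconds"
2020-09-30 17:35:34,"NAS-BK-ARIOST2","Uptime",1,"OK","SOFT",1,5,UNKNOWN,UNKNOWN,"OK - device is up since 6d 3h 24m 48s"
2020-09-30 17:26:31,"NAS-BK-ARIOST2","Uptime",1,"UNKNOWN","HARD",1,5,OK,OK,"UNKNOWN - check_nwc_health timed out after 60 seconds"
'''

# Assumed column meanings; "unknown_col" is a placeholder for the fourth column.
COLUMNS = [
    "time", "host", "service", "unknown_col", "state", "state_type",
    "attempt", "max_attempts", "last_state", "last_hard_state", "output",
]


def load_rows(text):
    """Parse the CSV text into a list of dicts keyed by COLUMNS."""
    reader = csv.reader(io.StringIO(text))
    return [dict(zip(COLUMNS, row)) for row in reader if row]


def since_last_hard(rows):
    """Return (last HARD row, rows recorded after it), oldest first."""
    # "YYYY-MM-DD HH:MM:SS" strings sort chronologically as plain text.
    ordered = sorted(rows, key=lambda row: row["time"])
    last_hard_index = None
    for index, row in enumerate(ordered):
        if row["state_type"] == "HARD":
            last_hard_index = index
    if last_hard_index is None:
        return None, ordered
    return ordered[last_hard_index], ordered[last_hard_index + 1:]


hard, later = since_last_hard(load_rows(REPORT))
print("last hard state:", hard["time"], hard["state"])
for row in later:
    print(row["time"], row["state"], row["state_type"], row["attempt"] + "/" + row["max_attempts"])
```

Running the same function over the full CSV export would show whether any HARD entry appears after 17:26 on 9/30.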
Benjamin
-
davide.bonicelli
- Posts: 134
- Joined: Thu Feb 13, 2014 5:12 am
Re: NagiosXI Notification Problem
Yep, I know that we have a network problem with this device, but the point is that we haven't received an OK message since 09/30.
Last 24 hours of state history:
Why is it only at soft attempt 2?
Hasn't it been stable for 17 hours?
-
benjaminsmith
- Posts: 5324
- Joined: Wed Aug 22, 2018 4:39 pm
- Location: saint paul
Re: NagiosXI Notification Problem
Hi @davide.bonicelli,
Let's get a fresh system profile from this machine to rule out any chance that there is an NDO (backend database) issue here.
There are a few limitations to the State History Report, so to find out exactly what is going on with this service, I would like to get a zip file of the last few weeks of Nagios logs. You'll find them in the following directory:
Code: Select all
/usr/local/nagios/var/archives
If your system is not very old, you could probably just zip up the entire directory, as it will not have too many logs (nothing is ever deleted). Otherwise, the logs from 9/29 to now would be helpful.
Thanks,
Benjamin
To send us your system profile:
Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Save the profile.zip file and share it in a private message or upload it to the post/ticket, and then reply to this post to bring it up in the queue.
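For the log request above (only the archives from 9/29 onward), one possible way to pick and zip the files is sketched below in Python. It assumes archive files are named `nagios-MM-DD-YYYY-HH.log`, the pattern seen in this thread's log excerpts, and that the year is 2020. The helper names and the output file name `nagios_logs.zip` are hypothetical.

```python
import re
import zipfile
from datetime import date
from pathlib import Path

ARCHIVE_DIR = Path("/usr/local/nagios/var/archives")  # directory named in the post
NAME_RE = re.compile(r"^nagios-(\d{2})-(\d{2})-(\d{4})-(\d{2})\.log$")


def archive_date(filename):
    """Return the date encoded in an archive file name, or None if it does not match."""
    match = NAME_RE.match(filename)
    if not match:
        return None
    month, day, year, _hour = match.groups()
    return date(int(year), int(month), int(day))


def select_archives(filenames, since):
    """Return archive names dated on or after `since`, oldest first."""
    dated = []
    for name in filenames:
        when = archive_date(name)
        if when is not None and when >= since:
            dated.append((when, name))
    return [name for _when, name in sorted(dated)]


def zip_archives(directory, since, zip_path):
    """Write the selected archives into zip_path and return their names."""
    directory = Path(directory)
    names = select_archives([path.name for path in directory.iterdir()], since)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as bundle:
        for name in names:
            bundle.write(directory / name, arcname=name)
    return names


if __name__ == "__main__":
    # Only runs on a machine that actually has the archive directory.
    if ARCHIVE_DIR.is_dir():
        print(zip_archives(ARCHIVE_DIR, date(2020, 9, 29), "nagios_logs.zip"))
```

The date in each file name is taken at face value here; which day's events a given archive actually contains depends on the log rotation schedule, so including one extra file on either side of the range is a safe choice.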
-
davide.bonicelli
- Posts: 134
- Joined: Thu Feb 13, 2014 5:12 am
Re: NagiosXI Notification Problem
Here we go.
Thanks a lot!
Moderator's Note: The profile has been shared with the support team but has been removed from the public forum.
-
benjaminsmith
- Posts: 5324
- Joined: Wed Aug 22, 2018 4:39 pm
- Location: saint paul
Re: NagiosXI Notification Problem
Hi,
Since there are a couple of outstanding questions in this post, let's just work through the notifications for now. I searched through the logs, and you should have received notifications on 10/1 as well. Here's a list of all the notifications sent for this service:
Code: Select all
nagios-09-30-2020-00.log:[1601384580] SERVICE NOTIFICATION: monitoraggio-pro;NAS-BK-ARIOST2;Uptime;UNKNOWN;xi_service_notification_handler;UNKNOWN - check_nwc_health timed out after 45 seconds
nagios-09-30-2020-00.log:[1601385179] SERVICE NOTIFICATION: monitoraggio-pro;NAS-BK-ARIOST2;Uptime;FLAPPINGSTART (OK);xi_service_notification_handler;OK - device is up since 5d 1h 2m 16s
nagios-09-30-2020-00.log:[1601390165] SERVICE NOTIFICATION: monitoraggio-pro;NAS-BK-ARIOST2;Uptime;FLAPPINGSTOP (OK);xi_service_notification_handler;OK - device is up since 5d 2h 25m 21s
nagios-09-30-2020-00.log:[1601391585] SERVICE NOTIFICATION: monitoraggio-pro;NAS-BK-ARIOST2;Uptime;UNKNOWN;xi_service_notification_handler;UNKNOWN - check_nwc_health timed out after 45 seconds
nagios-09-30-2020-00.log:[1601391887] SERVICE NOTIFICATION: monitoraggio-pro;NAS-BK-ARIOST2;Uptime;OK;xi_service_notification_handler;OK - device is up since 5d 2h 54m 4s
nagios-10-01-2020-00.log:[1601452126] SERVICE NOTIFICATION: monitoraggio-pro;NAS-BK-ARIOST2;Uptime;UNKNOWN;xi_service_notification_handler;UNKNOWN - check_nwc_health timed out after 45 seconds
nagios-10-01-2020-00.log:[1601452427] SERVICE NOTIFICATION: monitoraggio-pro;NAS-BK-ARIOST2;Uptime;FLAPPINGSTART (OK);xi_service_notification_handler;OK - device is up since 5d 19h 43m 2s
nagios-10-01-2020-00.log:[1601457567] SERVICE NOTIFICATION: monitoraggio-pro;NAS-BK-ARIOST2;Uptime;FLAPPINGSTOP (OK);xi_service_notification_handler;OK - device is up since 5d 21h 8m 0s
nagios-10-01-2020-00.log:[1601479892] SERVICE NOTIFICATION: monitoraggio-pro;NAS-BK-ARIOST2;Uptime;UNKNOWN;xi_service_notification_handler;UNKNOWN - check_nwc_health timed out after 60 seconds
The configuration looks right, so I would send a custom notification or force a hard state change on the service by sending passive checks to make sure you are getting the emails. Follow the instructions below to force a state change, and let me know if you successfully receive a message.
First, run the following tail command:
Code: Select all
tail -F /var/log/maillog /usr/local/nagiosxi/tmp/phpmailer.log /usr/local/nagiosxi/var/eventman.log /usr/local/nagios/var/nagios.log
Then go to Home > Details > Host Status, find/search for the host, click on the host link, and select the Advanced tab. From the Commands menu, submit passive check results (down) until the host goes into a HARD DOWN state. This will trigger a notification.
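The bracketed numbers in the notification lines of this post are Unix epoch timestamps, so they are easier to compare with the mail and report times once decoded. Here is a minimal Python sketch that splits one such line into fields. It assumes the `contact;host;service;type;command;output` layout visible in the excerpt, and the function and field names are made up for the example.

```python
import re
from datetime import datetime, timezone

# [epoch] SERVICE NOTIFICATION: contact;host;service;type;command;output
NOTIFY_RE = re.compile(
    r"\[(?P<epoch>\d+)\] SERVICE NOTIFICATION: "
    r"(?P<contact>[^;]*);(?P<host>[^;]*);(?P<service>[^;]*);"
    r"(?P<kind>[^;]*);(?P<command>[^;]*);(?P<output>.*)$"
)


def parse_notification(line):
    """Return the notification fields as a dict, or None if the line does not match."""
    # search() rather than match(), so a leading "file.log:" prefix from grep is skipped.
    match = NOTIFY_RE.search(line.strip())
    if not match:
        return None
    fields = match.groupdict()
    epoch = int(fields.pop("epoch"))
    fields["time_utc"] = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return fields


# Last line of the excerpt in this post.
line = (
    "nagios-10-01-2020-00.log:[1601479892] SERVICE NOTIFICATION: "
    "monitoraggio-pro;NAS-BK-ARIOST2;Uptime;UNKNOWN;"
    "xi_service_notification_handler;"
    "UNKNOWN - check_nwc_health timed out after 60 seconds"
)
info = parse_notification(line)
print(info["time_utc"].isoformat(), info["kind"], info["contact"])
```

The timestamp in that last line decodes to 2020-09-30 15:31:32 UTC, so the date in an archive's file name is not necessarily the date of the events inside it. The times also need converting to the server's local time zone before comparing them with the reports.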