Regarding Logging of state in alerts at every rechecks

shubham · Post by **shubham** » Wed Oct 31, 2018 2:58 pm

Hi team,
Please help me with the following query.
In our project we are using Nagios, and the logs created by its monitoring are read by a Suppression Engine for our ticketing tool.
During our demo I need to create multiple alerts for a host as well as for its service.
To increase the frequency of the alerts, I changed the max check attempts, check interval, and retry interval in template.cfg for the host and the services that I will be using in the demo.
In the GUI I could see that the last check time and next check time for the service and host had changed, but whenever the check command runs to detect a state change for that host/service, very few alerts show up in the Alerts tab in Nagios.
For example:
- I removed the LAN cable of the IP phone to create a critical alert
- I got an alert for the host, IP Phone, in a critical state because it is down
- Then I got an alert for its service, ping, in a critical state

But after this I did not get any more alerts in the Alerts tab for the ping service.
I only got two more alerts for the host, IP Phone.

How can I increase this?
Why am I not seeing more alerts of this type when the host and its service are being checked at a regular interval, say every 3 minutes?

The reason I need more alerts after every service check is that my Suppression Engine logic will be based on rules such as: if I get 4 critical alerts for the ping service of a host and 2 down alerts for the host, then a ticket must be created automatically and some automation needs to run on it.

This is why I require an alert at every service recheck if a host is down or a service is in a warning or critical state.

Thank you
shubham

npolovenko · Post by **npolovenko** » Wed Oct 31, 2018 4:58 pm

Hello, @shubham. I believe the Alerts menu gets its entries from the nagios.log file. Alerts are only logged when a service or host state changes. For example, each soft retry (soft-retry-1, soft-retry-2, soft-retry-3, and so on) is logged until the check reaches max check attempts and enters a hard state. Nagios then logs that hard state as well and stops logging until the state changes again.
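
To make the behaviour described above easier to follow, here is a minimal Python sketch of it. This is only a simplified model, not Nagios code: the function name `logged_alerts` is hypothetical, it handles just "OK" and a single problem state, and the attempt number used for recovery entries is a simplification.

```python
from typing import Iterable, List, Tuple


def logged_alerts(results: Iterable[str], max_check_attempts: int) -> List[Tuple[str, str, int]]:
    """Return the (state, state_type, attempt) entries a check sequence would log.

    Hypothetical, simplified model: only "OK" and one problem state are handled.
    """
    logged: List[Tuple[str, str, int]] = []
    last_state = "OK"
    attempt = 0
    hard_problem = False  # True once the problem has reached a hard state

    for state in results:
        if state == "OK":
            if last_state != "OK":
                # Recovery is a state change, so it is logged once.
                # Using attempt 1 here is a simplification.
                logged.append(("OK", "HARD" if hard_problem else "SOFT", 1))
            attempt = 0
            hard_problem = False
        elif not hard_problem:
            attempt += 1
            if attempt >= max_check_attempts:
                # Max check attempts reached: log the hard state once.
                hard_problem = True
                logged.append((state, "HARD", attempt))
            else:
                # Still retrying: each soft retry is logged.
                logged.append((state, "SOFT", attempt))
        # Otherwise the hard state is already logged; nothing new until it changes.
        last_state = state
    return logged


if __name__ == "__main__":
    checks = ["CRITICAL"] * 7
    print(len(logged_alerts(checks, 5)))   # 5 entries: 4 soft retries + 1 hard
    print(len(logged_alerts(checks, 99)))  # 7 entries: every check is a soft retry
```

In this model, a larger max check attempts value keeps the check in the soft state longer, so more of the failing checks produce a log entry.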

shubham · Post by **shubham** » Thu Nov 01, 2018 6:19 am

Hi ,
Thanks for the reply.
Is there any way to increase the number of soft states for a host/service, so that it logs more and produces more alerts?

npolovenko · Post by **npolovenko** » Thu Nov 01, 2018 10:39 am

@shubham, you could try changing the max_check_attempts option from 5 to, say, 99. So if the service is OK, your ticketing system doesn't require alerts, but once it's in the Critical state it needs new alerts until the issue is resolved? Have you thought about setting up email notifications from the Nagios server to the ticketing system instead?

shubham · Post by **shubham** » Thu Nov 01, 2018 12:28 pm

Hello,
We have, but we specifically can't rely on mail to the ticketing tool. We can't trust the mail server settings on the client's end, because this would generate too many mail alerts, which may put load on the mail server. Secondly, it takes time to receive a mail and then take action on it, whereas reading the Nagios log directly gets the job done faster.
So, as you said, are we sure that increasing max check attempts from 5 to, say, 99 will do what I need?

Thanks,
Shubham.

npolovenko · Post by **npolovenko** » Thu Nov 01, 2018 4:41 pm

@shubham, I recommend trying this out on your own, but yes, it worked in my test environment.

State Type: Soft
Current Check: 6 of 99

[1530042610] SERVICE ALERT: localhost;HTTP;CRITICAL;SOFT;1;CRITICAL
[1530042646] SERVICE ALERT: localhost;HTTP;CRITICAL;SOFT;2;CRITICAL
[1530042653] SERVICE ALERT: localhost;HTTP;CRITICAL;SOFT;3;CRITICAL
[1530042659] SERVICE ALERT: localhost;HTTP;CRITICAL;SOFT;4;CRITICAL
[1530042665] SERVICE ALERT: localhost;HTTP;CRITICAL;SOFT;5;CRITICAL
[1530042671] SERVICE ALERT: localhost;HTTP;CRITICAL;SOFT;6;CRITICAL
[1530042723] SERVICE ALERT: localhost;HTTP;CRITICAL;SOFT;7;CRITICAL
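
For anyone reading these entries straight from nagios.log, here is a rough Python sketch of parsing the SERVICE ALERT lines shown above and counting them per host, service, and state. The field layout is inferred from the sample lines only, and the names `parse_service_alert`, `count_alerts`, and `CRITICAL_THRESHOLD` are hypothetical, as is the threshold of 4 taken from the rule mentioned earlier in the thread.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Layout inferred from the sample lines:
# [timestamp] SERVICE ALERT: host;service;state;state_type;attempt;output
ALERT_RE = re.compile(
    r"^\[(?P<timestamp>\d+)\] SERVICE ALERT: "
    r"(?P<host>[^;]+);(?P<service>[^;]+);(?P<state>[^;]+);"
    r"(?P<state_type>[^;]+);(?P<attempt>\d+);(?P<output>.*)$"
)


def parse_service_alert(line: str) -> Optional[dict]:
    """Parse one SERVICE ALERT line into a dict, or return None if it does not match."""
    match = ALERT_RE.match(line.strip())
    if match is None:
        return None
    alert = match.groupdict()
    alert["timestamp"] = int(alert["timestamp"])
    alert["attempt"] = int(alert["attempt"])
    return alert


def count_alerts(lines: Iterable[str]) -> Counter:
    """Count alerts per (host, service, state), skipping lines that do not match."""
    counts: Counter = Counter()
    for line in lines:
        alert = parse_service_alert(line)
        if alert is not None:
            counts[(alert["host"], alert["service"], alert["state"])] += 1
    return counts


SAMPLE = """\
[1530042610] SERVICE ALERT: localhost;HTTP;CRITICAL;SOFT;1;CRITICAL
[1530042646] SERVICE ALERT: localhost;HTTP;CRITICAL;SOFT;2;CRITICAL
[1530042653] SERVICE ALERT: localhost;HTTP;CRITICAL;SOFT;3;CRITICAL
[1530042659] SERVICE ALERT: localhost;HTTP;CRITICAL;SOFT;4;CRITICAL
"""

CRITICAL_THRESHOLD = 4  # hypothetical rule: 4 critical alerts for one service

if __name__ == "__main__":
    counts = count_alerts(SAMPLE.splitlines())
    key = ("localhost", "HTTP", "CRITICAL")
    print(counts[key], counts[key] >= CRITICAL_THRESHOLD)
```

Counting by (host, service, state) keeps the rule check a simple lookup; a real engine would likely also need a time window, which is left out here.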

shubham · Post by **shubham** » Fri Nov 02, 2018 2:13 am

Hello,
Thanks for the solution.
I'll try it and post the result.

Thanks
Shubham.

shubham · Post by **shubham** » Fri Nov 02, 2018 6:41 am

Hello there,
Here is what I have found so far.
When I set max check attempts to 30 for the host, I got log entries for all 30 attempts.
When I set max check attempts to 20 for the ping service of that host, I got only one service alert in the logs.
Could you help me with this?

This is what I found from my research (please correct me):
I read on the internet that if both the service and the host are down, Nagios logically won't check the service, because once it knows the host is down there is no reason to check its services.
Correct me if I am wrong.

thanks,
Shubham

npolovenko · Post by **npolovenko** » Fri Nov 02, 2018 10:11 am

@shubham, Yeah, that is true. How many services do you need to monitor with the Suppression Engine? If just a few, we could convert these services to hosts.

shubham · Post by **shubham** » Sat Nov 03, 2018 12:21 pm

Hello,
Well, there are no selected services. If approved, Nagios will be delivered to the client as their NMS tool, and if they have n devices, there would be n services that need to be monitored as well.
How do I monitor a service as a host? Can you please clarify?

Thanks,
Shubham..

Nagios Support Forum
