For some service we are only getting recovery mail

Postby chintan1511 » Thu Jun 20, 2019 6:19 am

Hi Team,

We have been using Nagios Core for the last 4 months, and it has been working very well. Currently, we are getting only recovery mails for some services on our VMs, and this keeps happening. When we checked the logs, we saw CRITICAL socket timeouts for those services, but Nagios only sends the recovery mail. We are not getting any problem notifications.

Can anyone help us?

Re: For some service we are only getting recovery mail

Postby scottwilkerson » Thu Jun 20, 2019 7:07 am

What version of Nagios Core are you using?

There were a few bugs related to this that should be resolved in 4.4.3.

Re: For some service we are only getting recovery mail

Postby chintan1511 » Fri Jun 21, 2019 5:15 am

Hi Team,

Thanks for the reply. We are currently running 4.4.2.

Here are the log details:
[1560848682] HOST ALERT: myProdVM;DOWN;SOFT;1;TCP CRITICAL - Invalid hostname, address or socket: myProdVM.url.com
[1560848727] SERVICE ALERT: myProdVM;C:\ Drive Space;CRITICAL;HARD;1;CRITICAL - Socket timeout
[1560848728] SERVICE ALERT: myProdVM;CPU Load;CRITICAL;HARD;1;CRITICAL - Socket timeout
[1560848744] HOST ALERT: myProdVM;UP;SOFT;1;TCP OK - 0.001 second response time on myProdVM.url.com port 12489
[1560848848] SERVICE NOTIFICATION: nagiosadmin;myProdVM;C:\ Drive Space;OK;notify-service-by-email;c:\ - total: 126.51 Gb - used: 55.41 Gb (44%) - free 71.10 Gb (56%)
[1560848848] SERVICE NOTIFICATION: nagiosadmin2;myProdVM;C:\ Drive Space;OK;notify-service-by-email;c:\ - total: 126.51 Gb - used: 55.41 Gb (44%) - free 71.10 Gb (56%)
[1560848848] SERVICE ALERT: myProdVM;C:\ Drive Space;OK;HARD;1;c:\ - total: 126.51 Gb - used: 55.41 Gb (44%) - free 71.10 Gb (56%)
[1560848848] SERVICE NOTIFICATION: nagiosadmin;myProdVM;CPU Load;OK;notify-service-by-email;CPU Load 2% (5 min average)
[1560848848] SERVICE NOTIFICATION: nagiosadmin2;myProdVM;CPU Load;OK;notify-service-by-email;CPU Load 2% (5 min average)
[1560848848] SERVICE ALERT: myProdVM;CPU Load;OK;HARD;1;CPU Load 2% (5 min average)
(Note: The host name and host URL have been changed.)

The issue did not occur during the first 3 months. Currently, we are getting only recovery mails for some services.

Is this the same bug that is fixed in 4.4.3? Could you please elaborate on the issue?

Re: For some service we are only getting recovery mail

Postby scottwilkerson » Fri Jun 21, 2019 6:31 am

chintan1511 wrote: The issue did not occur during the first 3 months. Currently, we are getting only recovery mails for some services.

Is this the same bug that is fixed in 4.4.3? Could you please elaborate on the issue?


Yes, that looks like it could be it.

https://github.com/NagiosEnterprises/nagioscore/blob/master/Changelog

Specifically, it could be caused by the bugs addressed by these fixes:
* Fixed services sending recovery emails when they recover if host in down state (#572) (Scott Wilkerson)
* Fixed services in soft states sometimes not switching into hard states (#576) (Jake Omann)
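
For context on why the second fix matters: a service problem normally stays in a SOFT state until it has failed max_check_attempts times in a row, and problem notifications go out only once it becomes HARD. The following is a simplified Python model of that documented behaviour, not the Nagios Core source. The function name `simulate` is hypothetical, and the model ignores host state, re-notification intervals, flapping, and downtime.

```python
def simulate(results, max_check_attempts=3):
    """Return a list of (state, state_type, notify) tuples, one per check result.

    Simplified model (an assumption, not Nagios code): a problem is SOFT
    until it has been seen max_check_attempts times in a row, then HARD.
    A notification is sent when a problem first becomes HARD, and when a
    service recovers from a HARD problem. A recovery from a SOFT problem
    is a soft recovery and sends nothing.
    """
    events = []
    attempt = 0           # consecutive non-OK results, capped at the maximum
    hard_problem = False  # True once the current problem has gone HARD
    for state in results:
        if state != "OK":
            attempt = min(attempt + 1, max_check_attempts)
            is_hard = attempt >= max_check_attempts
            notify = is_hard and not hard_problem
            hard_problem = hard_problem or is_hard
            events.append((state, "HARD" if is_hard else "SOFT", notify))
        else:
            if attempt == 0:
                events.append(("OK", "HARD", False))   # steady OK state
            elif hard_problem:
                events.append(("OK", "HARD", True))    # hard recovery
            else:
                events.append(("OK", "SOFT", False))   # soft recovery
            attempt = 0
            hard_problem = False
    return events


if __name__ == "__main__":
    for event in simulate(["CRITICAL", "CRITICAL", "OK"]):
        print(event)
```

Under this model, a service that fails twice and then recovers (with max_check_attempts set to 3) sends no mail at all, so a recovery mail without an earlier problem mail suggests the state handling is not following the expected path.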

Re: For some service we are only getting recovery mail

Postby chintan1511 » Wed Jun 26, 2019 9:07 am

Hi Team,

I upgraded to version 4.4.3.

I am still getting only recovery mails for some services. There is also another issue: for some services, I get the problem notification but no recovery mail.

Re: For some service we are only getting recovery mail

Postby mcapra » Thu Jun 27, 2019 2:00 pm

The \ characters in the service status could be tripping some things up. Are the notifications you're experiencing problems with all related to Windows disk checks?

Depending on the underlying notification command you have defined, this un-escaped character could be causing issues when the command fully evaluates and is executed.

Re: For some service we are only getting recovery mail

Postby cdienger » Thu Jun 27, 2019 3:33 pm

Are you only seeing this with Windows disk checks as @mcapra asked about?

I'd be curious to see the configuration files for these hosts and services to see if we can reproduce it here.

Re: For some service we are only getting recovery mail

Postby chintan1511 » Fri Jul 12, 2019 4:28 am

Hi Team,

Sorry for the delay in replying.
No, it is not only the Windows disk checks. We are also seeing this for memory usage, CPU Load, etc.
Can I share log details instead? Is there any other way that does not require sending the host and service configuration files? As per our policy, we can't share our data.

Thanks for helping.

Re: For some service we are only getting recovery mail

Postby ssax » Fri Jul 12, 2019 3:50 pm

You can show us the nagios.log entries for the host AND this host's services (after the upgrade) so that we can see exactly what is occurring. We need to see all HARD/SOFT states for BOTH the HOST AND the SERVICES over this time period so that we can see what state they are all in when things occur.
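
One way to narrow down the relevant entries before posting them is to scan nagios.log for recovery notifications that were never preceded by a problem notification. The following is a minimal Python sketch, not an official Nagios tool. The names `parse_line` and `recoveries_without_problem` are hypothetical, and the line layout is assumed to match the log excerpts posted in this thread.

```python
import re
from datetime import datetime, timezone

# Assumed line shape, taken from the excerpts in this thread:
#   [epoch] ENTRY TYPE: field1;field2;...
LINE_RE = re.compile(r"^\[(\d+)\] ([A-Z ]+): (.*)$")

# Simplification: only plain state names are handled. Entries such as
# acknowledgement or downtime notifications are ignored.
PROBLEM_STATES = {"WARNING", "CRITICAL", "UNKNOWN"}


def parse_line(line):
    """Return (timestamp, entry_type, rest) or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    epoch, entry_type, rest = match.groups()
    when = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    return when, entry_type, rest


def recoveries_without_problem(lines, host):
    """Yield (timestamp, contact, service) for each OK notification that had
    no earlier problem notification to the same contact for that service."""
    open_problems = set()  # (contact, service) pairs already notified of a problem
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        when, entry_type, rest = parsed
        if entry_type != "SERVICE NOTIFICATION":
            continue
        # Assumed field order: contact;host;service;state;command;output
        fields = rest.split(";", 5)
        if len(fields) < 4 or fields[1] != host:
            continue
        contact, _host, service, state = fields[:4]
        key = (contact, service)
        if state in PROBLEM_STATES:
            open_problems.add(key)
        elif state == "OK":
            if key not in open_problems:
                yield when, contact, service
            open_problems.discard(key)
```

To use it, open the log file and pass the file object as `lines`, for example `with open(path) as f: print(list(recoveries_without_problem(f, "myProdVM")))`. The log path depends on the installation, so `path` is left for the reader to fill in.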

What do the services show for the notification_options in your objects.cache? The contacts? The host?

- Check the HOST notification_options to make sure you have Down, Unreachable, and Recovery selected
- Check the SERVICE notification_options to make sure you have Warning, Critical, and Recovery selected
- Check the CONTACT definitions and make sure they have Warning, Critical, and Recovery selected
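
The checks listed above can be done by reading objects.cache directly. The following is a minimal Python sketch under the assumption that objects.cache consists of `define <type> {` blocks with one whitespace-separated key/value pair per line and a closing `}`. The function names `parse_objects` and `notification_settings` are hypothetical.

```python
import re

DEFINE_RE = re.compile(r"^define\s+(\w+)\s*\{")


def parse_objects(text):
    """Parse 'define <type> { key value ... }' blocks into a list of dicts."""
    objects = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        start = DEFINE_RE.match(line)
        if start:
            current = {"_type": start.group(1)}
            continue
        if line == "}":
            if current is not None:
                objects.append(current)
            current = None
            continue
        if current is not None:
            # Split on the first run of whitespace so values may contain spaces.
            parts = line.split(None, 1)
            current[parts[0]] = parts[1] if len(parts) > 1 else ""
    return objects


def notification_settings(objects, host):
    """Return (type, name, options) rows for the host, its services,
    and every contact. 'options' maps each *notification_options key
    found in the object to its value."""
    rows = []
    for obj in objects:
        kind = obj["_type"]
        options = {k: v for k, v in obj.items() if k.endswith("notification_options")}
        if kind == "host" and obj.get("host_name") == host:
            rows.append((kind, host, options))
        elif kind == "service" and obj.get("host_name") == host:
            rows.append((kind, obj.get("service_description", ""), options))
        elif kind == "contact":
            rows.append((kind, obj.get("contact_name", ""), options))
    return rows
```

Reading the file with `open(path).read()` and passing the text to `parse_objects` yields rows that contain only object names and option letters, which may be easier to share than full configuration files. The location of objects.cache depends on the installation.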

Re: For some service we are only getting recovery mail

Postby chintan1511 » Thu Jul 18, 2019 4:49 am

Here are the log details:
[1563423742] SERVICE ALERT: myProdVM;Nagios Client;CRITICAL;SOFT;1;CRITICAL - Socket timeout
[1563423753] HOST ALERT: myProdVM;DOWN;SOFT;1;CRITICAL - Socket timeout
[1563423805] HOST ALERT: myProdVM;UP;SOFT;1;TCP OK - 0.001 second response time on url.domain.com port 12489
[1563423892] SERVICE ALERT: myProdVM;Nagios Client;CRITICAL;SOFT;2;CRITICAL - Socket timeout
[1563423920] SERVICE ALERT: myProdVM;nssm;CRITICAL;SOFT;1;CRITICAL - Socket timeout
[1563424012] SERVICE NOTIFICATION: nagiosadmin;myProdVM;Nagios Client;OK;notify-service-by-email;nscp.exe: Running
[1563424012] SERVICE NOTIFICATION: nagiosadmin2;myProdVM;Nagios Client;OK;notify-service-by-email;nscp.exe: Running
[1563424012] SERVICE NOTIFICATION: nagiosadmin3;myProdVM;Nagios Client;OK;notify-service-by-email;nscp.exe: Running
[1563424012] SERVICE NOTIFICATION: nagiosadmin4;myProdVM;Nagios Client;OK;notify-service-by-email;nscp.exe: Running
[1563424012] SERVICE ALERT: myProdVM;Nagios Client;OK;HARD;3;nscp.exe: Running
[1563424040] SERVICE ALERT: myProdVM;nssm;OK;SOFT;2;nssm.exe: Running
(Note: The host name and host URL have been changed.)

Here are the notification_options settings for the host, services, and contacts:
Service notification_options: w,u,c,r
Host notification_options: d,r
Contact definition: service_notification_options w,u,c,r,f,s
host_notification_options d,u,r,f,s

Could you please tell us what we need to change so that we no longer get only recovery mails for some services?

Thanks for your support.