Service notification - why?

invade
Posts: 29
Joined: Thu Nov 16, 2017 7:45 am

Service notification - why?

Post by invade »

Hi

Is anyone able to explain why the following service notification was triggered:

Code:

2022-08-10 00:00:00+01:00 CURRENT HOST STATE: host.example.com;UP;HARD;1;OK: 10-08-2022 @ 08:55:13 AEST
2022-08-10 00:00:00+01:00 CURRENT SERVICE STATE: host.example.com;Samba;OK;HARD;1;OK: smb.service unit is active - 10-08-2022 @ 08:55:55 AEST
2022-08-10 12:07:11+01:00 HOST ALERT: host.example.com;DOWN;SOFT;1;Remote command execution failed: ssh: connect to host host.example.com port 22: Connection refused
2022-08-10 12:10:50+01:00 SERVICE ALERT: host.example.com;Samba;UNKNOWN;HARD;1;Remote command execution failed: ssh: connect to host host.example.com port 22: Connection refused
2022-08-10 12:12:11+01:00 HOST ALERT: host.example.com;DOWN;SOFT;2;Remote command execution failed: ssh: connect to host host.example.com port 22: Connection refused
2022-08-10 12:14:05+01:00 HOST ALERT: host.example.com;UP;SOFT;1;OK: 10-08-2022 @ 21:14:05 AEST
2022-08-10 12:16:05+01:00 SERVICE NOTIFICATION: support;host.example.com;Samba;UNKNOWN;service_notification;UNKNOWN - Plugin timed out
2022-08-10 12:20:54+01:00 SERVICE ALERT: host.example.com;Samba;OK;SOFT;1;OK: smb.service unit is active - 10-08-2022 @ 21:20:54 AEST

We use the check_by_ssh plugin to perform active checks on a number of hosts and services.

The Nagios host is running Nagios 4.4.6 on Rocky Linux 8.

For both host and service checks we use the following settings:

Code:

max_check_attempts	13
retry_interval		5

In this case the host was unavailable for roughly 7 minutes (this is usually caused by a network problem).
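
For scale, here is a rough sketch of how long the soft-state retries would be expected to run under the settings above before a problem is confirmed as HARD. It assumes the default interval_length of 60 seconds (so retry_interval is in minutes) and that the first failed check counts as attempt 1; the function name is just illustrative.

```python
MAX_CHECK_ATTEMPTS = 13   # from the object definition above
RETRY_INTERVAL = 5        # from the object definition above, in interval units


def minutes_until_hard(max_attempts, retry_interval, interval_length_s=60):
    """Approximate time spent in SOFT state before the last retry confirms HARD.

    Assumes interval_length_s is the nagios.cfg interval_length (default 60)
    and that the initial failure is attempt 1, leaving max_attempts - 1 retries.
    """
    retries = max_attempts - 1
    return retries * retry_interval * interval_length_s / 60


window = minutes_until_hard(MAX_CHECK_ATTEMPTS, RETRY_INTERVAL)
print(window)  # 60.0
```

On those assumptions a 7 minute outage would normally end well inside the retry window, which is why the notification was unexpected.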

There is no alert log entry for the first service check retry, but it looks like the notification was triggered after the first retry failed, even though max_check_attempts is set to 13 and the host check was OK at that point.
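
To make the sequence easier to read, here is a minimal Python sketch that parses the relevant lines from the log excerpt and reports the last Samba SERVICE ALERT logged before the notification. The plugin output in the sample lines is trimmed, and the helper names are only illustrative, not part of Nagios.

```python
from datetime import datetime

# Lines taken from the log excerpt; plugin output shortened for readability.
LOG = """\
2022-08-10 12:07:11+01:00 HOST ALERT: host.example.com;DOWN;SOFT;1;Remote command execution failed
2022-08-10 12:10:50+01:00 SERVICE ALERT: host.example.com;Samba;UNKNOWN;HARD;1;Remote command execution failed
2022-08-10 12:12:11+01:00 HOST ALERT: host.example.com;DOWN;SOFT;2;Remote command execution failed
2022-08-10 12:14:05+01:00 HOST ALERT: host.example.com;UP;SOFT;1;OK
2022-08-10 12:16:05+01:00 SERVICE NOTIFICATION: support;host.example.com;Samba;UNKNOWN;service_notification;UNKNOWN - Plugin timed out
"""


def parse(line):
    """Split a log line into (timestamp, entry type, semicolon-separated fields)."""
    date, time, rest = line.split(" ", 2)
    kind, _, payload = rest.partition(": ")
    return datetime.fromisoformat(f"{date} {time}"), kind, payload.split(";")


def last_service_alert_before_notification(log, service):
    """Return the most recent SERVICE ALERT for `service` seen before its notification."""
    last = None
    for line in log.splitlines():
        when, kind, fields = parse(line)
        if kind == "SERVICE ALERT" and fields[1] == service:
            last = {"when": when, "state": fields[2],
                    "state_type": fields[3], "attempt": int(fields[4])}
        elif kind == "SERVICE NOTIFICATION" and fields[2] == service:
            return last
    return None


alert = last_service_alert_before_notification(LOG, "Samba")
print(alert["state"], alert["state_type"], alert["attempt"])  # UNKNOWN HARD 1
```

So the log itself already records the service as HARD on attempt 1, logged while the host was DOWN (SOFT). This may be related to the documented Nagios behaviour that a non-OK service result is treated as a HARD state when its host is DOWN or UNREACHABLE, but I have not confirmed that this is what happened here.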

If you need any more information, just let me know.

Thanks in advance.
invade
Posts: 29
Joined: Thu Nov 16, 2017 7:45 am

Re: Service notification - why?

Post by invade »

Just to add that I have now enabled the “host_down_disable_service_checks” option, but I can still see service checks being run (and generating an alert) when the host is down, e.g.:

Code:

2022-08-15 00:56:29+01:00 HOST ALERT: host.example.com;DOWN;SOFT;1;UNKNOWN - Plugin timed out
2022-08-15 00:56:38+01:00 SERVICE ALERT: host.example.com;Samba;UNKNOWN;HARD;1;UNKNOWN - Plugin timed out
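
As a quick check on the timing, this small Python sketch computes the gap between the two entries above and pulls out the state types. The parse helper is just an illustration of the log layout, not a Nagios tool.

```python
from datetime import datetime

host_alert = ("2022-08-15 00:56:29+01:00 HOST ALERT: "
              "host.example.com;DOWN;SOFT;1;UNKNOWN - Plugin timed out")
service_alert = ("2022-08-15 00:56:38+01:00 SERVICE ALERT: "
                 "host.example.com;Samba;UNKNOWN;HARD;1;UNKNOWN - Plugin timed out")


def parse(line):
    """Split a log line into (timestamp, entry type, semicolon-separated fields)."""
    date, time, rest = line.split(" ", 2)
    kind, _, payload = rest.partition(": ")
    return datetime.fromisoformat(f"{date} {time}"), kind, payload.split(";")


host_when, _, host_fields = parse(host_alert)
svc_when, _, svc_fields = parse(service_alert)

gap_seconds = (svc_when - host_when).total_seconds()
host_state_type = host_fields[2]   # SOFT or HARD for the host
svc_state_type = svc_fields[3]     # SOFT or HARD for the service

print(gap_seconds, host_state_type, svc_state_type)  # 9.0 SOFT HARD
```

The service result was logged 9 seconds after the host went DOWN (SOFT). The log alone does not show whether that service check was started before or after the host was marked DOWN.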
rosydam
Posts: 2
Joined: Sun Aug 27, 2023 10:12 pm

Re: Service notification - why?

Post by rosydam »

invade wrote: Mon Aug 15, 2022 3:44 am Just to add that I have now enabled the “host_down_disable_service_checks” option, but I can still see service checks being run (and generating an alert) when the host is down, e.g.:

Code:

2022-08-15 00:56:29+01:00 HOST ALERT: host.example.com;DOWN;SOFT;1;UNKNOWN - Plugin timed out
2022-08-15 00:56:38+01:00 SERVICE ALERT: host.example.com;Samba;UNKNOWN;HARD;1;UNKNOWN - Plugin timed out

Why has a year passed without any answers? Have you resolved this issue yet?