Chk services after host is up for x min (or other solution)

Posted: Wed Oct 22, 2014 4:10 am
by Dobi
Hello,

When a host goes into a “down” state and then comes back online, it gets the state “up” again, which allows Nagios to run service checks and send notifications.

The problem is that the service checks based on NSClient++ then produce the error “Return code of 255 is out of bounds”, because NSCP is not yet running at that point.

How can I delay the service checks for some time after the host is up again?

I already thought about choosing a higher “max_check_attempts” or “retry_check_interval”, but that is not possible for volatile checks, where “max_check_attempts” has to be set to 1.

Because I am monitoring many hosts, I get over 30 notifications a day with “Return code of 255 is out of bounds” (which makes Nagios unusable).

How would you handle this?

Greetings,
Cédric

Re: Chk services after host is up for x min (or other soluti

Posted: Wed Oct 22, 2014 3:15 pm
by slansing
You could set up a service check to check if the nscp service is running, and assign all your other checks as service dependencies on that one check.

http://nagios.sourceforge.net/docs/nagi ... dependency

Re: Chk services after host is up for x min (or other soluti

Posted: Mon Oct 27, 2014 2:15 am
by Dobi
slansing wrote:You could set up a service check to check if the nscp service is running, and assign all your other checks as service dependencies on that one check.

http://nagios.sourceforge.net/docs/nagi ... dependency
Yes, I took a look at those, but if I understood correctly, I would have to write a service dependency for every host (200), each of which has about 5 to 10 services that depend on nscp.exe?
So I would have to write between 1000 and 2000 service dependencies? Or is there a possibility to say something like "all hosts from a specific group are dependent on their own nscp.exe", or something in that direction? (Because those 1000-2000 definitions won't be manageable.)

Re: Chk services after host is up for x min (or other soluti

Posted: Mon Oct 27, 2014 4:46 pm
by abrist
Well, as you are just using core, a script that alters all your config programmatically may be called for. . .
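
As a rough illustration of that idea, here is a minimal Python sketch that writes out one explicit servicedependency definition per host and dependent service. The host names, the master service name "NSClient++ Running" and the failure criteria are placeholders, not taken from anyone's config in this thread, so adjust them to your setup. It only uses the standard dependency directives (host_name, service_description, dependent_host_name, dependent_service_description).

```python
"""Generate explicit Nagios servicedependency definitions (sketch)."""

# Hypothetical inputs: replace with your real host and service names.
HOSTS = ["winhost01", "winhost02"]
MASTER_SERVICE = "NSClient++ Running"          # assumed name of the NSCP check
DEPENDENT_SERVICES = ["CPU Load", "Memory Usage"]

# One definition per (host, dependent service) pair.
# The failure criteria below are an assumption; tune them as needed.
TEMPLATE = """\
define servicedependency{{
    host_name                       {host}
    service_description             {master}
    dependent_host_name             {host}
    dependent_service_description   {dependent}
    execution_failure_criteria      w,u,c
    notification_failure_criteria   w,u,c
    }}
"""


def render_dependencies(hosts, master, dependents):
    """Return the config text for all same-host dependencies."""
    blocks = []
    for host in hosts:
        for dependent in dependents:
            blocks.append(
                TEMPLATE.format(host=host, master=master, dependent=dependent)
            )
    return "\n".join(blocks)


if __name__ == "__main__":
    # Redirect the output into a .cfg file that Nagios includes.
    print(render_dependencies(HOSTS, MASTER_SERVICE, DEPENDENT_SERVICES))
```

Redirect the output into a file under a directory that nagios.cfg already includes (cfg_dir or cfg_file), then verify the config before reloading. Regenerating the file whenever hosts are added keeps the 1000-2000 definitions out of hand-edited config.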

Re: Chk services after host is up for x min (or other soluti

Posted: Fri Oct 31, 2014 1:25 pm
by dariopalermo
I'm facing the same problem today. While looking on the web, I found this:

http://nagios.sourceforge.net/docs/3_0/ ... ricks.html

And from that page:

---
Service Dependency Definitions
.
.
.
Same Host Dependencies With Servicegroups:
If you want to create service dependencies for all services that belong in one or more servicegroups on a service on the same host running the dependent service, leave the host_name and hostgroup_name directives empty. The example below assumes that hosts running services that belong in SERVICEGROUP1 and SERVICEGROUP2 have the following service associated with them: SERVICE1. In this example, all services that belong in SERVICEGROUP1 and SERVICEGROUP2 will be dependent on SERVICE1 on the same host running the dependent service.


define servicedependency{
	service_description		SERVICE1
	dependent_servicegroup_name	SERVICEGROUP1,SERVICEGROUP2
	other dependency directives ...
	}
---

However, I couldn't make it work. I've got a lab with just 2 hosts and 5 services per host, 4 of which depend on the NSClient++ version check. Nagios reports just 2 service dependencies (there should be 8). Setting up 1-to-1 dependencies works fine (but, as you already noted, it's a hell of a job for a large number of hosts/services).

Does that kind of configuration actually work?

Bye, Dario

Re: Chk services after host is up for x min (or other soluti

Posted: Fri Oct 31, 2014 2:03 pm
by sreinhardt
I realize this is probably just an example, but just to be sure, could you post a service and service group example of how you have implemented this, if this is not a direct example from your configs?

Re: Chk services after host is up for x min (or other soluti

Posted: Sat Nov 01, 2014 5:10 am
by dariopalermo
This is the actual config:

define service{
	use			generic-service
	hostgroup_name		windows-servers
	service_description	NSClient++ Version
	check_command		check_nt!CLIENTVERSION
	}

define service{
	use			generic-service
	hostgroup_name		windows-servers
	service_description	Uptime
	check_command		check_nt!UPTIME
	servicegroups		checknt_svcs
	}

define service{
	use			generic-service
	hostgroup_name		windows-servers
	service_description	CPU Load
	check_command		check_nt!CPULOAD!-l 5,80,90
	servicegroups		checknt_svcs
	}

define service{
	use			generic-service
	hostgroup_name		windows-servers
	service_description	Memory Usage
	check_command		check_nt!MEMUSE!-w 80 -c 90
	servicegroups		checknt_svcs
	}

define service{
	use			generic-service
	hostgroup_name		windows-servers
	service_description	C:\ Drive Space
	check_command		check_nt!USEDDISKSPACE!-l c -w 80 -c 90
	servicegroups		checknt_svcs
	}

define servicegroup{
	servicegroup_name	checknt_svcs
	alias			Check_nt based services
	}

define host{
	use		windows-server	; Inherit default values from a template
	host_name	srv_hansel	; The name we're giving to this host
	alias		Hansel	; A longer name associated with the host
	address		10.10.250.17	; IP address of the host
	}

define host{
	use		windows-server	; Inherit default values from a template
	host_name	srv_gretel	; The name we're giving to this host
	alias		Gretel	; A longer name associated with the host
	address		10.10.250.18	; IP address of the host
	}

define hostgroup{
	hostgroup_name	windows-servers	; The name of the hostgroup
	alias		Windows Servers	; Long name of the group
	}

define servicedependency{
	service_description		NSClient++ Version
	dependent_servicegroup_name	checknt_svcs
	execution_failure_criteria	n
	notification_failure_criteria	w,u,c
	}
Bye, Dario

Re: Chk services after host is up for x min (or other soluti

Posted: Mon Nov 03, 2014 5:32 pm
by abrist
What version of core are both of you running?

Re: Chk services after host is up for x min (or other soluti

Posted: Mon Nov 03, 2014 7:35 pm
by dariopalermo
I'm running 4.0.8, fresh install.

Bye, Dario

Re: Chk services after host is up for x min (or other soluti

Posted: Tue Nov 04, 2014 5:43 pm
by tmcdonald
Does it work with the missing host_name and dependent_host_name parameters filled in directly?

http://nagios.sourceforge.net/docs/3_0/ ... dependency