Chk services after host is up for x min (or other solution)
Posted: Wed Oct 22, 2014 4:10 am
by Dobi
Hello,
When a host goes into a “down” state and then comes back online, it returns to the “up” state, which allows Nagios to run service checks and send notifications again.
The problem is that the service checks based on NSClient++ then fail with the error “Return code of 255 is out of bounds” because NSCP is not yet running.
How can I delay the service checks for a while after the host is up again?
I already thought about choosing a higher “max_check_attempts” or “retry_check_interval”, but that is not possible for volatile checks, where “max_check_attempts” has to be set to 1.
Because I am monitoring many hosts, I get over 30 notifications a day with “Return code of 255 is out of bounds” (which makes Nagios unusable).
How would you handle this?
Greetings,
Cédric
Re: Chk services after host is up for x min (or other soluti
Posted: Wed Oct 22, 2014 3:15 pm
by slansing
You could set up a service check to check if the nscp service is running, and assign all your other checks as service dependencies on that one check.
http://nagios.sourceforge.net/docs/nagi ... dependency
Re: Chk services after host is up for x min (or other soluti
Posted: Mon Oct 27, 2014 2:15 am
by Dobi
Yes, I took a look at those, but if I understood correctly, I would have to write a service dependency for every service that depends on nscp.exe, on every host (200 hosts, with about 5 to 10 such services each)?
So I would have to write between 1000 and 2000 service dependencies? Or is there a way to say something like "all hosts in a specific group depend on their own nscp.exe", or something along those lines? (Those 1000-2000 definitions would not be manageable.)
Re: Chk services after host is up for x min (or other soluti
Posted: Mon Oct 27, 2014 4:46 pm
by abrist
Well, as you are just using Core, a script that alters all your config programmatically may be called for...
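A minimal sketch of what such a script could look like, assuming one explicit servicedependency definition per host and dependent service. The host names, the service descriptions, and the failure criteria below are placeholders rather than anything taken from a real setup, so adjust them before use:

```python
# Sketch: generate one-to-one servicedependency definitions for Nagios Core.
# All host names, service descriptions and criteria are hypothetical
# placeholders; replace them with the values from your own configuration.

MASTER_SERVICE = "NSClient++ Version"  # assumed name of the NSClient++ check


def render_dependency(host, master, dependent):
    """Return one servicedependency block making `dependent` depend on
    `master`, both on the same host."""
    return "\n".join([
        "define servicedependency{",
        f"    host_name                       {host}",
        f"    service_description             {master}",
        f"    dependent_host_name             {host}",
        f"    dependent_service_description   {dependent}",
        "    execution_failure_criteria      w,u,c",  # assumption
        "    notification_failure_criteria   w,u,c",  # assumption
        "    }",
    ])


def render_all(hosts, dependents, master=MASTER_SERVICE):
    """Return the blocks for every (host, dependent service) pair."""
    blocks = [render_dependency(h, master, d) for h in hosts for d in dependents]
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    hosts = ["winhost01", "winhost02"]        # placeholder host names
    dependents = ["CPU Load", "Memory Usage"]  # placeholder services
    print(render_all(hosts, dependents))
```

The output can be redirected into a separate .cfg file that Nagios loads, so the generated definitions stay apart from the hand-written ones and can be regenerated whenever hosts or services change.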
Re: Chk services after host is up for x min (or other soluti
Posted: Fri Oct 31, 2014 1:25 pm
by dariopalermo
I'm facing the same problem today. While looking on the web, I found this:
http://nagios.sourceforge.net/docs/3_0/ ... ricks.html
And from that page:
---
Service Dependency Definitions
.
.
.
Same Host Dependencies With Servicegroups:
If you want to create service dependencies for all services that belong in one or more servicegroups on a service on the same host running the dependent service, leave the host_name and hostgroup_name directives empty. The example below assumes that hosts running services that belong in SERVICEGROUP1 and SERVICEGROUP2 have the following service associated with them: SERVICE1. In this example, all services that belong in SERVICEGROUP1 and SERVICEGROUP2 will be dependent on SERVICE1 on the same host running the dependent service.
define servicedependency{
    service_description             SERVICE1
    dependent_servicegroup_name     SERVICEGROUP1,SERVICEGROUP2
    other dependency directives ...
    }
---
However, I could not make it work. I have a lab with just 2 hosts and 5 services per host, 4 of which depend on the NSClient++ version check. Nagios reports just 2 service dependencies (there should be 8). Setting up one-to-one dependencies works fine (but, as you already noted, it's a hell of a job for a large number of hosts and services).
Does that kind of configuration actually work?
Bye, Dario
Re: Chk services after host is up for x min (or other soluti
Posted: Fri Oct 31, 2014 2:03 pm
by sreinhardt
I realize this is probably just an example, but just to be sure: if it is not taken directly from your configs, could you post the service and service group definitions showing how you have implemented this?
Re: Chk services after host is up for x min (or other soluti
Posted: Sat Nov 01, 2014 5:10 am
by dariopalermo
This is the actual config:
define service{
    use                     generic-service
    hostgroup_name          windows-servers
    service_description     NSClient++ Version
    check_command           check_nt!CLIENTVERSION
    }

define service{
    use                     generic-service
    hostgroup_name          windows-servers
    service_description     Uptime
    check_command           check_nt!UPTIME
    servicegroups           checknt_svcs
    }

define service{
    use                     generic-service
    hostgroup_name          windows-servers
    service_description     CPU Load
    check_command           check_nt!CPULOAD!-l 5,80,90
    servicegroups           checknt_svcs
    }

define service{
    use                     generic-service
    hostgroup_name          windows-servers
    service_description     Memory Usage
    check_command           check_nt!MEMUSE!-w 80 -c 90
    servicegroups           checknt_svcs
    }

define service{
    use                     generic-service
    hostgroup_name          windows-servers
    service_description     C:\ Drive Space
    check_command           check_nt!USEDDISKSPACE!-l c -w 80 -c 90
    servicegroups           checknt_svcs
    }

define servicegroup{
    servicegroup_name       checknt_svcs
    alias                   Check_nt based services
    }

define host{
    use                     windows-server  ; Inherit default values from a template
    host_name               srv_hansel      ; The name we're giving to this host
    alias                   Hansel          ; A longer name associated with the host
    address                 10.10.250.17    ; IP address of the host
    }

define host{
    use                     windows-server  ; Inherit default values from a template
    host_name               srv_gretel      ; The name we're giving to this host
    alias                   Gretel          ; A longer name associated with the host
    address                 10.10.250.18    ; IP address of the host
    }

define hostgroup{
    hostgroup_name          windows-servers ; The name of the hostgroup
    alias                   Windows Servers ; Long name of the group
    }

define servicedependency{
    service_description             NSClient++ Version
    dependent_servicegroup_name     checknt_svcs
    execution_failure_criteria      n
    notification_failure_criteria   w,u,c
    }
Bye, Dario
Re: Chk services after host is up for x min (or other soluti
Posted: Mon Nov 03, 2014 5:32 pm
by abrist
What version of core are both of you running?
Re: Chk services after host is up for x min (or other soluti
Posted: Mon Nov 03, 2014 7:35 pm
by dariopalermo
I'm running 4.0.8, fresh install.
Bye, Dario
Re: Chk services after host is up for x min (or other soluti
Posted: Tue Nov 04, 2014 5:43 pm
by tmcdonald
Does it work with the missing host_name and dependent_host_name parameters filled in directly?
http://nagios.sourceforge.net/docs/3_0/ ... dependency