Help with service dependencies.
-
- Posts: 12
- Joined: Fri Feb 06, 2015 3:58 pm
Help with service dependencies.
Hi everyone, I have what I think is a small issue that I can't figure out. I think I know what the problem is, but I'll get into that shortly. I am having trouble setting up my service dependency. Here is what I have in my dependecies.cfg file:
define servicedependency{
    host_name                       server01
    service_description             service_Tomcat
    dependent_host_name             server01
    dependent_service_description   service01,service02,service03,service04
    execution_failure_criteria      w,u,c
    notification_failure_criteria   w,u,c
}
The services (service01, service02, etc.) are dependent on service_Tomcat. They are URLs. If Tomcat goes down, we know that the four services are unreachable. However, the dependency doesn't work when I stop Tomcat.
I've noticed in Nagios that when the services are retrying, they reach their max check attempts and then send a notification. So I figured I could just reduce the number of attempts service_Tomcat makes. However, I don't know where to make that change, as I can't find it anywhere, so I'm assuming that Nagios will still treat it as a regular service. I've tried making changes in the templates file and our main Nagios file, but it doesn't do anything.
I was thinking that if the services reached their max attempts while service_Tomcat was still checking, it would still at least suppress the notifications. I think there is a 'p' option for execution_failure_criteria that I am going to try now, but otherwise I have no clue what I am missing or overlooking. Thanks in advance.
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
- Contact:
Re: Help with service dependencies.
Your configuration is correct.
The problem lies in your service_Tomcat.
This service needs to enter a HARD state before the dependencies take effect.
For example if your service_Tomcat was:
check_interval 5
retry_interval 1
max_check_attempts 5
AND your other services were:
check_interval 1
retry_interval 1
max_check_attempts 5
Then it would take service_Tomcat up to 9 minutes before it enters a HARD state. During this time the other services will continue to execute.
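The 9-minute figure can be checked with a little arithmetic. The sketch below is only an illustration: the function name is made up for this post, and it assumes the default interval_length of 60 seconds (so intervals are in minutes) and that the outage begins right after a successful check.

```python
def worst_case_minutes_to_hard(check_interval, retry_interval, max_check_attempts):
    """Upper bound on the minutes between an outage starting and the HARD state.

    Hypothetical helper for illustration; it is not part of Nagios.
    """
    # Worst case: the outage starts just after an OK check, so the first
    # failing check (SOFT, attempt 1) only runs one full check_interval later.
    wait_for_first_failure = check_interval
    # Every remaining attempt is scheduled one retry_interval after the last.
    retries = (max_check_attempts - 1) * retry_interval
    return wait_for_first_failure + retries


# Values from the example above.
tomcat = worst_case_minutes_to_hard(check_interval=5, retry_interval=1, max_check_attempts=5)
dependents = worst_case_minutes_to_hard(check_interval=1, retry_interval=1, max_check_attempts=5)

print("service_Tomcat HARD after at most", tomcat, "minutes")
print("dependent services HARD after at most", dependents, "minutes")
```

With these numbers the dependent services can reach their own HARD state (and notify) several minutes before service_Tomcat does, which is why the dependency appears not to work.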
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
-
- Posts: 12
- Joined: Fri Feb 06, 2015 3:58 pm
Re: Help with service dependencies.
Thanks for the reply. I had monitored the logs to see what was happening, and I knew it had something to do with the hard/soft states, but I wasn't sure where to edit that. I actually edited the file I needed to change, but I guess I should have set it to 1. I'll give that a try now, see what happens, and let you know. Thanks again.
-
- Posts: 12
- Joined: Fri Feb 06, 2015 3:58 pm
Re: Help with service dependencies.
I'm actually having a bit of difficulty with where to make the change. Well, I think I do know where to make the change, but I think there is some conflict. I tried this:
extra_service_conf["max_check_attempts"] = [
( "5", ALL_HOSTS, ALL_SERVICES ),
( "3", ["server01"], ["service_Tomcat"] )
]
and while the Nagios configuration compiles successfully, it doesn't actually update the value for service_Tomcat on server01.
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
- Contact:
Re: Help with service dependencies.
The settings for service_Tomcat need to be defined specifically in the service definition.
For example (this is taken from a Nagios XI box but the settings are the same):
define service {
    host_name               10.25.14.2
    service_description     CPU Usage
    use                     xiwizard_windowswmi_service
    check_command           check_xi_service_wmiplus!'yyyyy'!'xxxxx'!checkcpu!-w '80' -c '90'!!!!
    max_check_attempts      5
    check_interval          5
    retry_interval          1
    check_period            xi_timeperiod_24x7
    notification_interval   60
    notification_period     xi_timeperiod_24x7
    contacts                nagiosadmin
    notes_url               http://notes.com
    _xiwizard               windowswmi
    register                1
}
You want to change these to the values you want:
    max_check_attempts      5
    check_interval          5
    retry_interval          1
extra_service_conf appears to be a check_mk thing, so you will need to go to the check_mk support forums to get answers on that.
-
- Posts: 12
- Joined: Fri Feb 06, 2015 3:58 pm
Re: Help with service dependencies.
Yeah, I've tried editing the file that contains the 'define service ...' block, but main.mk overwrites whatever I did there and replaces it with what main.mk has. I will post in the check_mk forum to see if they can help. Thanks.
Re: Help with service dependencies.
Thanks, keep us informed on the status.
-
- Posts: 12
- Joined: Fri Feb 06, 2015 3:58 pm
Re: Help with service dependencies.
Hi, could you redirect me to the check_mk support forums? I took a look at all the forums, and I am unsure which one to post in.
This task has been reduced to low priority since it is taking a lot of time and it isn't essential, but I know I'm really close to getting this fixed, as I have spent many hours on it. Thanks.
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
- Contact:
Re: Help with service dependencies.
It looks like you're going to need to join a mailing list:
http://mathias-kettner.com/check_mk_lists.html
-
- Posts: 12
- Joined: Fri Feb 06, 2015 3:58 pm
Re: Help with service dependencies.
I got this figured out, as seen with the config below:
extra_service_conf["max_check_attempts"] = [
( "2", ["server01"], ["service_Tomcat"] ),
( "3", ALL_HOSTS, ALL_SERVICES )
]
extra_service_conf["normal_check_interval"] = [
( "1", ["server01"], [ "service_Tomcat"] ),
( "5", ALL_HOSTS, ALL_SERVICES )
]
extra_service_conf["retry_check_interval"] = [
( "1", ["server01"], ["service_Tomcat"] ),
( "2", ALL_HOSTS, ALL_SERVICES )
]
The mailing list was able to help me with that, and I can see the changed max check attempts value in the Nagios web GUI. However, it still seems that the dependent services manage to hit their max retries with this config. I'm lost now because the config looks right to me. The mailing list is no longer helping, so I'm hoping someone here may be able to point me in the right direction.
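For anyone wondering why the order of the entries mattered, here is a rough sketch of my understanding. It assumes check_mk uses the first entry in the list that matches the host and service, which is consistent with the reordering that fixed it for me. The first_match function and the ALL_HOSTS / ALL_SERVICES stand-ins below are made up for illustration and are not check_mk's real code; real check_mk also matches service names by pattern, while this sketch only does exact matches.

```python
# Stand-in markers for this sketch only; check_mk defines its own constants.
ALL_HOSTS = object()
ALL_SERVICES = object()


def first_match(rules, host, service):
    """Return the value of the first rule that matches host and service."""
    for value, hosts, services in rules:
        host_ok = hosts is ALL_HOSTS or host in hosts
        service_ok = services is ALL_SERVICES or service in services
        if host_ok and service_ok:
            return value
    return None  # no rule matched


# My first attempt: the catch-all entry comes first and shadows the specific one.
first_attempt = [
    ("5", ALL_HOSTS, ALL_SERVICES),
    ("3", ["server01"], ["service_Tomcat"]),
]

# The working order: the specific entry comes before the catch-all.
working = [
    ("2", ["server01"], ["service_Tomcat"]),
    ("3", ALL_HOSTS, ALL_SERVICES),
]

print(first_match(first_attempt, "server01", "service_Tomcat"))
print(first_match(working, "server01", "service_Tomcat"))
print(first_match(working, "server01", "service01"))
```

Under that assumption, the specific entry for service_Tomcat has to come before the ALL_HOSTS / ALL_SERVICES entry, otherwise it is never reached.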