Re-check with another nagios server if critical ?

Mykeul
Posts: 5
Joined: Sun Oct 30, 2011 11:41 am

Re-check with another nagios server if critical ?

Post by Mykeul »

Hello,

I have a central nagios server, with 3 gearman workers.
When a check is "critical", I would like to check a second time with another worker (or nagios), to be sure it is not a local problem.

Is this possible ?

Thanks for your help
Mykeul
Posts: 5
Joined: Sun Oct 30, 2011 11:41 am

Re: Re-check with another nagios server if critical ?

Post by Mykeul »

Hello,

I am amazed that no one seems to do this ... it is important, isn't it ?
In a globally distributed architecture, one poller can lose its link to the monitored host while the host is still up for everyone else on earth.

Please help :)

M
jsmurphy
Posts: 989
Joined: Wed Aug 18, 2010 9:46 pm

Re: Re-check with another nagios server if critical ?

Post by jsmurphy »

I think the problem is you need to further explain your setup... I don't know what gearman is, how you have set it up or what your requirements are.

Taking a wild stab in the dark, you could set up a different check for each of the workers and then use service dependencies http://nagios.sourceforge.net/docs/3_0/ ... dependency to tell it to suppress alerts when the other checks are ok. Hopefully this helps :)
Mykeul
Posts: 5
Joined: Sun Oct 30, 2011 11:41 am

Re: Re-check with another nagios server if critical ?

Post by Mykeul »

Hello,

Thanks for replying. I did not realize that my question was not explained well enough, and you are right.
In fact, forget the gearman workers, those are only nagios pollers.

To simplify, let's say I have 1 nagios in China and 1 in USA. I am monitoring a server located in France.
Due to worldwide network latencies/problems, the China nagios sometimes says the France server is down, while the USA nagios says it is OK. In reality, the France server is OK.

So, I would like the China server (or a master, or whatever) to ask the USA server (or a master, or whatever) to check the France server, and then change the state to critical (and notify) only when the 2 nagios say it is down.
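
To make the rule concrete, it can be written down as a small function. This is only a sketch: the name `confirmed_state` is made up for illustration, and this is not an existing Nagios feature. The numbers are the standard plugin return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def confirmed_state(local, remote):
    """Combine the results from two pollers (hypothetical helper).

    CRITICAL is reported only when both pollers return CRITICAL.
    If the second poller disagrees, its result is used instead, on the
    assumption that the first failure was a local network problem.
    """
    if local != CRITICAL:
        # The first poller did not see a failure, so no re-check is needed.
        return local
    return remote


if __name__ == "__main__":
    # China says CRITICAL, USA says OK: the France server is treated as OK.
    print(confirmed_state(CRITICAL, OK))
    # Both say CRITICAL: the state is CRITICAL.
    print(confirmed_state(CRITICAL, CRITICAL))
```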

I don't mind changing the plugins/NEBs/scripts to fit the need; it is important.

Thanks for your help

Mykeul
jsmurphy
Posts: 989
Joined: Wed Aug 18, 2010 9:46 pm

Re: Re-check with another nagios server if critical ?

Post by jsmurphy »

I think my preferred solution in this instance would be simply to improve the fault tolerance: require more retry checks, or wait longer between retry checks, before the problem is treated as real. But that really depends on how fast you need to react if there is a problem.
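
As a rough guide to that trade-off: a failing service is re-checked every `retry_interval` until `max_check_attempts` checks have failed, and only then does it reach a hard state and notify. The helper below is a hypothetical illustration of the arithmetic, assuming the default `interval_length` of 60 seconds (so intervals are in minutes) and ignoring the time the checks themselves take.

```python
def minutes_until_hard_state(max_check_attempts, retry_interval):
    """Minutes between the first failed check and the hard state.

    The first failure counts as attempt 1, so only
    (max_check_attempts - 1) retries still have to run.
    """
    if max_check_attempts < 1:
        raise ValueError("max_check_attempts must be at least 1")
    return (max_check_attempts - 1) * retry_interval


# 3 attempts, retry every minute: about 2 minutes after the first failure.
print(minutes_until_hard_state(3, 1))
# 5 attempts, retry every 2 minutes: about 8 minutes.
print(minutes_until_hard_state(5, 2))
```

More attempts or a longer interval filter out short network glitches, at the cost of a later notification.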

There is no easy way of accomplishing what you want, and there isn't even a good way that I know of. If you really wanted to do this, you could potentially jury rig a solution using NSCA, a passive service and event handlers... but you would probably be over-engineering a solution for a problem that may not really need it.
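
For what it's worth, a very rough sketch of that jury rig could look like the following. Everything here is an assumption rather than a tested setup: the function names are invented, `recheck_cmd` stands for whatever command runs the check from the second location, and `service` is assumed to be a passive service defined on the server that receives the NSCA result. The only fixed parts are the `$SERVICESTATE$` / `$SERVICESTATETYPE$` values and the tab-separated input that `send_nsca` reads on standard input.

```python
import subprocess


def should_recheck(state, state_type):
    """Ask for a second opinion only while the problem is still SOFT."""
    return state == "CRITICAL" and state_type == "SOFT"


def nsca_line(host, service, return_code, output):
    """Build one passive service result in send_nsca's input format:
    host, service, return code and output, separated by tabs."""
    return f"{host}\t{service}\t{return_code}\t{output}\n"


def handle_event(state, state_type, host, service, recheck_cmd, nsca_server):
    """Event handler body (hypothetical).

    recheck_cmd is a list of arguments for the command that runs the
    check from the second location; how that is done is left open.
    """
    if not should_recheck(state, state_type):
        return None
    result = subprocess.run(recheck_cmd, capture_output=True, text=True)
    lines = result.stdout.strip().splitlines()
    output = lines[0] if lines else "no output"
    line = nsca_line(host, service, result.returncode, output)
    # send_nsca reads the result from standard input.
    subprocess.run(["send_nsca", "-H", nsca_server],
                   input=line, text=True, check=True)
    return line
```

The event handler command would pass the state macros, host name and service description as arguments to a small script that calls `handle_event`. Notifications would then be sent from the passive service rather than from the original check.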