Distributed monitoring

Posted: Wed May 21, 2014 11:04 am
by rentsys
I have three Nagios servers, all with the 2014 update, and I was wondering how to have two of them monitor their local environments and then send all the passive checks to the third Nagios server. I tried to use NRDP, but if one of the Nagios servers sends a passive check with a warning status, the central server treats it as a critical status and notifies contacts. The passive check doesn't go through the max check attempts. Is there a better way to set up this system?

Re: Distributed monitoring

Posted: Wed May 21, 2014 11:27 am
by slansing
Well, the main server would not actually be running any checks, so I'd recommend disabling notifications on it and using the two standalone servers to notify; that way you won't get the false notifications problem. You can use either NRDP or NSCA to send results up to the central server. I suggest trying both and seeing which you prefer, though they will be virtually identical as far as I know.

http://assets.nagios.com/downloads/nagi ... ith_XI.pdf
http://assets.nagios.com/downloads/nagi ... ith_XI.pdf
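
For reference, whichever transport you pick, the result that reaches Nagios Core carries the same four pieces of information as Core's own PROCESS_SERVICE_CHECK_RESULT external command: host, service, return code, and plugin output. The sketch below only builds that line so you can see where a WARNING (return code 1) sits; the helper name, the host "web01", and the service "HTTP" are made up for illustration, and how NRDP or NSCA hand the result to Core internally is not shown here.

```shell
# Hypothetical helper: build a passive service check result in the
# external-command format used by Nagios Core.
# $1 = host name, $2 = service description,
# $3 = return code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), $4 = plugin output
format_passive_service_result() {
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$(date +%s)" "$1" "$2" "$3" "$4"
}

# Example call (hypothetical host and service names):
# format_passive_service_result web01 HTTP 1 "WARNING - slow response"
```

If a warning shows up as critical on the central server, comparing the return code in the submitted result against the table in the comment is a reasonable first check.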

Re: Distributed monitoring

Posted: Wed May 21, 2014 11:56 am
by rentsys
Sorry, I forgot to mention that the central Nagios server, the one the other two Nagios servers are sending passive checks to, actively monitors its own environment. The real problem I am facing is getting passive checks to go through the max check attempts, rather than just notifying immediately.

*edit

Would I be able to use DNX to set this up? Can I assign hosts and checks to a specific server with DNX? What I am trying to accomplish would work if that was possible.

Re: Distributed monitoring

Posted: Thu May 22, 2014 10:50 am
by tmcdonald
Passive host results are treated as HARD states by default, thereby ignoring the max_check_attempts value. Setting "passive_host_checks_are_soft=1" in the nagios config will make them be treated as SOFT states and they should respect the max_check_attempts. Passive service checks I believe are always SOFT.
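
A quick way to confirm which value is actually set is to read the directive back from the main config file. This is only a sketch: the helper name is made up, and the path in the usage note assumes a default source install, so adjust it to your setup.

```shell
# Hypothetical helper: print the last uncommented
# passive_host_checks_are_soft line found in the given config file.
show_passive_host_soft() {
    cfg="$1"
    grep -E '^[[:space:]]*passive_host_checks_are_soft[[:space:]]*=' "$cfg" | tail -n 1
}
```

Usage would be something like `show_passive_host_soft /usr/local/nagios/etc/nagios.cfg`. If nothing is printed, the directive is not set in that file. Nagios needs to be restarted after changing it.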

Re: Distributed monitoring

Posted: Thu May 22, 2014 10:57 am
by rentsys
I have that enabled already. What if that is only for host checks? Does it treat service checks as soft?

Re: Distributed monitoring

Posted: Thu May 22, 2014 11:05 am
by lmiltchev
On the DNX - here's our documentation:

http://assets.nagios.com/downloads/nagi ... ng_DNX.pdf

To be honest with you - it's been a while since I set up DNX. I am not sure if it will do the job for you. This project hasn't been maintained for a while and it's somewhat outdated. You can give it a try though - it's pretty easy to set it up. If this doesn't work as expected, you may try Mod Gearman. Hope this helps.

Re: Distributed monitoring

Posted: Wed May 28, 2014 12:44 pm
by rentsys
I read the description for Mod Gearman and it seemed to be more of what I wanted to implement. I went to https://github.com/sni/mod_gearman/tree/nagios4 because I have the 2014 production release, but I ran into a problem at the "Build/install Gearmand rpms" step. When I try to install boost141-devel, it errors out.

Code: Select all

--> Finished Dependency Resolution
Error: Package: boost141-graph-1.41.0-2.el5.x86_64 (puias-computational)
           Requires: libicuuc.so.36()(64bit)
Error: Package: boost141-regex-1.41.0-2.el5.x86_64 (puias-computational)
           Requires: libicuuc.so.36()(64bit)
Error: Package: boost141-graph-1.41.0-2.el5.x86_64 (puias-computational)
           Requires: libicui18n.so.36()(64bit)
Error: Package: boost141-regex-1.41.0-2.el5.x86_64 (puias-computational)
           Requires: libicui18n.so.36()(64bit)
I checked where libicuuc.so.36 is supposed to be, and what is there is a newer version of that file, libicuuc.so.42. What should I do from here?

Re: Distributed monitoring

Posted: Wed May 28, 2014 12:51 pm
by slansing
I strongly suggest you use our documentation as they may have changed theirs in recent months:

http://support.nagios.com/forum/viewtop ... n&start=10

Try the script I attached on the second page.

Note: That is for XI 2014 / Core 4 specifically. If you are using XI 2012 / Core 3, please use the documentation listed in our library.

Re: Distributed monitoring

Posted: Wed May 28, 2014 12:56 pm
by rentsys
I can't see that page. Can you attach that script here?
*edit
I was able to find that page by digging through Google cache http://webcache.googleusercontent.com/s ... clnk&gl=us, but I'm guessing Google doesn't have permission to download those files. So I still need your help with getting that script.
*edit
I think I don't have permission because the link goes to the customer forum.

Re: Distributed monitoring

Posted: Thu May 29, 2014 10:16 am
by sreinhardt
Here you go, along with some notes from slansing:

Currently, until we find a better way, you will also need to update your performance data commands in the CCM in order to get perf data pushed from the workers up to XI and displayed properly.

You will need to change the commands for process-host-perfdata-file-bulk and process-service-perfdata-file-bulk to:

Code: Select all

    sed -i 's/\\n//g' /usr/local/nagios/var/host-perfdata && /bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/xidpe/$TIMET$.perfdata.host
And:

Code: Select all

    sed -i 's/\\n//g' /usr/local/nagios/var/service-perfdata && /bin/mv /usr/local/nagios/var/service-perfdata /usr/local/nagios/var/spool/xidpe/$TIMET$.perfdata.service
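
To see what the sed expression in those two commands does on its own: it deletes every literal backslash-n (the two characters, not a real line break) before the file is moved into the spool directory. The sample string below is made up purely for illustration and is not real perfdata.

```shell
# Made-up sample containing a literal backslash followed by "n".
sample='first\nsecond'

# Same expression as in the commands above, run on stdin instead of in place.
cleaned=$(printf '%s\n' "$sample" | sed 's/\\n//g')

printf '%s\n' "$cleaned"   # prints: firstsecond
```

Real newlines between records are left alone, since sed works line by line and the pattern only matches the escaped form.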