Adding a new service check without restarting Nagios

Locked
rtoma
Posts: 1
Joined: Tue Feb 28, 2017 3:35 am


Post by rtoma »

Hi,

We have a large Nagios3 + Puppet setup. We use a Python script that queries PuppetDB for Nagios resources and generates Nagios config files. A few times a day we restart Nagios to pick up the new files. Since we have ~100k service checks, Nagios goes blind for 1 to 2 minutes during a restart. That's bad :(
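
For readers unfamiliar with this kind of generator, the config-generation step could look roughly like the sketch below. This is not the script from the post: the resource dict layout, the function names, and the example values are all hypothetical, and the PuppetDB query itself is left out. It only shows resources being turned into standard Nagios `define service` blocks.

```python
from typing import Dict, List

# Directives emitted, in a fixed order so regenerated files diff cleanly.
# (Assumed subset; a real setup would likely carry more directives.)
DIRECTIVES = ["use", "host_name", "service_description", "check_command"]


def render_service(resource: Dict[str, str]) -> str:
    """Render one Nagios `define service` block from a resource dict."""
    lines = ["define service {"]
    for key in DIRECTIVES:
        if key in resource:
            # Left-align the directive name in a 24-character column.
            lines.append(f"    {key:<24}{resource[key]}")
    lines.append("}")
    return "\n".join(lines)


def render_config(resources: List[Dict[str, str]]) -> str:
    """Render all services, sorted so the output is deterministic."""
    ordered = sorted(
        resources,
        key=lambda r: (r["host_name"], r["service_description"]),
    )
    return "\n\n".join(render_service(r) for r in ordered) + "\n"


if __name__ == "__main__":
    # Hypothetical resources standing in for a PuppetDB query result.
    example = [
        {"use": "generic-service", "host_name": "web01",
         "service_description": "HTTP", "check_command": "check_http"},
        {"use": "generic-service", "host_name": "app01",
         "service_description": "SSH", "check_command": "check_ssh"},
    ]
    print(render_config(example))
```

Sorting the output keeps successive runs comparable, so a wrapper could skip the Nagios restart entirely when the generated file has not changed.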

We are using Docker / Mesos and lots of dynamic stuff, which we want to start monitoring as it happens. So we want to be able to change the Nagios config once a minute or, preferably, in real time. It seemed to me that Nagios3 will not cut it. People have advised us to migrate to Nagios4 or other monitoring solutions. That's probably what we will do, but I wanted to try something different.

So I have created a NEB module that adds a new service check without requiring a restart.

The proof-of-concept code is here: https://gist.github.com/rtoma/3fb1464de ... 5d9e3c0ad5

Is this something other Nagios developers would like to improve upon? It would be great to have a NEB module offering an HTTP API to add hosts / services at runtime.

I am wondering why no one (afaik) has tried something similar.

Regards,
Renzo
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Adding a new service check without restarting Nagios

Post by mcapra »

That's a very cool setup :)

If you'd like to see this sort of stuff make it into Nagios Core, you can always raise an issue on GitHub with your findings and suggestions. If this requires mklivestatus to function, it might be more appropriate to share your findings with that community:
https://mathias-kettner.de/checkmk_livestatus.html

It's a very cool thing, but mklivestatus and Thruk are a very different beast from regular old Nagios Core. That said, the functionality needed to add/remove objects without reloading the entire configuration set has been explored quite a bit.
Former Nagios employee
https://www.mcapra.com/