Adding a new service check without restarting Nagios

Posted: Wed Mar 01, 2017 7:35 am
by rtoma
Hi,

We have a large Nagios3 + Puppet setup. We use a Python script that queries PuppetDB for Nagios resources and generates Nagios config files. A few times a day we restart Nagios to pick up the new files. Since we have ~100k service checks, Nagios goes blind for 1 to 2 minutes during a restart. That's bad :(
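
A minimal sketch of what such a generator step might look like (this is not the actual script). It assumes the resources arrive as dicts with "type", "title" and "parameters" keys, as in a PuppetDB resources query, and that service checks use the "Nagios_service" type. The function names and the sample data are hypothetical.

```python
def render_service(resource):
    """Render one PuppetDB-style resource dict as a Nagios service definition."""
    params = resource["parameters"]
    lines = ["define service {"]
    # Sort the directives so an unchanged resource always renders identically.
    for key in sorted(params):
        lines.append(f"    {key:<24} {params[key]}")
    lines.append("}")
    return "\n".join(lines) + "\n"


def render_config(resources):
    """Render all Nagios_service resources; other resource types are skipped."""
    services = [r for r in resources if r.get("type") == "Nagios_service"]
    return "\n".join(render_service(r) for r in services)


# Hypothetical sample data, shaped like the result of a PuppetDB resources query.
sample = [
    {
        "type": "Nagios_service",
        "title": "web01_http",
        "parameters": {
            "host_name": "web01",
            "service_description": "HTTP",
            "check_command": "check_http",
            "use": "generic-service",
        },
    },
    {"type": "File", "title": "/etc/motd", "parameters": {"ensure": "file"}},
]

if __name__ == "__main__":
    print(render_config(sample))
```

Writing the result to a file under the Nagios config directory is the easy part. The expensive part is the restart that makes Nagios read it.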

We are using Docker / Mesos and lots of dynamic stuff, which we want to start monitoring as soon as it appears. So we want to be able to change the Nagios config once a minute, or preferably in real time. It seems to me that Nagios3 will not cut it. People have advised us to migrate to Nagios4 or other monitoring solutions. That's probably what we will do, but I wanted to try something different.

So I have created a NEB that adds a new service check without requiring a restart.

The proof-of-concept code is here: https://gist.github.com/rtoma/3fb1464de ... 5d9e3c0ad5

Is this something other Nagios developers would like to improve upon? It would be great to have a NEB module offering an HTTP API to add hosts / services at runtime.

I am wondering why no one (afaik) has tried something similar.

Regards,
Renzo

Re: Adding a new service check without restarting Nagios

Posted: Wed Mar 01, 2017 4:32 pm
by mcapra
That's a very cool setup :)

If you'd like to see this sort of stuff make it into Nagios Core, you can always raise an issue on GitHub with your findings and suggestions. If this requires mklivestatus to function, it might be more appropriate to share your findings with that community:
https://mathias-kettner.de/checkmk_livestatus.html

It's a very cool thing, but mklivestatus and Thruk are very different beasts from regular old Nagios Core. Adding/removing objects without reloading the entire configuration set is something that's been explored quite a bit, though.