I'm using Nagios to monitor some VMware vCenter Servers in a production environment, using this Perl plugin: http://www.op5.org/community/plugin-inv ... esx-plugin. The checks (CPU, MEM, NET, RUNTIME, IO) work fine for a couple of hosts, but once I monitor too many servers I start having problems with the checks. The plugin simply generates too many connections, and to work around that I think I need connection persistence between checks.
From what I've gathered, the actual problem is that even though the SDK connection routines (which the plugin uses) were made to look for open connections before opening new ones, they can only find connections created within the same running process. And as we all know, Nagios fork()s every time it runs an active check (twice by default), so in the end the plugin creates a new connection for every check on every host.
From what I've read I have these options:
- Transforming the plugin into a daemon and making these checks passive (for now it looks like my first option)
- Modifying the plugin, and using some stuff from CPAN to achieve IPC (likely to work, but dirty as hell)
- Managing check scheduling in such a way that this situation never happens (like setting max_concurrent_checks to 1, but I fear that would slow everything down)
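
To make the first option concrete, here is a rough sketch of what I have in mind, written in Python rather than Perl just to show the shape. One long-running process keeps a single session open, polls every host through it, and writes the results to the Nagios external command file as PROCESS_SERVICE_CHECK_RESULT lines. FakeSession, poll_once, run_daemon and the command_file parameter are all hypothetical names; the real thing would wrap the SDK calls the plugin already makes, and the service names would have to match the service descriptions defined in Nagios.

```python
import time


def passive_result_line(host, service, code, output, now=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line."""
    ts = int(time.time() if now is None else now)
    return f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{code};{output}\n"


class FakeSession:
    """Hypothetical stand-in for one long-lived vCenter SDK session."""

    def __init__(self):
        self.logins = 0          # how many times we actually logged in
        self.connected = False

    def ensure_connected(self):
        # Log in only if there is no live session; this is the persistence.
        if not self.connected:
            self.logins += 1
            self.connected = True

    def query(self, host, metric):
        # Assumption: a real version would ask vCenter for the metric here.
        self.ensure_connected()
        return 0, f"{metric} OK (placeholder value)"


def poll_once(session, hosts, metrics, now=None):
    """Run every check over the same session and return the result lines."""
    lines = []
    for host in hosts:
        for metric in metrics:
            code, output = session.query(host, metric)
            lines.append(passive_result_line(host, metric, code, output, now))
    return lines


def run_daemon(session, hosts, metrics, command_file, interval=300):
    """Poll forever, writing each batch to the Nagios command file (a named pipe)."""
    while True:
        with open(command_file, "w") as pipe:
            pipe.writelines(poll_once(session, hosts, metrics))
        time.sleep(interval)
```

With this layout, polling two hosts for CPU and MEM gives four result lines but only one login, instead of one connection per check. As far as I understand, Nagios would also need external commands enabled and the services set to accept passive checks.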
Thanks!