Enterprise Monitoring Package

Post by **grenley** » Wed May 14, 2014 6:00 pm

Hi.

I am a total newbie.
We are doing a small XI trial to pursue replacing a large competing monitoring product with Nagios XI.
Just Linux for now (other platforms later).
I am trying to create a package of monitors that will be automatically connected to each Linux system as it becomes active in Nagios.
Things like processes (sshd, crond, etc.), filesystems (/var, /usr, etc.), logs (like /var/log/messages).
All system-specific stuff. Not one-off app stuff (for which I expect we may use NagiosQL).
I'm thinking of making a static cfg file for each type of monitor, but don't know how to hook it up to a host group in CCM (if that's even the right approach).
Are Service Groups a better idea?
I've been poring over docs, websites and videos for days and still don't really have a good sense of the best way to approach this.
We could end up with tens of thousands of servers (Nagios agents), so my approach has to be pretty automatic.

Thanks very much.
Rick

Post by **sreinhardt** » Thu May 15, 2014 11:37 am

Hey there! It sounds like you should take a peek at Activision's talk on dynamic monitoring of EC2 instances. While your exact situation may not be in the cloud, it certainly sounds like you are looking at about the same scale, and some of the ways they went about solving this issue should apply very directly. Otherwise, do you have a preference for agent vs. agentless monitoring, or particular types of checks you would like dynamically added? The more details you can provide, the more we can attempt to help!

http://www.youtube.com/watch?v=g1eiqTte1cQ

Post by **grenley** » Thu May 15, 2014 10:58 pm

Thanks very much Spenser.
I've now watched Todd's presentation a couple of times.
Lots of great architectural ideas but, for the moment, beyond my scope.
Also, his environment is far more dynamic (apparently for load-balancing) than mine.
I've got a couple of weeks to demonstrate how awesome Nagios is and how it can replace Brand X on tons of servers.
Here's the overall short-term scope:

One XI server (CentOS for now; RHEL later).
A small handful of Linux NRPE agents to run filesystem and process checks from the XI server.
Also NRDP/NRDS to run passive logfile monitoring on each of these agents and forward results (including text of matching messages) up to the XI server.

What I envision (and hope is doable) is to create a config file of services on the XI server that would be generic and would be picked up by any Linux agent that connects to the server.
This service config would be static--just changed on a quarterly release basis.
It would contain all the system (i.e., non-app) monitoring, from my previous post, that we would want across the linux enterprise.
Eventually, I would do the same for Solaris, AIX, HPUX and Windows.

Then I would want these agents to use a standard log monitoring configuration (home-grown tool) and use something like the nrdp_wrapper script I've found online to send alerts up to XI server.

So, my dilemma is two-fold:
1) I'd like comments about the reasonableness of this design
2) If reasonable, I'm having a heck of a time figuring out even the bare mechanics of how I'd implement it.

E.g., Service Groups or static files?
Do they connect to a generic Linux-Server Host Group that each new Linux box would then automatically pick up?

I'm playing around with CCM and I'm not having luck trying to make anything like this happen.
Is CCM really not the tool for this level of monitoring, but more for self-serve monitoring by app folks?
I see that all the cfg files say to not edit manually as they will be overwritten by NagiosQL.
How do I get around that?
This is just my first level of trying to figure this stuff out.
I know I'm asking a lot, but I think if I can get past this first learning curve part, I'd be rocking.

As I said, I've read docs, watched videos, looked at web tutorials and either I'm really obtuse (quite possible) or the learning curve here is bigger than I thought.
I'm struggling with just the basics of setting up the monitoring.

Thanks in advance for any insight you can provide.

Rick

Post by **tmcdonald** » Fri May 16, 2014 2:01 pm

I don't have anything close to a complete implementation plan for you, but I have some elements that might help:

Take a look at Admin -> NRDS Config Manager. NRDS lets you send passive results to an NRDP service running with Nagios. To quote the documentation:

The NRDS client configuration can be managed centrally by the NRDP server, or more easily through the NRDS Config Manager Component in Nagios XI. Once a configuration is made it can be shared by many clients and updates to the configuration on the NRDP server are automatically picked up by all clients using that configuration.

The NRDS client runs on a cron job at an interval specified by the administrator at install time. Each time the NRDS client runs, it will run all of the commands specified in the config file and send the results back to the Nagios server.

Additionally, it will check to see if there is a newer version of the configuration file it is using and if so, it will download the configuration file as well as all of the plugins it needs from the server and install them on the client.
http://assets.nagios.com/downloads/nagi ... h_NRDS.pdf
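
For a rough idea of what a passive result sent to NRDP looks like, here is a minimal Python sketch that only builds the request body and does not send anything. The field names (`token`, `cmd=submitcheck`, `XMLDATA`) and the XML layout are how I recall the NRDP submit format, so treat them as assumptions and check them against the NRDP version on your XI server. The function name, host, service, and token are made up for illustration.

```python
import urllib.parse
from xml.sax.saxutils import escape


def build_nrdp_payload(token, host, service, state, output):
    """Return the URL-encoded form body for one passive service result.

    Assumed (not taken from this thread): NRDP accepts a POST with the
    fields token, cmd=submitcheck and XMLDATA. Verify before relying on it.
    """
    xml = (
        "<?xml version='1.0'?>"
        "<checkresults>"
        "<checkresult type='service' checktype='1'>"  # checktype 1 = passive
        f"<hostname>{escape(host)}</hostname>"
        f"<servicename>{escape(service)}</servicename>"
        f"<state>{int(state)}</state>"  # 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN
        f"<output>{escape(output)}</output>"
        "</checkresult>"
        "</checkresults>"
    )
    return urllib.parse.urlencode(
        {"token": token, "cmd": "submitcheck", "XMLDATA": xml}
    )


# Hypothetical values: a log monitor reporting a matching line as CRITICAL.
body = build_nrdp_payload(
    "example-token", "web01", "Log /var/log/messages", 2, "kernel: I/O error <sda>"
)
print(body)
```

The matching log text goes in the output field, which is how the message would show up in XI alongside the alert.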
The CCM is good for manually crafting checks, but for most things the Wizards are going to be faster. You can always create something with a wizard and then edit it in the CCM later.
Editing the config files on the command line can be done, but as soon as you make a change in the CCM and apply it, your manual edits will be overwritten.
If you want to have files configured on the command line that are not overwritten in the CCM, you can place them in the /usr/local/nagios/etc/static/ folder.
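
As a rough sketch of the "generic package" idea, the Python below generates service definitions that attach to a host group rather than to individual hosts. In standard Nagios object configuration, a service defined with `hostgroup_name` applies to every host that is a member of that group, so a new Linux box would pick up the whole set once it joins the group. Everything named here is an assumption for illustration: the `linux-servers` host group (assumed to already exist with members), the `generic-service` template, the `check_nrpe` command taking the remote command name as its argument, and the NRPE command names themselves.

```python
def render_services(hostgroup, checks, template="generic-service"):
    """Render one Nagios service definition per (description, nrpe_command).

    Each service is bound to `hostgroup`, so every member host inherits it.
    Template and command names are placeholders; use the ones on your server.
    """
    blocks = []
    for description, nrpe_command in checks:
        blocks.append(
            "define service {\n"
            f"    use                  {template}\n"
            f"    hostgroup_name       {hostgroup}\n"
            f"    service_description  {description}\n"
            f"    check_command        check_nrpe!{nrpe_command}\n"
            "}\n"
        )
    return "\n".join(blocks)


# Hypothetical baseline: the system-level checks mentioned earlier in the thread.
LINUX_BASELINE = [
    ("Process sshd", "check_sshd"),
    ("Process crond", "check_crond"),
    ("Disk /var", "check_var"),
    ("Disk /usr", "check_usr"),
]

# The output could be saved as a .cfg file under the static folder.
print(render_services("linux-servers", LINUX_BASELINE))
```

Generating the file from a short list keeps the quarterly release down to editing one list and regenerating, though a hand-written cfg file would work just as well for a trial.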

I know this doesn't answer everything, but hopefully some of it is relevant.
