Monitoring Solaris 10 and OpenSolaris
So I want to monitor some Solaris boxes (x86 and SPARC). Where can I get an NRPE client officially supported by Nagios?
Re: Monitoring Solaris 10 and OpenSolaris
At this point in time I don't believe we have an officially supported agent for Solaris systems. There are some resources on exchange.nagios.org that are used for Solaris systems, but my guess is that these are used as some form of passive check. I think one of our techs did some testing with Solaris, but he's not in today. I'll see if he can point you to some more specific resources for it.
Re: Monitoring Solaris 10 and OpenSolaris
Ok, thanks man.
-
griffithusg
- Posts: 64
- Joined: Sun Nov 07, 2010 7:16 pm
Re: Monitoring Solaris 10 and OpenSolaris
Hi,
I would be interested in working with someone to create/collate a default set of Solaris NRPE check commands that could be made into a wizard for use inside Nagios XI.
But an officially supported NRPE would be awesome.
Keen to know what's happening with this.
Rob
-
tonyyarusso
- Posts: 1128
- Joined: Wed Mar 03, 2010 12:38 pm
- Location: St. Paul, MN, USA
Re: Monitoring Solaris 10 and OpenSolaris
I don't know if we'll have an officially supported NRPE, but I am currently exploring ways to do at least a basic set of checks on a wide range of POSIX systems without NRPE at all, so that should help.
-
griffithusg
- Posts: 64
- Joined: Sun Nov 07, 2010 7:16 pm
Re: Monitoring Solaris 10 and OpenSolaris
That sounds pretty cool. Any idea of the timeframe on that, Tony?
Rob
-
tonyyarusso
- Posts: 1128
- Joined: Wed Mar 03, 2010 12:38 pm
- Location: St. Paul, MN, USA
Re: Monitoring Solaris 10 and OpenSolaris
That probably depends on how complicated I make it. 
It would be useful to know what is of the most interest to people who actually use this, so perhaps you can give me some feedback and guidance. The driving problem is that NRPE won't compile on all systems (or at least we have no idea how to write instructions for doing so since we aren't familiar with them), and a lot of people aren't comfortable installing very much extra on production systems anyway. So, my goal is to be able to do things with the fewest dependencies possible, and in a way that will work on Solaris, OpenBSD, FreeBSD, AIX, HP-UX, and Linux.
As for the checks themselves, here is my current "to do" list - feel free to prioritize and/or add to it:
- CPU usage percentage
- Memory usage
- Swap usage
- Disk space
- Disk transfer performance
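As a rough illustration of what one item on this list could look like, here is a minimal sketch of a disk space check that follows the usual Nagios plugin convention (exit code 0/1/2/3 for OK/WARNING/CRITICAL/UNKNOWN, one status line, optional performance data after a `|`). Python is used only for illustration and is itself an extra dependency on the client; the function names, the default thresholds, and the `used_pct` label are assumptions, not anything from an actual implementation.

```python
import shutil
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(used_pct, warn_pct, crit_pct):
    """Map a usage percentage to a plugin exit code."""
    if used_pct >= crit_pct:
        return CRITICAL
    if used_pct >= warn_pct:
        return WARNING
    return OK


def check_disk(path="/", warn_pct=80.0, crit_pct=90.0):
    """Return (exit_code, status_line) for the filesystem holding `path`.

    The thresholds are hypothetical defaults chosen for this sketch.
    """
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    if usage.total == 0:
        # Some pseudo-filesystems report zero size; avoid dividing by zero.
        return UNKNOWN, f"DISK UNKNOWN - {path} reports zero total size"
    used_pct = 100.0 * usage.used / usage.total
    code = classify(used_pct, warn_pct, crit_pct)
    line = (f"DISK {STATUS_NAMES[code]} - {path} is {used_pct:.1f}% full"
            f" | used_pct={used_pct:.1f}%;{warn_pct};{crit_pct}")
    return code, line


if __name__ == "__main__":
    exit_code, status_line = check_disk()
    print(status_line)
    sys.exit(exit_code)
```

The other items on the list (CPU, memory, swap, disk transfer) are much more platform-specific, which is presumably where most of the cross-platform work would go.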
Once I have the check information, there of course has to be a way of communicating that to the Nagios server. So far I see the following ways of doing this, and may end up supporting more than one - please indicate what would be most appropriate for the environments you've dealt with. The primary differences are what they require to be installed on the clients, and whether the timing of the checks can be adjusted within Nagios or if it's set by cron.
- Have Nagios use SSH to contact the client and request the information (sshd running on the client) (client requires ssh server)
- Have the client run the checks from cron and submit them over SSH to Nagios (sshd running on the Nagios server) (client requires ssh client)
- Have the client run the checks from cron and submit them over HTTP using wget or curl (client requires wget or curl)
- Have the client run the checks from cron, build a web page with the results, and dump that in a publicly accessible directory which is checked periodically by Nagios (client requires http server)
- Have Nagios make a dummy connection of some sort to the client, such as a ping request, and have a cron-run log watcher take that as a trigger to run the checks, returning the results by any of the previous three methods (client requires some sort of connection to be allowed and logged, plus one of the above)
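For the cron-driven options, whatever the client submits eventually has to reach Nagios as a passive check result. A minimal sketch, assuming the result is delivered as a `PROCESS_SERVICE_CHECK_RESULT` external command: the function below only formats the command line. The helper name and the host and service names in the usage example are hypothetical, and how the line travels to the server (SSH, HTTP, or something else) is deliberately left out.

```python
import time


def passive_result_line(host, service, code, output, now=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line.

    Layout: [timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;code;output
    """
    timestamp = int(time.time() if now is None else now)
    # Each command occupies a single line, so flatten multi-line output.
    clean = " ".join(str(output).splitlines())
    return f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{code};{clean}"


if __name__ == "__main__":
    # Hypothetical host and service names, for illustration only.
    print(passive_result_line("sol10-db1", "Disk /", 1, "DISK WARNING - / is 85.0% full"))
```

With this approach the check interval is set by cron on the client rather than by Nagios, which is the trade-off mentioned above.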
Re: Monitoring Solaris 10 and OpenSolaris
That's really nice, man! 
-
griffithusg
- Posts: 64
- Joined: Sun Nov 07, 2010 7:16 pm
Re: Monitoring Solaris 10 and OpenSolaris
Hi Guys,
The only thing I would say is that we have run into issues with large numbers of SSH connections to Solaris hosts, as each one can take 5 seconds or more depending on the system.
-
tonyyarusso
- Posts: 1128
- Joined: Wed Mar 03, 2010 12:38 pm
- Location: St. Paul, MN, USA
Re: Monitoring Solaris 10 and OpenSolaris
Yeah, I've had that with CentOS too. The thinking was that if it uses SSH, it would all be done through one SSH connection, passing a list of things to check, having all of those run, and a full report of all the results passed back at once.
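The batching idea could be sketched roughly as follows: the client side runs every requested check and returns a single report, and the server side splits that report back into individual results. This is only an illustration under stated assumptions; the tab-separated report format, the function names, and the example checks are all hypothetical, and the SSH transport itself is not shown.

```python
def run_batch(checks):
    """Run every check and return one report, one tab-separated line per check.

    `checks` maps a check name to a callable returning (exit_code, output).
    Check names are assumed not to contain tabs or newlines.
    """
    lines = []
    for name, check in checks.items():
        try:
            code, output = check()
        except Exception as exc:
            # A single failing check must not abort the whole batch.
            code, output = 3, f"UNKNOWN - {exc}"
        # Keep only the first output line and strip tabs so the format stays parseable.
        first_line = (str(output).splitlines() or [""])[0].replace("\t", " ")
        lines.append(f"{name}\t{code}\t{first_line}")
    return "\n".join(lines)


def parse_report(report):
    """Server side: turn the report back into {name: (exit_code, output)}."""
    results = {}
    for line in report.splitlines():
        if not line.strip():
            continue
        name, code, output = line.split("\t", 2)
        results[name] = (int(code), output)
    return results


if __name__ == "__main__":
    # Hypothetical checks standing in for real CPU and disk probes.
    example = {
        "cpu": lambda: (0, "CPU OK - 12% used"),
        "disk": lambda: (2, "DISK CRITICAL - / is 95% full"),
    }
    print(parse_report(run_batch(example)))
```

Paying the connection setup cost once per host instead of once per check is the point of the design; the report format itself could be anything both ends agree on.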