Re: Two unrelated questions: NRPE port and a plugin issue
Posted: Mon Dec 02, 2013 3:21 pm
by abrist
This plugin (a bash script) uses the "service" command to pull status, which you most likely will not find on a Solaris box. In Solaris 10 land, you may need to use:
In previous versions of Solaris (8 & 9), you will need to check with the actual init scripts:
Re: Two unrelated questions: NRPE port and a plugin issue
Posted: Mon Dec 02, 2013 3:30 pm
by snapon_admin
Yeah, sorry I'm working on too many things at once here...
We got the service check to work for services, using the check_services plugin. We have a service called ctmagent7 that we're testing this on. There are also 2 processes associated with that service, p_ctmat and p_ctmag. We need a way to monitor those. We've been trying to get the check_procs plugin to work, but it only seems to partially do what we need it to.
If we run it on a Global server, it reports like this:
Code:
PROCS OK: 4 processes with args 'p_ctmag'
The reason it shows 4 is that this process is running in the global zone and in each of the 3 local zones on that server. We need a way to determine whether the process is running in the global zone specifically. The script as-is would work fine inside the zones, since only one process runs in each of them, but as it stands it would only send an alert for the global if the process stopped running in the global AND in all the local zones.
For reference, here's a 'ps -ef | grep ctm' from the global:
Code:
kenapps11g$ ps -ef | grep ctm
root 3854 1 0 Mar 11 ? 9:38 ./ctmagent7/ctm/exe/p_ctmat
root 3780 1 0 Mar 11 ? 11:47 ./ctmagent7/ctm/exe/p_ctmag
root 4231 1 0 Mar 11 ? 13:30 ./ctmagent7/ctm/exe/p_ctmag
root 4278 1 0 Mar 11 ? 16:58 ./ctmagent7/ctm/exe/p_ctmat
root 5000 1 0 Mar 11 ? 11:38 ./ctmagent7/ctm/exe/p_ctmag
root 5040 1 0 Mar 11 ? 9:21 ./ctmagent7/ctm/exe/p_ctmat
pr6449 10364 18655 0 14:30:16 pts/7 0:00 grep ctm
root 20489 1 0 Jun 18 ? 6:00 ./ctmagent7/ctm/exe/p_ctmat
root 20454 1 0 Jun 18 ? 7:34 ./ctmagent7/ctm/exe/p_ctmag
and from one of the local zones:
Code:
kendblab01$ ps -ef | grep ctm
pr6449 10227 7390 0 14:29:48 pts/6 0:00 grep ctm
root 4231 2250 0 Mar 11 ? 13:30 ./ctmagent7/ctm/exe/p_ctmag
root 4278 2250 0 Mar 11 ? 16:58 ./ctmagent7/ctm/exe/p_ctmat
Re: Two unrelated questions: NRPE port and a plugin issue
Posted: Tue Dec 03, 2013 10:13 am
by abrist
Maybe someone else has an idea at this point. Other than the PID, there is nothing unique in those processes, which makes it difficult to tell the global process apart from those in the zones. Are there any other service-related utilities that can be used to identify which zone a service is running under?
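One possible angle, offered as a sketch rather than a tested solution: on Solaris 10, ps can print the zone each process belongs to (ps -e -o zone,args), and pgrep accepts a zone filter (pgrep -z). The bash functions below count matching processes in a single zone from that ps output, which is read on stdin. The function names count_procs_in_zone and check_zone_procs are hypothetical, and the status text only imitates the check_procs output shown earlier in the thread.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of any Nagios plugin package).

# count_procs_in_zone ZONE PATTERN
# Reads "zone args" lines on stdin (the layout printed by
# `ps -e -o zone,args` on Solaris 10) and prints how many lines
# belong to ZONE and have PATTERN somewhere in their arguments.
count_procs_in_zone() {
    local want_zone="$1"
    local pattern="$2"
    local count=0
    local zone args
    while read -r zone args; do
        # Skip the header line and processes from other zones.
        [ "$zone" = "$want_zone" ] || continue
        case "$args" in
            *"$pattern"*) count=$((count + 1)) ;;
        esac
    done
    echo "$count"
}

# check_zone_procs ZONE PATTERN MIN
# Nagios-style wrapper: exit status 0 (OK) when at least MIN matching
# processes run in ZONE, otherwise 2 (CRITICAL).
check_zone_procs() {
    local n
    n=$(count_procs_in_zone "$1" "$2")
    if [ "$n" -ge "$3" ]; then
        echo "PROCS OK: $n processes with args '$2' in zone $1"
        return 0
    fi
    echo "PROCS CRITICAL: $n processes with args '$2' in zone $1"
    return 2
}
```

On the global zone this could be run as: ps -e -o zone,args | check_zone_procs global p_ctmag 1, which would ignore the copies running in the local zones. pgrep -z global -f p_ctmag is a shorter route to the same list of PIDs.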
Re: Two unrelated questions: NRPE port and a plugin issue
Posted: Tue Dec 03, 2013 11:04 am
by snapon_admin
I think we may have figured this out on our own. We're still doing some testing to make sure it works in all cases, but one of our Unix admins put together a script that appears to do what we need it to.
Since that issue seems to be cleared up, I have one more question about the listening port issue. If I'm understanding lmiltchev correctly, in order to change this I would need to change the /etc/xinetd.d/nrpe file on the Nagios server to a different port than the default 5666, and then also change the port in the nrpe.cfg file on all of our Unix hosts to whatever port we decide to use, correct? If that is correct, my other question is: would I be able to just add a port to the Nagios server's nrpe file, so that the change doesn't generate alerts on every server that doesn't have the modified nrpe.cfg file yet? Something like:
Code:
# default: on
# description: NRPE (Nagios Remote Plugin Executor)
service nrpe
{
        flags           = REUSE
        socket_type     = stream
        port            = 5666,5667
        wait            = no
        user            = nagios
        group           = nagios
        server          = /usr/local/nagios/bin/nrpe
        server_args     = -c /usr/local/nagios/etc/nrpe.cfg --inetd
        log_on_failure  += USERID
        disable         = no
        only_from       = 127.0.0.1
}
Here I have both the old port (5666) and the new port (5667) being checked. Or would that not work? If not, what would you recommend as the best way to change the listening port without generating hundreds of alerts for all of our Unix hosts?
Re: Two unrelated questions: NRPE port and a plugin issue
Posted: Tue Dec 03, 2013 11:23 am
by sreinhardt
I think that comma-separated port list should work, although I have not personally tested it. I would, however, strongly advise against using 5667, which is already a common Nagios port (NSCA), and suggest something different instead. Unless of course you have good reasoning for using it, and knowing you, you very well may.
edit: You could also use iptables to NAT it to another port if you wished to go that route.
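As a rough illustration of the iptables route, and assuming a Linux host (iptables is not available on Solaris), a REDIRECT rule in the nat table can forward connections arriving on a new port to the port NRPE already listens on, so both ports answer during a migration. The helper name nrpe_redirect_rule is hypothetical, and 5668 below is only an example of an unused port. The function prints the rule instead of applying it, so it can be reviewed before being run as root.

```shell
#!/usr/bin/env bash
# nrpe_redirect_rule NEW_PORT [OLD_PORT]
# Hypothetical helper: prints (does not execute) an iptables rule that
# redirects incoming TCP traffic on NEW_PORT to OLD_PORT on the same host.
# OLD_PORT defaults to 5666, the NRPE default mentioned in this thread.
nrpe_redirect_rule() {
    local new_port="$1"
    local old_port="${2:-5666}"
    echo "iptables -t nat -A PREROUTING -p tcp --dport ${new_port} -j REDIRECT --to-ports ${old_port}"
}
```

For example, nrpe_redirect_rule 5668 prints a rule that sends traffic for 5668 to 5666. The rule only takes effect once it is actually executed with root privileges on the host running the NRPE daemon.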
Re: Two unrelated questions: NRPE port and a plugin issue
Posted: Tue Dec 03, 2013 12:38 pm
by snapon_admin
Nope, no good reason really. It was just the next one in line that nothing here is likely using. Doing a quick search, it doesn't look like 5668 is doing anything for Nagios. Maybe we'll try that one instead. Thanks for the heads up on nsca; I actually knew that but had forgotten.
Re: Two unrelated questions: NRPE port and a plugin issue
Posted: Tue Dec 03, 2013 12:42 pm
by sreinhardt
You are correct; to my knowledge, 5668 is not currently used. Give it a shot!
Re: Two unrelated questions: NRPE port and a plugin issue
Posted: Tue Dec 03, 2013 12:57 pm
by abrist
You do not need to change the nrpe port on the nagios server unless you are checking the nagios server through nrpe from a different host. nrpe is a server/client application. The server (the nrpe daemon) runs on the remote host, while check_nrpe is the client that checks it (usually run from XI). You do still need to change the port on the remote systems, though, and you will need to alter the check_nrpe command to include a port switch (-p <port>).
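To make the port switch concrete, here is a small sketch that assembles the check_nrpe command line for a given host and port. The plugin path /usr/local/nagios/libexec/check_nrpe is an assumption based on a default source install, and the helper name nrpe_cmd is hypothetical. The -H, -p and -c switches are the standard check_nrpe options.

```shell
#!/usr/bin/env bash
# Assumption: check_nrpe is in the default plugin directory. Override
# CHECK_NRPE in the environment if it is installed elsewhere.
CHECK_NRPE="${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}"

# nrpe_cmd HOST PORT [COMMAND]
# Hypothetical helper: prints the check_nrpe command line for HOST on PORT.
# Without COMMAND, check_nrpe only asks the daemon for its version, which
# is a quick way to confirm that the new port answers.
nrpe_cmd() {
    local host="$1"
    local port="$2"
    local command="${3:-}"
    if [ -n "$command" ]; then
        echo "$CHECK_NRPE -H $host -p $port -c $command"
    else
        echo "$CHECK_NRPE -H $host -p $port"
    fi
}
```

Running the line printed by nrpe_cmd kenapps11g 5668 from the Nagios server would show whether that host answers on the new port before any check definitions are changed. In a Nagios command definition, the same switch would sit in the command_line, for example $USER1$/check_nrpe -H $HOSTADDRESS$ -p 5668 -c $ARG1$.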