Issue monitoring Oracle PMONs using 'check_procs'

rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Post by rferebee »

Team,

I've been troubleshooting an issue with monitoring Oracle PMONs in XI. We are trying to monitor 4 different active PMONs, but the names are very similar and XI appears to be adding a wildcard to the end of the PMON name when it passes the argument.

For example, we have a PMON named ora_pmon_npasdevcdb and we use the command 'check_procs' in XI to monitor that it's active:

Code: Select all

$USER1$/check_nrpe -2 -H $HOSTADDRESS$ -t 30 -c check_active_procs -a "$ARG1$"
For some reason 'check_procs' uses 'check_nrpe' to invoke 'check_active_procs' in order to pass the same variables that 'check_procs' would use:

/usr/local/nagios/etc/nrpe.cfg

Code: Select all

command[check_active_procs]=/usr/local/nagios/libexec/check_procs -c $ARG1$ -a $ARG2$
The issue we're seeing is that one of the DBAs has created a new PMON named ora_pmon_npasdevcdb19, and both PMONs need to be running on the host. For some reason, the service check for ora_pmon_npasdevcdb sees two PMONs running with that name, even though one of them has 19 appended. It's almost as if XI is adding a wildcard to the end of the PMON name.

Code: Select all

[nagios@nagiosxiserver ~]$ /usr/local/nagios/libexec/check_nrpe -2 -H xx.xx.xx.xx -t 30 -c check_active_procs -a ""1:1 ora_pmon_npasdevcdb""
PROCS CRITICAL: 2 processes with args 'ora_pmon_npasdevcdb'
I need to know how to add a stop or a break to the end of the PMON name, so XI stops seeing ora_pmon_npasdevcdb19 as part of this check. It has its own PMON check, which works fine.

The service check will not allow me to use any of the switches in this document: https://nagios-plugins.org/doc/man/check_procs.html

It fails and says such and such variable needs to be an integer... which it is.

Thank you.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Issue monitoring Oracle PMONs using 'check_procs'

Post by tgriep »

The check_procs plugin has an exclude option.
-X, --exclude-process
Exclude processes which match this comma separated list
Try adding the following to the command and see whether it excludes the ora_pmon_npasdevcdb19 process from the check.

Code: Select all

-X ora_pmon_npasdevcdb19
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Re: Issue monitoring Oracle PMONs using 'check_procs'

Post by rferebee »

I tried that yesterday after updating NRPE on the remote host and creating a custom command in XI, but it's still "seeing" both processes:

Code: Select all

[nagios@nagiosxiserver ~]$ /usr/local/nagios/libexec/check_nrpe -2 -H xx.xx.xx.xx -t 30 -c check_procs -a 1 1: 'ora_pmon_npasdevcdb -X ora_pmon_npasdevcdb19'
PROCS WARNING: 2 processes with args 'ora_pmon_npasdevcdb', exclude progs 'ora_pmon_npasdevcdb19' | procs=2;1;1:;0;
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Issue monitoring Oracle PMONs using 'check_procs'

Post by tgriep »

One more thing to try is the ereg option, giving it the full process command line as it appears in the ps output.
--ereg-argument-array=STRING
Only scan for processes with args that contain the regex STRING.
For example, if your process is running out of /usr/sbin, you would specify it like this:

Code: Select all

--ereg-argument-array=^/usr/sbin/ora_pmon_npasdevcdb$
The regex has to match the full command line as it appears in the ps output to work.
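
To illustrate why the plain argument match counts both PMONs while an anchored regex counts only one, here is a small sketch in which grep stands in for the plugin's matching. The two-line process list is made up for the example, and treating -a as a plain substring match is an assumption that fits the "2 processes" output earlier in this thread.

```shell
#!/bin/sh
# Hypothetical args column for the two PMON processes, as ps might list them.
procs='ora_pmon_npasdevcdb
ora_pmon_npasdevcdb19'

# Substring match (stand-in for check_procs -a): both lines contain the string.
substring_count=$(printf '%s\n' "$procs" | grep -c -F 'ora_pmon_npasdevcdb')

# Anchored extended regex (stand-in for --ereg-argument-array='^...$'):
# only the line that is exactly this name matches.
anchored_count=$(printf '%s\n' "$procs" | grep -c -E '^ora_pmon_npasdevcdb$')

echo "substring: $substring_count"
echo "anchored: $anchored_count"
```

The `^` and `$` anchors are what provide the "stop" at the end of the name: without them, the shorter name is also found inside the longer one.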

For example, to check for httpd on my system.

Code: Select all

ps -ef |grep http
root        1330       1  0 08:38 ?        00:00:00 /usr/sbin/httpd -DFOREGROUND
apache      1374    1330  0 08:38 ?        00:00:00 /usr/sbin/httpd -DFOREGROUND
apache      1379    1330  0 08:38 ?        00:00:00 /usr/sbin/httpd -DFOREGROUND
apache      1393    1330  0 08:38 ?        00:00:00 /usr/sbin/httpd -DFOREGROUND
apache      1396    1330  0 08:38 ?        00:00:00 /usr/sbin/httpd -DFOREGROUND
apache      2301    1330  0 08:40 ?        00:00:00 /usr/sbin/httpd -DFOREGROUND
I used this command:

Code: Select all

/usr/local/nagios/libexec/check_procs --ereg-argument-array='^/usr/sbin/httpd -DFOREGROUND$'
PROCS OK: 6 processes with regex args '^/usr/sbin/httpd -DFOREGROUND$' | procs=6;;;0;
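
Applied to the PMON from the original question, the same approach might look like the sketch below. It assumes the PMON's args column in the ps output is exactly the process name with no path or extra arguments, which should be confirmed with `ps -ef` on the remote host first. The `pmon_regex` helper is hypothetical and only builds the anchored pattern; the plugin path is the one used earlier in this thread.

```shell
#!/bin/sh
# Plugin path as used earlier in this thread.
CHECK_PROCS=/usr/local/nagios/libexec/check_procs

# Hypothetical helper: wrap a PMON name in ^...$ so only an exact match counts.
pmon_regex() {
    printf '^%s$' "$1"
}

regex=$(pmon_regex ora_pmon_npasdevcdb)
echo "regex: $regex"

# Run the check only where the plugin is actually installed.
# -c 1:1 is critical unless exactly one matching process is found.
if [ -x "$CHECK_PROCS" ]; then
    "$CHECK_PROCS" -c 1:1 --ereg-argument-array="$regex"
fi
```

If the ps output shows a path or trailing arguments for the process, the pattern between the anchors needs to include them, as in the httpd example above.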