scottwilkerson wrote:Everything looks correct, but as this is an SNMP check it is utilizing UDP connections, and as this is stateless, packets can get dropped. This is likely what is happening.
With the config you posted though, it shouldn't be sending notifications if it is only down for 30 seconds; it should be trying 5 times at 1 minute intervals before sending a notification.
That's what's confusing me. The process isn't down at all. Is there another option to check processes that might not utilize UDP connections?
check_snmp_process_wizard.pl lag?
Re: check_snmp_process_wizard.pl lag?
-
slansing
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: check_snmp_process_wizard.pl lag?
Yes, you could call the check_procs plugin through NRPE, for example:
http://nagiosplugins.org/man/check_procs
http://linuxsysadminblog.com/2009/02/na ... s-running/
Re: check_snmp_process_wizard.pl lag?
I'm wondering what the difference in Nagios server load would be between these two? After making the change for one of the process checks, I've noticed that we are no longer getting false alarms for the process being critical.
Am I right in thinking that it would take some of the load off of the server in running this (and other) checks via NRPE as opposed to SNMP?
If I have a number of processes that are unique to our set-up that I am currently checking via SNMP, would I notice a decrease in load on the Nagios server if I moved those checks to NRPE?
Also, can you check more than one process with a single check using NRPE or do I need to have a single check for each of the different processes that I want to check?
Re: check_snmp_process_wizard.pl lag?
jbennett wrote:I'm wondering what the difference in Nagios server load would be between these two? After making the change for one of the process checks, I've noticed that we are no longer getting false alarms for the process being critical.
That is most likely because NRPE uses TCP instead of UDP and, as Scott stated, you may be experiencing dropped packets with UDP.
jbennett wrote:Am I right in thinking that it would take some of the load off of the server in running this (and other) checks via NRPE as opposed to SNMP?
NRPE may generate less load than SNMP, but not much less.
jbennett wrote:If I have a number of processes that are unique to our set-up that I am checking via SNMP currently, would I notice a decrease in load on the nagios server if I moved those checks to NRPE?
Negligible, though on a large enough scale it may be noticeable.
jbennett wrote:Also, can you check more than one process with a single check using NRPE or do I need to have a single check for each of the different processes that I want to check?
You could do it either way, although if you set them all up on one check, the entire check will fail when any of the parts fail. If you want each service check to be specific to each process checked so you get granular warnings/criticals, you will need separate checks. If you don't mind monolithic alerts, you could wrap up the whole lot of the process checks into a single service check script.
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Re: check_snmp_process_wizard.pl lag?
abrist wrote:You could do it either way, although if you set them all up on one check, the entire check will fail when any of the parts fail. If you want each service check to be specific to each process checked so you get granular warnings/criticals, you will need separate checks. If you don't mind monolithic alerts, you could wrap up the whole lot of the process checks into a single service check script.
I wouldn't mind that since any of these services failing would warrant attention.
When it did fail, would it spit back the process that failed or just that something in the string of processes to check failed?
I suppose I would just have the command as follows on the box I'm checking?
Code:
command[check_procs]=/usr/local/nagios/libexec/check_procs -c 1:1 -C proc1 proc2 proc3
Re: check_snmp_process_wizard.pl lag?
I do not think check_procs supports more than 1 specified process for the "-C" (command) switch. You would have to script a custom solution or check out the exchange: http://exchange.nagios.org/index.php?op ... word=procs
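A custom solution along those lines could be sketched roughly as below. This is only an illustration, not a tested plugin: the function name check_all_procs is made up, the check_procs path is assumed from the commands posted in this thread, and "-c 1:" is used so that a process counts as failed when fewer than one instance is running. Unlike a single monolithic status, it reports which process names failed.

```shell
#!/bin/sh
# Hypothetical wrapper: run check_procs once per process name and fold the
# results into one Nagios-style status line and exit code.
# CHECK_PROCS is an assumed path; adjust it to wherever check_procs lives.
CHECK_PROCS="${CHECK_PROCS:-/usr/local/nagios/libexec/check_procs}"

check_all_procs() {
    failed=""
    for name in "$@"; do
        # -C matches the command name; -c 1: is critical below one process.
        # Any non-zero exit (warning, critical, unknown, or a missing
        # binary) is treated as a failure for that name.
        if ! "$CHECK_PROCS" -c 1: -C "$name" >/dev/null 2>&1; then
            failed="${failed:+$failed, }$name"
        fi
    done
    if [ -n "$failed" ]; then
        echo "PROCS CRITICAL: not running: $failed"
        return 2
    fi
    echo "PROCS OK: all $# processes running"
    return 0
}

# When called with arguments (for example from an NRPE command definition),
# check every name given and exit with the combined status.
if [ "$#" -gt 0 ]; then
    check_all_procs "$@"
    exit $?
fi
```

Saved as a script on the monitored box, it would be called with the process names as separate arguments, for example "proc1 proc2 proc3", and the exit code 2 maps to CRITICAL in Nagios.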
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Re: check_snmp_process_wizard.pl lag?
I have found the following: http://exchange.nagios.org/directory/Un ... cs/details
I'm running into an issue though and I'm not sure where to go for help. It says that the owner is nagiosexchange.
Basically, I'm getting the following:
Code:
./check_multi_procs.pl -b /usr/local/nagios/libexec/check_procs -u root -f proc1; proc2
Multiple process check failed on :
PROCS CRITICAL: 2 processes with UID = 0 (root), args 'proc1'
If I remove the -u root switch, the check doesn't work at all (throws back usage directions).
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: check_snmp_process_wizard.pl lag?
I'm not totally familiar with this plugin but I am sure you will need to either escape the ; between the procs
Code:
./check_multi_procs.pl -b /usr/local/nagios/libexec/check_procs -u root -f proc1\; proc2
or quote them
Code:
./check_multi_procs.pl -b /usr/local/nagios/libexec/check_procs -u root -f 'proc1;proc2'
This particular process does appear to require you to pass in the owner of your processes you are checking
Re: check_snmp_process_wizard.pl lag?
I've tried your suggestions in a few different ways without any luck.
The readme has the following:
Code:
# Config Example :
# check_command check_nrpe!check_multi_procs!user!"proc1:proc2:proc3"
While the help information shows the following:
Code:
# ./check_multi_procs.pl -b /usr/local/nagios/libexec/check_procs -u root -f lane; ves_ocr
Multiple process check failed on :
PROCS CRITICAL: 2 processes with UID = 0 (root), args 'lane'
# ./check_multi_procs.pl -b /usr/local/nagios/libexec/check_procs -f proc1;proc2
Usage : ./check_multi_procs.pl -f filer -u user [-b check_proc_bin] [-s min_proc] [-x max_proc] [-h]
-u : Set user owner of process
-f : Give process to check (must be string and could be separated by ';')
-b : Specify the check_proc Nagios plugin binary
-s : Set min process to be available (default 1)
-x : Set max process to be available (default 1)
-h : Print this help message
It seems that this would help alleviate some load on the server, as I have about 400 boxes and need to check 5 processes on each (2000 checks total). If I am able to implement this check instead, it would lower that to only 400 checks.
If any one of these processes goes awry it is considered a critical, so a blanket notification is just fine here.
Am I correct in thinking that going this route instead of individual service checks for each process would help lower the load?
We are looking to add a number of other service checks in the near future on these boxes. We are looking at a potential for about 25-30 more service checks per box. If I can work to streamline the current checks, it would help us with load in the future.
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: check_snmp_process_wizard.pl lag?
I looked at the code and it is supposed to be split by : NOT ; (really bad help file...)
Let's try
Code:
./check_multi_procs.pl -b /usr/local/nagios/libexec/check_procs -u root -f lane:ves_ocr
jbennett wrote:Am I correct in thinking that going this route instead of individual service checks for each process would help lower the load?
Certainly would on the XI server. Another possibility would be to use NRDS and set up the checks individually (Available under Admin -> NRDS Config Manager)
Here's a video
http://library.nagios.com/library/produ ... s-tutorial
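For anyone hitting the same thing: the earlier "args 'proc1'" result is consistent with how the shell treats an unquoted semicolon, which ends the command before the plugin ever sees the second name. The small demo below shows the difference. Note that show_args is a made-up stand-in for the plugin that only prints the arguments it receives; it does not call check_multi_procs.pl.

```shell
# show_args is a hypothetical stand-in for the plugin: it prints each
# argument it actually receives, wrapped in brackets.
show_args() {
    for arg in "$@"; do
        printf '[%s]' "$arg"
    done
    printf '\n'
}

# Unquoted: the shell ends the command at ';', so only "proc1" is passed.
# Whatever follows ';' runs as a second, separate command (echo is used
# here so the demo does not try to execute a program named proc2).
show_args -f proc1; echo "ran separately: proc2"

# Quoted: the whole string, semicolon included, reaches the program.
show_args -f 'proc1;proc2'

# A colon is not a command separator, so it needs no quoting.
show_args -f proc1:proc2
```

The first call prints [-f][proc1], the quoted call prints [-f][proc1;proc2], and the colon form prints [-f][proc1:proc2].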