negate no longer works after I update Centos 6
Re: negate no longer works after I update Centos 6
No, this is not resolved. I'm not sure how it could be, since it worked prior to my updating CentOS. So maybe it was broken before and now works, in your eyes, but it doesn't give me the indication in XI that I had before, so I need to understand how to resolve that now.
Re: negate no longer works after I update Centos 6
"it doesn't give me the indication in XI that I had before, so I need to understand how to resolve that now."
What indication did you have in XI previously that has been removed?
To clear a few things up, when you run the following:
Code: Select all
./check_nrpe -H xxxx.xxx -c check_services -a "e2fsck"
e2fsck: 1
Does the 'e2fsck' process actually exist on the remote host you're monitoring? I think this is where the confusion is stemming from. If that process does not exist, yet check_nrpe reports that it does, you should verify that it doesn't exist by running the following on your *remote* server:
Code: Select all
ps -ef | grep e2fsck
Re: negate no longer works after I update Centos 6
"...it doesn't give me the indication in XI that I had before..."
Can you show us what you are seeing in the web UI at the moment and what you think you need to see instead? Are you getting the "expected" output if you don't use the "negate" plugin?
Re: negate no longer works after I update Centos 6
OK, I had this setup so that IF I WAS running fsck then I would get a red alert. If I WAS NOT running fsck (most of the time) then I would get a green alert. And this command has worked perfectly, until I updated to CentOS 6.7 a few days ago. Now, as you can see, I get a constant red alert even though fsck is not running. I did not change my nagios server at all. It was just a result of the update. And I was really just alerting your team to that fact, but I am curious as to what changed, and would like to find a way back to a valid check.
Re: negate no longer works after I update Centos 6
Your output appears to indicate that fsck is running on the remote server in question (e2fsck: 1). Are you certain that it's not?
On the remote host:
Code: Select all
ps -ef | grep fsck
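One thing to watch with this kind of check: a plain `grep fsck` can list the grep process itself, because fsck is also on its own command line. Below is a minimal sketch of the usual bracket workaround. `count_fsck` is a made-up helper name, and the sample `ps` lines are invented for illustration, not taken from the host in this thread.

```shell
# count_fsck is a hypothetical helper: it reads `ps -ef`-style text on stdin
# and prints how many lines mention fsck.
count_fsck() {
    # The regex '[f]sck' matches the text "fsck", but a grep command line
    # containing the literal string "[f]sck" does not match it.
    # grep -c exits 1 when the count is 0, so normalise the exit status.
    grep -c '[f]sck' || true
}

# Invented sample input (assumption, for illustration only).
sample_ps='root      4321     1  0 10:02 ?        00:00:07 /sbin/e2fsck -p /dev/sdb1
root      4410  4390  0 10:05 pts/0    00:00:00 grep -c [f]sck'

printf '%s\n' "$sample_ps" | count_fsck    # prints 1
printf '%s\n' 'root 1 0 0 09:00 ? 00:00:01 /sbin/init' | count_fsck    # prints 0
```

On a live host the equivalent would be `ps -ef | count_fsck`.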
jdalrymple (Skynet Drone, Posts: 2620, Joined: Wed Feb 11, 2015 1:56 pm)
Re: negate no longer works after I update Centos 6
Let's start from the beginning and work it through. If I understand your use case, negate would become unnecessary, even though it is doing just fine in all of your examples.
From your first post:
tecnalb wrote:
Code: Select all
[root@nagios libexec]# ./check_nrpe -H xxxx.xxx -t 30 -c check_services -a "e2fsck"
e2fsck: 1
[root@nagios libexec]# ./negate ./check_nrpe -H xxxx.xxx-a "e2fsck"
e2fsck: 1
It works best if you copy and paste, although assuming you just fat-fingered and left out some bits, the output is still exactly as you'd expect. By default, negate will not munge the output of the plugin, only the return code.
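To make the "only the return code" point concrete, here is a rough sketch of negate's default behaviour as a shell function. It is not the real plugin: `negate_sketch` and `fake_plugin` are hypothetical names, and the real negate also handles timeouts and lets you remap other states, which this ignores.

```shell
# negate_sketch: hypothetical stand-in for negate's default behaviour.
# It runs the wrapped command, leaves its output untouched, and swaps
# OK (0) with CRITICAL (2). Other return codes pass through unchanged.
negate_sketch() {
    local plugin_rc=0
    "$@" || plugin_rc=$?    # the plugin's stdout goes straight to the caller
    case "$plugin_rc" in
        0) return 2 ;;              # OK -> CRITICAL
        2) return 0 ;;              # CRITICAL -> OK
        *) return "$plugin_rc" ;;   # WARNING (1), UNKNOWN (3) unchanged
    esac
}

# fake_plugin is an illustrative stub, not a real Nagios plugin.
fake_plugin() {
    echo "e2fsck: 1"
    return 0
}

rc=0
negate_sketch fake_plugin || rc=$?
echo "exit code: $rc"
```

The text line is identical with or without the wrapper; only the exit code differs, which matches the transcripts in this thread.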
tecnalb wrote:
No, I updated the target host only. However it seems my negate and your negate are vastly different.
You're implying that negate on the Nagios server was changed by updating a remote NRPE host?
tecnalb wrote:
[root@nagios libexec]# ./negate ./check_nrpe -H xxx.xxx -a "e2fsck"
NRPE v2.14
[root@nagios libexec]# echo $?
2
Without negate
[root@nagios libexec]# ./check_nrpe -H xxx.xxx -a "e2fsck"
NRPE v2.14
[root@nagios libexec]# echo $?
0
This negate is doing EXACTLY what it's supposed to, and all it's supposed to. The commands here are screwy, but negate is indeed working.
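For anyone reading along, the numbers printed by `echo $?` in these transcripts are the standard plugin return codes. A small sketch of the mapping, where `state_name` is a hypothetical helper written for this post and not part of Nagios:

```shell
# state_name: hypothetical helper that maps a plugin exit code to the
# state name defined by the plugin return-code convention.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "out of range" ;;
    esac
}

state_name 0    # the plain check_nrpe run in the transcript
state_name 2    # the same run wrapped in negate
```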
tecnalb wrote:
[root@nagios libexec]# ./check_nrpe -H xxxx.xxx -c check_services -a "e2fsck"
e2fsck: 1
[root@nagios libexec]# echo $?
0
[root@nagios libexec]# ./negate ./check_nrpe -H xxxx.xxx -c check_services -a "e2fsck"
e2fsck: 1
[root@nagios libexec]# echo $?
2
On the host:
[nagios@backup libexec]$ /usr/local/nagios/libexec/check_services -p e2fsck
e2fsck: 1
[nagios@backup libexec]$ echo $?
0
Again, negate is working perfectly. In addition, nrpe and check_nrpe are also doing their job well. If negate were working differently than this, I would say negate was broken.
This is where I start to think I don't know what check_services is or what it's supposed to do. I would recommend using check_procs, which probably has more predictable output and which we know adheres to the threshold guidelines perfectly.
Code: Select all
[jdalrymple@localhost libexec]$ ./check_procs -a fsck -c :1
PROCS OK: 0 processes with args 'fsck' | procs=0;;:1;0;
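The reason negate becomes unnecessary with this approach is that the check applies the threshold itself and returns CRITICAL directly. A minimal sketch of that idea follows. `check_count_sketch` and its `max` argument are made up for illustration, and it does not reproduce check_procs's own range parsing or output format.

```shell
# check_count_sketch COUNT MAX: hypothetical threshold check.
# Prints a status line and returns 2 (CRITICAL) when COUNT exceeds MAX,
# otherwise returns 0 (OK).
check_count_sketch() {
    local count=$1 max=$2
    if [ "$count" -gt "$max" ]; then
        echo "PROCS CRITICAL: $count processes (max $max)"
        return 2
    fi
    echo "PROCS OK: $count processes (max $max)"
    return 0
}

# Assumed threshold of 0: any running fsck process is critical.
rc=0
check_count_sketch 0 0 || rc=$?
echo "exit code: $rc"

rc=0
check_count_sketch 1 0 || rc=$?
echo "exit code: $rc"
```

With the threshold inside the check, the service goes red while fsck is running and green otherwise, with no wrapper around it.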