Yes. As an easy reference, let's assume you have a check that uses check_tcp against port 80 on 127.0.0.1.
Code:
[root@localhost libexec]# ./check_tcp -H 127.0.0.1 -p 80
TCP OK - 0.000 second response time on 127.0.0.1 port 80|time=0.000199s;;;0.000000;10.000000
[root@localhost libexec]# echo $?
0
Everything is OK, and echo $? reports back a 0 (which Nagios takes as an OK exit code).
Now, let's shut down Apache and see what happens.
Code:
[root@localhost libexec]# service httpd stop
Stopping httpd: [ OK ]
[root@localhost libexec]# ./check_tcp -H 127.0.0.1 -p 80
connect to address 127.0.0.1 and port 80: Connection refused
[root@localhost libexec]# echo $?
2
The echo $? reports back a 2, which means this would show as a CRITICAL status in Nagios.
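For reference, here is a small sketch of how the plugin exit codes line up with Nagios states (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). The function name status_name is made up for this illustration; it is not part of the Nagios plugins.

```shell
# Hypothetical helper (not shipped with Nagios): translate a plugin
# exit code into the state name Nagios would display for it.
status_name() {
  case "$1" in
    0) echo "OK" ;;
    1) echo "WARNING" ;;
    2) echo "CRITICAL" ;;
    *) echo "UNKNOWN" ;;  # 3 is UNKNOWN; this sketch lumps any other value in with it
  esac
}
```

You could call it right after a check, for example: ./check_tcp -H 127.0.0.1 -p 80; status_name $?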
Now we just need the negate plugin to swap these results, so that OK -> CRITICAL and CRITICAL -> OK. This is needed because the check is backwards: you want an alert when the port is open, not when it is closed.
Code:
[root@localhost libexec]# ./negate --ok=CRITICAL --critical=OK ./check_tcp -H 127.0.0.1 -p 80
connect to address 127.0.0.1 and port 80: Connection refused
[root@localhost libexec]# echo $?
0
You can see that even though the underlying check is critical, the exit code comes back as 0, so Nagios will treat this as OK.
Now, let's turn Apache back on and see what it reports:
Code:
[root@localhost libexec]# service httpd start
Starting httpd: httpd: Could not reliably determine the server's fully qualified domain name, using localhost.localdomain for ServerName
[ OK ]
[root@localhost libexec]# ./negate --ok=CRITICAL --critical=OK ./check_tcp -H 127.0.0.1 -p 80
TCP OK - 0.000 second response time on 127.0.0.1 port 80|time=0.000272s;;;0.000000;10.000000
[root@localhost libexec]# echo $?
2
Now we can see that when port 80 is open on the local machine, the check does indeed report a CRITICAL response (2). We've flipped the use case around for Nagios simply by using the negate plugin.
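To use this from Nagios itself, a command definition along these lines should work. Treat it as a sketch: the command name check_tcp_closed is made up, and it assumes $USER1$ points at your plugin directory (the usual default in resource.cfg) and that the port is passed as the first argument.

```
# Hypothetical command definition; adjust the name and arguments to your setup.
define command {
    command_name    check_tcp_closed
    command_line    $USER1$/negate --ok=CRITICAL --critical=OK $USER1$/check_tcp -H $HOSTADDRESS$ -p $ARG1$
}
```

A service would then reference it as check_command check_tcp_closed!80.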
Hopefully that helps explain the process.