Re: Services don't warn on stop/waiting
Posted: Fri Jun 19, 2015 12:13 pm
by lmiltchev
You forgot to run:
after each command... I wanted to see the exit codes from each command. Otherwise, the output looks fine.
Can you also run the following command on the remote machine and show us the output?
Code:
grep check_init_service /etc/sudoers
If you are missing the line below:
Code:
nagios ALL=NOPASSWD: /usr/local/nagios/libexec/check_init_service
you may add it to sudoers to see if this is going to fix the issue.
Re: Services don't warn on stop/waiting
Posted: Mon Jun 22, 2015 9:30 am
by courhar
Done both of those. Should the exit code be a 1 rather than 0 then?
Re: Services don't warn on stop/waiting
Posted: Mon Jun 22, 2015 11:55 am
by jolson
You are correct. The exit codes are specified as follows:
Plugin return code   Service state   Host state
0                    OK              UP
1                    WARNING         UP or DOWN/UNREACHABLE*
2                    CRITICAL        DOWN/UNREACHABLE
3                    UNKNOWN         DOWN/UNREACHABLE
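To make the codes above concrete, here is a minimal shell sketch of how a plugin might report its state: it prints one status line and returns the matching code. The helper name nagios_exit is hypothetical and is not part of any shipped plugin.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Nagios plugin): print a status line
# and return the exit code that corresponds to the given state name.
nagios_exit() {
    # $1 = state name (OK, WARNING, CRITICAL), $2 = human-readable message
    state=$1
    case "$state" in
        OK)       code=0 ;;
        WARNING)  code=1 ;;
        CRITICAL) code=2 ;;
        *)        state=UNKNOWN; code=3 ;;   # anything else is treated as UNKNOWN
    esac
    echo "$state - $2"
    return "$code"
}
```

A check script would typically end with something like: nagios_exit CRITICAL "service is stopped"; exit $?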
Re: Services don't warn on stop/waiting
Posted: Tue Jun 23, 2015 2:33 am
by courhar
So how can I change the check_init_service script so that it returns a 1 when it finds a service in stop/waiting? Or even a 2, as the services are critical for what we are trying to monitor.
Re: Services don't warn on stop/waiting
Posted: Tue Jun 23, 2015 12:17 pm
by lmiltchev
Is the service actually running? Can you run the following command on the remote machine and show us the output?
If the "check_init_service" is not working with this service, perhaps you can find a different plugin or set up a custom command in order to accomplish your goal.
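As a rough idea of what such a custom command could look like, the sketch below maps upstart-style status text to a Nagios exit code. It is only a sketch under assumptions: the function name classify_status is made up, and it assumes the status text contains "start/running" when the job is up and "stop/waiting" when it is down.

```shell
#!/bin/sh
# Sketch of a custom check (hypothetical, not the check_init_service plugin).
# Assumption: the status text contains "start/running" when the job is up
# and "stop/waiting" when it is down.
classify_status() {
    # $1 = service name, $2 = status text captured from the status command
    case "$2" in
        *start/running*)
            echo "OK - $1 is running"
            return 0 ;;
        *stop/waiting*)
            echo "CRITICAL - $1 is stopped"
            return 2 ;;
        *)
            echo "UNKNOWN - could not parse status of $1"
            return 3 ;;
    esac
}
```

It could be called as: classify_status heat-api "$(service heat-api status 2>&1)"; exit $?
The exact status command on your machine may differ, so treat that call as an assumption too. Change "return 2" to "return 1" if you prefer a WARNING for a stopped service.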
Re: Services don't warn on stop/waiting
Posted: Wed Jun 24, 2015 4:30 am
by courhar
Ran the command you asked. Is there another plugin you could recommend? I tried check_services, which didn't alarm either.
Thanks
Courhar
Re: Services don't warn on stop/waiting
Posted: Wed Jun 24, 2015 11:56 am
by lmiltchev
You may want to try using SNMP. Please review our documentation on the topic here:
https://assets.nagios.com/downloads/nag ... g_SNMP.pdf
An example of a command for checking the sshd process would be:
Code:
/usr/local/nagios/libexec/check_snmp_process_wizard.pl -H <remote machine> -C <community string> --v2c -n 'sshd' -w '1,30' -c '1,50'
2 process matching sshd (> 1) (<= 30):OK
I am not sure if using "heat-api" as the name of the process (regexp) will produce the expected results. You may need to run snmpwalk to see what is available.
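To see which process names the agent reports before settling on a pattern, something like the sketch below may help. It assumes process names are exposed through HOST-RESOURCES-MIB::hrSWRunName and that snmpwalk prints them in its default form (STRING: "name"). The helper name count_processes is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read snmpwalk output for hrSWRunName on stdin and
# print how many reported process names match the extended regex in $1.
# Assumption: lines look like
#   HOST-RESOURCES-MIB::hrSWRunName.812 = STRING: "sshd"
count_processes() {
    # Keep only the quoted process name, then count matching lines.
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    sed -n 's/.*STRING: "\(.*\)"$/\1/p' | grep -E -c -- "$1" || true
}
```

Usage, with the same placeholders as the command above:
snmpwalk -v2c -c <community string> <remote machine> HOST-RESOURCES-MIB::hrSWRunName | count_processes 'heat-api'
A result of 0 would suggest that the name is reported differently, for example as the interpreter name rather than "heat-api".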