Service check pending
Posted: Mon Dec 13, 2010 11:59 am
by mtkaschools
After I upgraded to R1.3G, a few of my services show "service check pending". They appear on the problem board, but because they are still pending I can't add comments to them. I also have a host that shows as down, even though all of my hosts are up when I look at the host list.
Any suggestions?
Re: Service check pending
Posted: Mon Dec 13, 2010 12:14 pm
by mguthrie
You could try scheduling an immediate check in the host/service details pages.
The other thing you might want to do is run this from the command-line just to make sure a second nagios process didn't get spawned from the upgrade:
killall -9 nagios
service nagios start
If the above doesn't work:
You could also double-check the host or service by running its check command from the command-line. For example:
/usr/local/nagios/libexec/check_icmp -H <hostaddress>
You can use this to verify the output that you're getting.
Re: Service check pending
Posted: Mon Dec 13, 2010 1:28 pm
by mtkaschools
That didn't seem to help much. When I look at the host, it says "host check pending", the same as the services do. Even rebooting the whole server doesn't clear up whatever might be going on in the background.
Something told me not to upgrade!
Re: Service check pending
Posted: Mon Dec 13, 2010 5:12 pm
by mguthrie
What kind of output do you get when you run the check manually from the command-line?
Re: Service check pending
Posted: Mon Dec 13, 2010 7:24 pm
by mtkaschools
What specifically would I run from the cmd line?
Re: Service check pending
Posted: Tue Dec 14, 2010 10:50 am
by tonyyarusso
Something in the format of
/usr/local/nagios/libexec/check_icmp -H <hostaddress>
The specifics can be determined by going through the Core Config Manager to see what arguments were supplied to the check, and filling them in as appropriate.
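When running a check by hand it can also help to look at the plugin's exit code, since Nagios uses that (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN) rather than the text to decide the state. A minimal sketch, assuming a POSIX shell; run_check is a hypothetical helper name, not something shipped with Nagios:

```shell
# Hypothetical wrapper (not part of Nagios): runs any plugin command,
# then prints its exit code together with the matching Nagios state.
run_check() {
    "$@"                       # run the plugin exactly as given
    rc=$?
    case "$rc" in
        0) state=OK ;;
        1) state=WARNING ;;
        2) state=CRITICAL ;;
        *) state=UNKNOWN ;;    # 3, or anything out of range
    esac
    echo "exit code $rc => $state"
    return "$rc"
}

# Usage (substitute the real address and arguments from Core Config Manager):
#   run_check /usr/local/nagios/libexec/check_icmp -H <hostaddress>
```

If the plugin prints "OK" and the exit code is 0, the check itself is probably fine and the problem is more likely in how the result gets scheduled or displayed.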
Re: Service check pending
Posted: Wed Dec 15, 2010 12:59 pm
by mtkaschools
It comes back with 'OK', but the Nagios XI interface still shows 'service check is pending...'
Re: Service check pending
Posted: Wed Dec 15, 2010 2:24 pm
by mguthrie
Can you run the following command on the command line and send us its output?
tail -50 /usr/local/nagios/var/nagios.log
I'd like to see if Nagios has "orphaned" checks.
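Since the last 50 lines may not go back far enough, one option is to search the whole log for the word "orphan". This is only a sketch: count_orphaned is a hypothetical helper name, and it assumes that the warnings Nagios logs about orphaned checks contain that word.

```shell
# Hypothetical helper: count log lines that mention orphaned checks.
# Defaults to the log path used in this thread; pass another path as $1.
count_orphaned() {
    log="${1:-/usr/local/nagios/var/nagios.log}"
    grep -ci 'orphan' "$log"   # -c counts matching lines, -i ignores case
}

# Usage:
#   count_orphaned
#   grep -i 'orphan' /usr/local/nagios/var/nagios.log | tail -20
```

A count of 0 would suggest the pending checks are not being reported as orphaned, at least not in the current log file.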
Re: Service check pending
Posted: Fri Dec 17, 2010 3:48 pm
by mtkaschools
login as: root
[email protected]'s password:
Access denied
[email protected]'s password:
Last login: Wed Dec 15 11:54:48 2010
[root@noc ~]# tail -50 /usr/local/nagios/var/nagios.log
[1292586822] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292586942] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292587062] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292587182] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292587302] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292587422] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292587542] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292587662] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292587782] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292587833] Auto-save of retention data completed successfully.
[1292587902] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292588022] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292588142] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292588262] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292588382] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292588502] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292588622] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292588742] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292588862] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292588982] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292589102] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292589222] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292589352] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: rta nan, lost 100%
[1292589472] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: rta nan, lost 100%
[1292589582] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292589702] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292589822] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292589942] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;OK;HARD;1;OK - 10.10.220.31: rta 0.367ms, lost 0%
[1292589952] HOST ALERT: Printer - MHS-1422-LJ2015;UP;HARD;1;OK - 10.10.220.31: rta 11.786ms, lost 0%
[1292589962] SERVICE ALERT: Printer - MHS-1422-LJ2015;Printer Status;OK;HARD;2;Printer ok - ("Ready")
[1292591433] Auto-save of retention data completed successfully.
[1292591972] SERVICE ALERT: Server - SHAREPOINTDB;CPU Usage;CRITICAL;SOFT;1;Connection reset by peer
[1292592032] SERVICE ALERT: Server - SHAREPOINTDB;CPU Usage;OK;SOFT;2;CPU Load 1% (5 min average)
[1292595033] Auto-save of retention data completed successfully.
[1292598172] SERVICE ALERT: Server - SHAREPOINT1;Service - Sophos Anti-Virus;WARNING;SOFT;1;could not fetch information from server
[1292598232] SERVICE ALERT: Server - SHAREPOINT1;Service - Sophos Anti-Virus;OK;SOFT;2;SAVService: Started
[1292598633] Auto-save of retention data completed successfully.
[1292600972] SERVICE ALERT: Printer - DSC-TL-LJ3800-COLOR;Printer Status;WARNING;SOFT;1;Printer Offline ("Checking printer")
[1292600972] SERVICE ALERT: Printer - DSC-TL-LJ3800-BLACK;Printer Status;WARNING;SOFT;1;Printer Offline ("Checking printer")
[1292601032] SERVICE ALERT: Printer - DSC-TL-LJ3800-COLOR;Printer Status;OK;SOFT;2;Printer ok - ("Processing job from tray 2")
[1292601032] SERVICE ALERT: Printer - DSC-TL-LJ3800-BLACK;Printer Status;OK;SOFT;2;Printer ok - ("Processing job from tray 2")
[1292602233] Auto-save of retention data completed successfully.
[1292604932] SERVICE ALERT: Server - SHAREPOINT1;Uptime;WARNING;SOFT;1;could not fetch information from server
[1292604992] SERVICE ALERT: Server - SHAREPOINT1;Uptime;OK;SOFT;2;System Uptime - 2 day(s) 20 hour(s) 55 minute(s)
[1292605833] Auto-save of retention data completed successfully.
[1292609433] Auto-save of retention data completed successfully.
[1292611662] SERVICE ALERT: Server - BB2;Service - Sophos Agent;CRITICAL;SOFT;1;Connection reset by peer
[1292611722] SERVICE ALERT: Server - BB2;Service - Sophos Agent;OK;SOFT;2;Sophos Agent: Started
[1292613033] Auto-save of retention data completed successfully.
[1292616633] Auto-save of retention data completed successfully.
[root@noc ~]#
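The bracketed numbers in the log are Unix epoch timestamps, which are hard to line up against the time of the upgrade. A rough sketch for making them readable, assuming GNU date (for the -d @SECONDS form); nagios_log_utc is a hypothetical helper name:

```shell
# Hypothetical filter: rewrite "[epoch] message" lines read from stdin
# as "[YYYY-MM-DD HH:MM:SS] message" in UTC. Other lines pass through.
# Assumes GNU date, which accepts -d @SECONDS.
nagios_log_utc() {
    while IFS= read -r line; do
        case "$line" in
            \[[0-9]*\]*)
                ts=${line#\[}          # drop the leading "["
                ts=${ts%%\]*}          # keep only what precedes the first "]"
                rest=${line#*\] }      # message text after "] "
                printf '[%s] %s\n' \
                    "$(date -u -d "@$ts" '+%Y-%m-%d %H:%M:%S')" "$rest"
                ;;
            *) printf '%s\n' "$line" ;;
        esac
    done
}

# Usage:
#   tail -50 /usr/local/nagios/var/nagios.log | nagios_log_utc
```

For instance, the first entry above, 1292586822, works out to 2010-12-17 11:53:42 UTC.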
Re: Service check pending
Posted: Mon Dec 20, 2010 12:25 pm
by mguthrie
Can you check whether Nagios Core shows the same issue? You can access the Core interface at http://<yourserver>/nagios. I want to see whether the issue is with Core or with the XI interface.
Is there any relationship between the checks that are coming back as pending? (For example, are they all using the same check command?)