Service check pending
-
mtkaschools
- Posts: 58
- Joined: Tue Sep 14, 2010 7:53 am
Service check pending
After I upgraded to R1.3G, I get a service check pending on a few of my services. Since it seems to be pending, even though it showed up on the problem board, I can't add comments. I also seem to have a host that shows down, even though when I look at the hosts, all of them are up.
Any suggestions?
Re: Service check pending
You could try scheduling an immediate check in the host/service details pages.
If the above doesn't work:
The other thing you might want to do is run this from the command-line just to make sure a second nagios process didn't get spawned from the upgrade:
Code: Select all
killall -9 nagios
service nagios start
You could also double check the host or service by running its check command from the command-line. Example:
/usr/local/nagios/libexec/check_icmp -H <hostaddress>
You can use this to verify the output that you're getting.
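Before killing anything, it may be worth confirming whether a second daemon is actually running. The sketch below is only an illustration, not something from the upgrade docs: the function names are made up, and it assumes a Linux `ps` that supports `-C`. It counts nagios processes whose parent PID is 1, on the assumption that a top-level daemon is parented by init while short-lived check children are parented by the daemon itself.

```shell
#!/bin/sh
# Hypothetical helper: count "PID PPID" pairs on stdin whose parent PID is 1.
# Assumption: a top-level nagios daemon is parented by init (PID 1), while
# children forked to run checks are parented by the daemon.
count_top_level() {
    awk '$2 == 1 { n++ } END { print n + 0 }'
}

# Hypothetical wrapper: feed the live process table into the counter.
# "ps -C nagios" selects by command name; the trailing "=" drops the headers.
nagios_daemon_count() {
    ps -C nagios -o pid=,ppid= | count_top_level
}
```

Running `nagios_daemon_count` after sourcing the functions should normally print 1. A value of 2 or more would suggest a stray daemon left over from the upgrade, although setups where the daemon is not parented by PID 1 (containers, for instance) would need a different test.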
-
mtkaschools
- Posts: 58
- Joined: Tue Sep 14, 2010 7:53 am
Re: Service check pending
That didn't seem to help my cause much. When I look at the host, it says host check pending, just like the services said the same thing. Even if I reboot the whole server, it doesn't clear up anything that might be going on in the background.
Something told me not to upgrade!
Re: Service check pending
What kind of output do you get when you run the check manually from the command-line?
-
mtkaschools
- Posts: 58
- Joined: Tue Sep 14, 2010 7:53 am
Re: Service check pending
What specifically would I run from the cmd line?
-
tonyyarusso
- Posts: 1128
- Joined: Wed Mar 03, 2010 12:38 pm
- Location: St. Paul, MN, USA
- Contact:
Re: Service check pending
Something in the format of
Code: Select all
/usr/local/nagios/libexec/check_icmp -H <hostaddress>
The specifics can be determined by going through the Core Config Manager to see what arguments were supplied to the check, and filling them in as appropriate.
-
mtkaschools
- Posts: 58
- Joined: Tue Sep 14, 2010 7:53 am
Re: Service check pending
It comes back with 'OK', yet the Nagios XI interface still shows 'service check is pending...'
Re: Service check pending
Can you run the following command on the command line and send us the output?
I'd like to see if Nagios has "orphaned" checks.
Code: Select all
tail -50 /usr/local/nagios/var/nagios.log
-
mtkaschools
- Posts: 58
- Joined: Tue Sep 14, 2010 7:53 am
Re: Service check pending
Code: Select all
login as: root
[email protected]'s password:
Access denied
[email protected]'s password:
Last login: Wed Dec 15 11:54:48 2010
[root@noc ~]# tail -50 /usr/local/nagios/var/nagios.log
[1292586822] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292586942] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292587062] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292587182] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292587302] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292587422] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292587542] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292587662] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292587782] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292587833] Auto-save of retention data completed successfully.
[1292587902] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292588022] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292588142] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292588262] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292588382] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292588502] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292588622] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292588742] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292588862] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292588982] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292589102] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292589222] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292589352] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: rta nan, lost 100%
[1292589472] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: rta nan, lost 100%
[1292589582] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292589702] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292589822] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;CRITICAL;HARD;1;CRITICAL - 10.10.220.31: Net unreachable @ 10.5.1.1. rta nan, lost 100%
[1292589942] SERVICE ALERT: Printer - MHS-1422-LJ2015;Ping;OK;HARD;1;OK - 10.10.220.31: rta 0.367ms, lost 0%
[1292589952] HOST ALERT: Printer - MHS-1422-LJ2015;UP;HARD;1;OK - 10.10.220.31: rta 11.786ms, lost 0%
[1292589962] SERVICE ALERT: Printer - MHS-1422-LJ2015;Printer Status;OK;HARD;2;Printer ok - ("Ready")
[1292591433] Auto-save of retention data completed successfully.
[1292591972] SERVICE ALERT: Server - SHAREPOINTDB;CPU Usage;CRITICAL;SOFT;1;Connection reset by peer
[1292592032] SERVICE ALERT: Server - SHAREPOINTDB;CPU Usage;OK;SOFT;2;CPU Load 1% (5 min average)
[1292595033] Auto-save of retention data completed successfully.
[1292598172] SERVICE ALERT: Server - SHAREPOINT1;Service - Sophos Anti-Virus;WARNING;SOFT;1;could not fetch information from server
[1292598232] SERVICE ALERT: Server - SHAREPOINT1;Service - Sophos Anti-Virus;OK;SOFT;2;SAVService: Started
[1292598633] Auto-save of retention data completed successfully.
[1292600972] SERVICE ALERT: Printer - DSC-TL-LJ3800-COLOR;Printer Status;WARNING;SOFT;1;Printer Offline ("Checking printer")
[1292600972] SERVICE ALERT: Printer - DSC-TL-LJ3800-BLACK;Printer Status;WARNING;SOFT;1;Printer Offline ("Checking printer")
[1292601032] SERVICE ALERT: Printer - DSC-TL-LJ3800-COLOR;Printer Status;OK;SOFT;2;Printer ok - ("Processing job from tray 2")
[1292601032] SERVICE ALERT: Printer - DSC-TL-LJ3800-BLACK;Printer Status;OK;SOFT;2;Printer ok - ("Processing job from tray 2")
[1292602233] Auto-save of retention data completed successfully.
[1292604932] SERVICE ALERT: Server - SHAREPOINT1;Uptime;WARNING;SOFT;1;could not fetch information from server
[1292604992] SERVICE ALERT: Server - SHAREPOINT1;Uptime;OK;SOFT;2;System Uptime - 2 day(s) 20 hour(s) 55 minute(s)
[1292605833] Auto-save of retention data completed successfully.
[1292609433] Auto-save of retention data completed successfully.
[1292611662] SERVICE ALERT: Server - BB2;Service - Sophos Agent;CRITICAL;SOFT;1;Connection reset by peer
[1292611722] SERVICE ALERT: Server - BB2;Service - Sophos Agent;OK;SOFT;2;Sophos Agent: Started
[1292613033] Auto-save of retention data completed successfully.
[1292616633] Auto-save of retention data completed successfully.
[root@noc ~]#
Re: Service check pending
Can I have you check whether Nagios Core is showing the same issue? Access the Core interface by going to http://<yourserver>/nagios. I want to see whether the issue is with Core or with the XI interface.
Is there any relationship between the checks that are coming back as pending? (For example, are they all using the same check command?)
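One way to answer the "same check command?" question without clicking through every service is to read it out of the Core status file. This is a rough sketch with a few assumptions: the function name is made up, the status file is assumed to be at /usr/local/nagios/var/status.dat (the `status_file` setting in nagios.cfg is the authority), and a service is treated as pending when its `servicestatus` block has `has_been_checked=0`. It only looks at services, not hosts.

```shell
#!/bin/sh
# Hypothetical helper: count pending services per check command.
# $1 is the path to the Nagios status file (assumed status.dat layout:
# "servicestatus {" blocks containing key=value lines, closed by "}").
pending_by_command() {
    awk '
        $1 == "servicestatus" && $2 == "{" { in_svc = 1; cmd = ""; checked = ""; next }
        in_svc && /^[ \t]*check_command=/ {
            sub(/^[ \t]*check_command=/, "")
            sub(/!.*/, "")            # keep the command name, drop its arguments
            cmd = $0
            next
        }
        in_svc && /^[ \t]*has_been_checked=/ {
            sub(/^[ \t]*has_been_checked=/, "")
            checked = $0
            next
        }
        in_svc && $1 == "}" {
            if (checked == "0") count[cmd]++   # never checked, so shown as pending
            in_svc = 0
        }
        END { for (c in count) print count[c], c }
    ' "$1" | sort -rn
}
```

Calling `pending_by_command /usr/local/nagios/var/status.dat` prints one line per check command with the number of pending services in front. If a single command accounts for nearly all of them, that command definition is the first place to look.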