Host check stuck in pending

Posted: Thu Jun 12, 2014 12:08 pm
by dc7772000
I've run into an odd issue with host checks. If I add a network device via the Monitoring Wizard, the host check runs normally. However, if I add the device via CCM, the host check stays in pending status. Both host configurations look identical in the interface, so I'm not sure what's different.

Grouping structure is as follows.
host/device - 24x7 (host template) - 24x7 (hostgroup) - rsa-cards (host group)
rsa-cards (host group) - RSA-24x7 (service template) - zPing (service)

Steps used
Adding network device via CCM
1) Create host via Host Management
2) Manage Templates - add 24x7
3) Manage Hostgroups - add 24x7 and rsa-cards
4) Under Check Settings - Check Interval 5 min, Retry interval 1 min and Max check attempts 5
5) Save and apply configuration

Results
Host check stuck in pending
Ping check works normally

Adding network device via Monitoring Wizard
1) Under Monitoring Wizard > Generic Network Device > Enter IP and use defaults
2) Go to CCM Host Management > Common settings > Manage Template > remove xiwizard_genericnetdevice_host and add 24x7
3) Go to CCM Host Management > Common settings > Manage Hostgroups > Add 24x7 and rsa-cards
4) Under Check Settings > Clear Check period
5) Under Alert Settings > Clear Notification period
6) Under Alert Settings > Clear Manage Contacts
7) Under Misc Settings > Clear Status image, Icon image and Manage Variable Definitions
8) Save and apply configuration

Results
Host check is working
Ping check is working

OS: CentOS 6.5 (64 bit) - installed as basic server
XI (2012R2.9) manually installed using "fullinstall" script included with tar
Customization - Gearman server & client installed

Re: Host check stuck in pending

Posted: Thu Jun 12, 2014 1:53 pm
by sreinhardt
Do you see the manually created host in the core interface (http://nagiosserver/nagios) or in the flat files (/usr/local/nagios/etc/hosts/hostname.cfg)? While we are at it, what are the permissions on the Nagios etc directory?

ll -d /usr/local/nagios/etc/services/
ll -d /usr/local/nagios/etc/hosts/
ll -d /usr/local/nagios/etc/
ll -d /usr/local/nagios

Re: Host check stuck in pending

Posted: Thu Jun 12, 2014 2:44 pm
by dc7772000
I see them in both the web interface and the flat files. The results of the permission check are listed below.

[root@sfo-nagiosxi01 nagios]# ll -d /usr/local/nagios/etc/services/
drwsrwsr-x. 2 apache nagios 4096 Jun 12 09:59 /usr/local/nagios/etc/services/
[root@sfo-nagiosxi01 nagios]# ll -d /usr/local/nagios/etc/hosts
drwsrwsr-x. 2 apache nagios 4096 Jun 12 09:59 /usr/local/nagios/etc/hosts
[root@sfo-nagiosxi01 nagios]# ll -d /usr/local/nagios/etc/
drwsrwsr-x. 7 apache nagios 4096 Jun 11 14:03 /usr/local/nagios/etc/
[root@sfo-nagiosxi01 nagios]# ll -d /usr/local/nagios
drwxr-xr-x. 8 root root 4096 May 13 08:48 /usr/local/nagios
[root@sfo-nagiosxi01 nagios]#

Re: Host check stuck in pending

Posted: Thu Jun 12, 2014 3:45 pm
by Box293
I think your problem is with the 24x7 host template. I don't think it has a check command defined, unlike the xiwizard_genericnetdevice_host template.
Selection_057.png
If I am correct, then I think the reason why it is "working" after running the wizard is as follows:
  • The wizard is run and the generic host is added
  • You navigate your way to CCM > Host Management blah blah blah
  • While you are making the changes, before applying config, your new generic host has been checked by Nagios and has returned an OK state
  • So you've made the changes (removed template xiwizard_genericnetdevice_host and added template 24x7) and clicked apply config
  • When you go back to check the host status, the host is listed as OK/UP
  • I assume this host never goes down because another check is never run, so the state never changes
For one of these hosts added by the wizard (whose templates were then changed), what is the Last Check Time of the host? Is it way back in the past? See screenshot.
Selection_058.png
Also, for these hosts I doubt they have any performance graphs for ping/uptime (_HOST_).
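
If you'd rather confirm the missing check command from the shell than from the CCM, here is a rough sketch. It assumes the templates are written to a flat file such as /usr/local/nagios/etc/hosttemplates.cfg (that path is my assumption, adjust it for your install), and the function name template_has_check_command is made up for this example. It only looks at the template itself and does not follow "use" inheritance chains.

```shell
#!/bin/sh
# Report whether a host template defines check_command.
# Usage: template_has_check_command <cfg-file> <template-name>
# Hypothetical helper; it does not follow "use" inheritance.
template_has_check_command() {
    awk -v tpl="$2" '
        # Start of a host (or host template) definition: reset per-block state.
        /^[ \t]*define[ \t]+host[ \t]*[{]/ { in_def = 1; name = ""; cmd = ""; next }
        in_def && $1 == "name"          { name = $2 }
        in_def && $1 == "check_command" { cmd = $2 }
        # End of the definition: remember the result if this was our template.
        in_def && /^[ \t]*[}]/ {
            if (name == tpl) { found = 1; has_cmd = (cmd != "") }
            in_def = 0
        }
        END {
            if (!found)  { print "template not found"; exit 2 }
            if (has_cmd) { print "check_command defined"; exit 0 }
            print "no check_command"; exit 1
        }
    ' "$1"
}

# Example call (path and template name are assumptions):
# template_has_check_command /usr/local/nagios/etc/hosttemplates.cfg 24x7
```

If it prints "no check_command" for 24x7, that would match what I'm describing above.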

Re: Host check stuck in pending

Posted: Thu Jun 12, 2014 4:01 pm
by dc7772000
It looks like you're 100% correct. The date and time haven't changed since I added the device. Also, I tried scheduling an immediate check on a device that's shown as "up", and it didn't do anything. I also checked the graphs: the host graph is empty while Ping is graphing normally. I'll go ahead and make some adjustments. Thanks for your help with this.
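
For anyone else hitting this, the stale last-check time can also be read from the status file instead of the web interface. This is only a sketch: it assumes the status file is at /usr/local/nagios/var/status.dat (check status_file in nagios.cfg), and host_last_checks is a name I made up. A host with has_been_checked=0 is one still shown as pending, and last_check is an epoch timestamp.

```shell
#!/bin/sh
# Print each host from a Nagios status file with its has_been_checked flag
# and last_check epoch. Hypothetical helper for illustration only.
# Usage: host_last_checks <status-file>
host_last_checks() {
    awk -F= '
        # Only hoststatus blocks are of interest; servicestatus blocks are skipped.
        /^[ \t]*hoststatus[ \t]*[{]/ { in_host = 1; host = ""; checked = ""; last = ""; next }
        in_host {
            key = $1; gsub(/^[ \t]+/, "", key)   # strip leading indentation
            if (key == "host_name")        host = $2
            if (key == "has_been_checked") checked = $2
            if (key == "last_check")       last = $2
        }
        # Closing brace of the block: emit one line per host.
        in_host && /^[ \t]*[}]/ {
            printf "%s has_been_checked=%s last_check=%s\n", host, checked, last
            in_host = 0
        }
    ' "$1"
}

# Example call (path is an assumption):
# host_last_checks /usr/local/nagios/var/status.dat
```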

Re: Host check stuck in pending

Posted: Thu Jun 12, 2014 4:08 pm
by Box293
Great stuff. I've had a similar issue in the past, sometimes you learn by breaking stuff :lol: