Host shows down but services are ok?

bmallett
Posts: 16
Joined: Fri Feb 22, 2019 10:28 am

Post by bmallett »

I have a group of wireless APs that are assigned IP addresses via DHCP. To check them, I wrote a plugin that takes an AP's MAC address and its parent's hostname (as a DNS name), telnets to the parent switch, runs 'show arp', parses the ARP table for the line matching that MAC address, takes the IP address assigned, and then pings that IP. The service works and shows 'OK', but the host shows 'DOWN' because of 'check_ping'. I have not assigned 'check_ping' to this host or hostgroup. Is there a way to have Nagios derive the host status from the result of the service? Also, why is Nagios still using 'check_ping' for a host where it isn't 'assigned'?

HOST STATUS:

Host Status: DOWN (for 0d 20h 3m 38s)
Status Information: check_ping: Invalid hostname/address - AD-02-rm02-storage
Usage:
check_ping -H <host_address> -w <wrta>,<wpl>% -c <crta>,<cpl>%
[-p packets] [-t timeout] [-4
Performance Data: -6]


SERVICE STATUS:

Service State Information
Current Status: OK (for 0d 16h 58m 29s)
Status Information: (No output on stdout) stderr:
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Host shows down but services are ok?

Post by cdienger »

A service must be assigned to a host definition, and a host definition normally defines a check_command (which points to a plugin). The check_command doesn't have to be assigned directly; it can be inherited through a template. If you look at the host definition you should see a "use" line. This points to a template where the check_command is likely assigned.

The host's check_command doesn't really differ from a service's, meaning you could get rid of the current service definition and modify the host definition to use the plugin you created.
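
As a rough sketch of that second suggestion, the host below overrides the inherited check_command with the custom plugin. The host name and custom variable values are placeholders, generic-access-point is the template name that appears later in this thread, and check_mac is assumed to be a command whose command_line takes $ARG1$ and $ARG2$. It may also help to know why the inherited ping check reports "Invalid hostname/address": when a host has no address directive, Nagios uses host_name as the address, so a ping-based check only works if that name resolves.

```cfg
# Hypothetical host: the name and the custom variable values are placeholders.
define host {
    use             generic-access-point
    host_name       example-ap-01
    _MACADDRESS     00:11:22:33:44:55       ; custom variable, read back as $_HOSTMACADDRESS$
    _PARENT_DNS     example-switch-01       ; custom variable, read back as $_HOSTPARENT_DNS$
    # Setting check_command here overrides whatever the template supplies.
    check_command   check_mac!$_HOSTMACADDRESS$!$_HOSTPARENT_DNS$
}
```

With this in place the host state follows the plugin's result, so a separate service running the same check becomes optional.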
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Re: Host shows down but services are ok?

Post by bmallett »

Got that part working now. Thanks.

The issue I am having now is making sure I am properly passing the variables from nagios to the plugin. Every single one is returning 'OK' when it shouldn't... :evil:

I am currently doing this in the PHP:

Code: Select all

$mac = ( isset( $argv[1] ) ? $argv[1] : null );
$host = ( isset( $argv[2] ) ? $argv[2] : null );
When I call the plugin with nagios, I am calling the command like this:

Code: Select all

define command{
	command_name check_mac
	command_line $USER1$/check_mac $_HOSTMACADDRESS$ $_HOSTPARENT_DNS$
}
In the host definitions, I am adding the custom needed variables like this:

Code: Select all

    _MACADDRESS                     A4:93:4C:43:5A:DA
    _PARENT_DNS                     PS-122-3750X-01-111
Technically I could use the 'parents' field in the host definition, but I am unsure how to grab/use it.

Finally, in my service definition, I am using this:

Code: Select all

define service {
    use                         generic-service
    hostgroup_name              access-points
    servicegroups               ap-status
    service_description         Get IP from MAC ADDRESS and Ping for AP Status
    check_command               check_mac!$_HOSTMAC_ADDRESS$!$_HOSTPARENT_DNS$
}
What do I need to change in order to pass the parent field, if possible, or just the two custom fields to the php plugin?

Code: Select all

define host {
    use                     generic-access-point
    host_name               PS-122-102
    alias                   Access Point
    display_name            AP RM xxx
#   address
    parents                 PS-122-3750X-01-111
    hostgroups              access-points
    _MACADDRESS             A4:93:4C:43:5D:AD
    _PARENT_DNS             PS-122-3750X-01-111
    process_perf_data       1
    icon_image              Access-Point.png
    icon_image_alt          Access Point
    vrml_image              Access-Point.gd2
}
This is my first time building something for Nagios. Thanks.

Re: Host shows down but services are ok?

Post by cdienger »

Try this for the command:

Code: Select all

define command{
   command_name check_mac
   command_line $USER1$/check_mac $ARG1$ $ARG2$
}
and this for the service:

Code: Select all

define service {
    use                         generic-service
    hostgroup_name              access-points
    servicegroups               ap-status
    service_description         Get IP from MAC ADDRESS and Ping for AP Status
    check_command               check_mac!$_HOSTMACADDRESS$!$_HOSTPARENT_DNS$
}
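
For reference, a minimal sketch of how the pieces fit together, assuming the standard Nagios Core macro rules: a custom host variable named _MACADDRESS is read back as $_HOSTMACADDRESS$ (Nagios adds the _HOST prefix), and each !-separated value in check_command reaches the command as $ARG1$, $ARG2$, and so on. The object names are the ones used in this thread, apart from the shortened service_description; the comments are annotations, not part of the original posts.

```cfg
# Host: custom variables start with an underscore.
define host {
    use             generic-access-point
    host_name       PS-122-102
    hostgroups      access-points
    _MACADDRESS     A4:93:4C:43:5D:AD
    _PARENT_DNS     PS-122-3750X-01-111
}

# Command: receives whatever the caller passes after each "!".
define command {
    command_name    check_mac
    command_line    $USER1$/check_mac $ARG1$ $ARG2$
}

# Service: resolves the host's custom variables and passes them as arguments.
define service {
    use                   generic-service
    hostgroup_name        access-points
    service_description   AP status by MAC
    check_command         check_mac!$_HOSTMACADDRESS$!$_HOSTPARENT_DNS$
}
```

With these definitions the plugin would be invoked as check_mac A4:93:4C:43:5D:AD PS-122-3750X-01-111 (from the $USER1$ plugin directory), so $argv[1] is the MAC address and $argv[2] is the parent's DNS name. Referencing $_HOSTMACADDRESS$ and $_HOSTPARENT_DNS$ directly in command_line is an equivalent alternative, in which case check_command needs no arguments.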

Re: Host shows down but services are ok?

Post by bmallett »

PERFECT!

Thanks a ton!
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Host shows down but services are ok?

Post by npolovenko »

@bmallett, Would you have any other questions for us before I lock this thread?

Re: Host shows down but services are ok?

Post by bmallett »

Well... not in regard to this, but if you want to answer it here, I will oblige... ;)

Regarding communication, specifically email notifications, I have been "hit or miss" at getting them configured in the way that is easiest to manage. This may not be the best approach, but I am a firm believer in making jobs as easy as they can be.

That said, in order to maintain which things are triggering notifications and who receives those notifications, I attempted to have them in hostgroups. This didn't work for obvious reasons. (The docs say it doesn't.) :)

I am assuming I need to specify the flags in each individual host for them to trigger properly. If that assumption is correct, which flags need to be added to each individual host?

I was using the following in the 'templates', but since I had the same hosts in various hostgroups, they would spit out multiple emails for each occurrence. Is that the expected functionality or did I have something else askew?

Code: Select all

        contact_groups                  admins                  ; Notifications get sent out to everyone in the 'admins' group
        notification_options            w,u,c,r                 ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60                      ; Re-notify about service problems every hour
Is there a better way to manage notification emails, or is that (individual hosts) the best way?

Thanks again for the help.

Re: Host shows down but services are ok?

Post by npolovenko »

@bmallett, It's not possible to add notification options to contact groups directly. You'd need to define these settings per host or per service.

Here is the list of options you need to add to each host and service:

Host:
notification_interval 60
notification_options d,u,r,f,s
notification_period 24x7
contact_groups admins
Service:
notification_interval 60
notification_options w,u,c,r,f,s
notification_period 24x7
contact_groups admins

But there is a possible shortcut. You can add these options to your templates and have all other hosts and services use these templates.
To have your service use a template you can add this line to each service definition:
use myTemplate
Here is an example of service and host templates with notification options:
define service {
    name                    local-service
    use                     generic-service
    max_check_attempts      4
    check_interval          5
    retry_interval          1
    notification_interval   60
    notification_options    w,u,c,r,f,s
    notification_period     24x7
    contact_groups          admins
    register                0
}
define host {
    name                    generic-host
    notification_options    d,u,r,f,s
    notification_period     24x7
    notification_interval   60
    notifications_enabled   1
    contact_groups          admins
    register                0
}
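
Both templates route notifications to the admins contact group. As a minimal sketch of what sits behind that name, assuming a placeholder contact and the notification commands from the Nagios sample configuration (notify-host-by-email and notify-service-by-email): the group itself carries no notification options, but each contact has its own options, which act as a second filter after the host or service options.

```cfg
# Hypothetical contact; the name and email address are placeholders.
define contact {
    contact_name                    jdoe
    alias                           Example Admin
    email                           jdoe@example.com
    host_notifications_enabled      1
    service_notifications_enabled   1
    host_notification_period        24x7
    service_notification_period     24x7
    host_notification_options       d,u,r       ; this contact's own filter for host events
    service_notification_options    w,u,c,r     ; this contact's own filter for service events
    # The two commands below are assumed to exist, as in the sample commands.cfg.
    host_notification_commands      notify-host-by-email
    service_notification_commands   notify-service-by-email
}

# The group only lists members; it has no notification options of its own.
define contactgroup {
    contactgroup_name   admins
    alias               Nagios Administrators
    members             jdoe
}
```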

Re: Host shows down but services are ok?

Post by bmallett »

@npolovenko

That's what I thought. I was just hoping for something different. I guess I could just make a template above the main "generic" for each "sub-group" and do it at that level. ;)

Thanks again for all your help.
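
A minimal sketch of that layering, assuming hypothetical template and contact group names (ap-host, switch-host, wireless-admins, network-admins): each sub-group template inherits the shared notification settings from generic-host and changes only who is notified.

```cfg
# Base template (not registered): shared notification settings live here.
define host {
    name                    generic-host
    notification_options    d,u,r,f,s
    notification_period     24x7
    notification_interval   60
    notifications_enabled   1
    register                0
}

# Hypothetical per-sub-group templates: only the recipients differ.
define host {
    name             ap-host
    use              generic-host
    contact_groups   wireless-admins
    register         0
}

define host {
    name             switch-host
    use              generic-host
    contact_groups   network-admins
    register         0
}

# A real host picks one sub-group template. Other required directives
# (check_command, max_check_attempts, and so on) are omitted for brevity.
define host {
    use         ap-host
    host_name   PS-122-102
}
```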
lmiltchev
Former Nagios Staff
Posts: 13587
Joined: Mon May 23, 2011 12:15 pm

Re: Host shows down but services are ok?

Post by lmiltchev »

Sounds good. I am closing this topic now. If you have any further questions, please start a new thread.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Locked