Host shows down but services are ok?
I have a group of wireless APs that are assigned IP addresses via DHCP. To check them, I wrote a plugin that takes an AP's MAC address and its parent's hostname (as a DNS name), telnets to the parent switch, runs 'show arp', parses the ARP table for the line matching the MAC address, returns the assigned IP address, and then pings that IP. The service works and shows 'OK', but the host shows 'DOWN' because of 'check_ping'. I have not assigned 'check_ping' to this host or hostgroup. Is there a way to have Nagios derive the host status from the result of the service? Also, why is Nagios still using 'check_ping' for something where it isn't 'assigned'?
HOST STATUS:
Host Status: DOWN (for 0d 20h 3m 38s)
Status Information: check_ping: Invalid hostname/address - AD-02-rm02-storage
Usage:
check_ping -H <host_address> -w <wrta>,<wpl>% -c <crta>,<cpl>%
[-p packets] [-t timeout] [-4
Performance Data: -6]
SERVICE STATUS:
Service State Information
Current Status: OK (for 0d 16h 58m 29s)
Status Information: (No output on stdout) stderr:
Re: Host shows down but services are ok?
A service must be assigned to a host definition, and a host definition must define a check_command (which points to a plugin). The check_command doesn't have to be assigned directly; it can be inherited from a template. If you look at the host definition you should see a "use" line. This points to a template where the check_command is likely assigned.
The host's check_command doesn't really differ from a service's, meaning you could get rid of the current service definition and modify the host definition to use the plugin you created.
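A minimal sketch of that change, assuming a command named 'check_mac' that takes the MAC address and the parent switch name as two arguments. The command, host name, and custom variables are the ones posted later in this thread; the '$_HOST...$' macro spelling follows the Nagios custom object variable documentation and is an assumption about this setup, not something confirmed here:

```cfg
# Command wrapper: the plugin receives the two arguments positionally.
define command {
    command_name    check_mac
    command_line    $USER1$/check_mac $ARG1$ $ARG2$
}

# Host whose UP/DOWN state comes from the custom plugin instead of
# the check_ping inherited from the template.
define host {
    use             generic-access-point
    host_name       PS-122-102
    check_command   check_mac!$_HOSTMACADDRESS$!$_HOSTPARENT_DNS$
    _MACADDRESS     A4:93:4C:43:5D:AD
    _PARENT_DNS     PS-122-3750X-01-111
}
```

A check_command set directly on the host overrides the one inherited through "use". For host checks Nagios treats plugin exit code 0 as UP and 2 as DOWN, so the plugin has to exit non-zero when the ping fails.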
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Re: Host shows down but services are ok?
Got that part working now. Thanks.
The issue I am having now is making sure I am properly passing the variables from Nagios to the plugin. Every single one is returning 'OK' when it shouldn't...
I am currently doing this in the PHP:
Code: Select all
$mac = ( isset( $argv[1] ) ? $argv[1] : null );
$host = ( isset( $argv[2] ) ? $argv[2] : null );
When I call the plugin with Nagios, I am calling the command like this:
Code: Select all
define command{
    command_name    check_mac
    command_line    $USER1$/check_mac $_HOSTMACADDRESS$ $_HOSTPARENT_DNS$
}
In the host definitions, I am adding the custom needed variables like this:
Code: Select all
_MACADDRESS A4:93:4C:43:5A:DA
_PARENT_DNS PS-122-3750X-01-111
Technically I could use the 'parents' field in the host definition, but I am unsure how to grab/use it.
Finally, in my service definition, I am using this:
Code: Select all
define service {
    use                     generic-service
    hostgroup_name          access-points
    servicegroups           ap-status
    service_description     Get IP from MAC ADDRESS and Ping for AP Status
    check_command           check_mac!$_HOSTMAC_ADDRESS$!$_HOSTPARENT_DNS$
}
Code: Select all
define host {
    use                     generic-access-point
    host_name               PS-122-102
    alias                   Access Point
    display_name            AP RM xxx
    # address
    parents                 PS-122-3750X-01-111
    hostgroups              access-points
    _MACADDRESS             A4:93:4C:43:5D:AD
    _PARENT_DNS             PS-122-3750X-01-111
    process_perf_data       1
    icon_image              Access-Point.png
    icon_image_alt          Access Point
    vrml_image              Access-Point.gd2
}
What do I need to change in order to pass the parent field, if possible, or just the two custom fields to the PHP plugin?
This is my first time building something for Nagios. Thanks.
Re: Host shows down but services are ok?
Try this for the command:
Code: Select all
define command{
    command_name    check_mac
    command_line    $USER1$/check_mac $ARG1$ $ARG2$
}
and this for the service:
Code: Select all
define service {
    use                     generic-service
    hostgroup_name          access-points
    servicegroups           ap-status
    service_description     Get IP from MAC ADDRESS and Ping for AP Status
    check_command           check_mac!$_MACADDRESS$!$_PARENT_DNS$
}
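For reference, the Nagios documentation on custom object variables spells host-level macros as '$_HOST<VARNAME>$', with the variable name upper-cased and its leading underscore dropped. Under that rule '_MACADDRESS' becomes '$_HOSTMACADDRESS$', while the '$_HOSTMAC_ADDRESS$' in the original service definition matches no variable on the host. A hedged variant of the service above, worth trying if the arguments arrive empty in your setup:

```cfg
define service {
    use                     generic-service
    hostgroup_name          access-points
    servicegroups           ap-status
    service_description     Get IP from MAC ADDRESS and Ping for AP Status
    # Documented custom-variable form: $_HOST + variable name without its leading underscore
    check_command           check_mac!$_HOSTMACADDRESS$!$_HOSTPARENT_DNS$
}
```

The command definition stays as posted, with '$ARG1$' and '$ARG2$' carrying the two values through to the plugin.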
Re: Host shows down but services are ok?
PERFECT!
Thanks a ton!
Re: Host shows down but services are ok?
@bmallett, do you have any other questions for us before I lock this thread?
Re: Host shows down but services are ok?
Well... not in regards to this, but if you want to answer it here, I will oblige...
Regarding communication, specifically email notifications, I have been "hit or miss" in getting them configured in the way that is easiest to manage. This may not be the best approach, but I am a firm believer in making jobs as easy as they can be.
That said, in order to control which things trigger notifications and who receives them, I attempted to set them up in hostgroups. This didn't work for obvious reasons. (The docs say it doesn't.)
I am assuming I need to specify the flags in each individual host for them to trigger properly. If that assumption is correct, which flags need to be added to each individual host?
I was using the following in the 'templates', but since I had the same hosts in various hostgroups, they would spit out multiple emails for each occurrence. Is that the expected functionality or did I have something else askew?
Code: Select all
contact_groups          admins      ; Notifications get sent out to everyone in the 'admins' group
notification_options    w,u,c,r     ; Send notifications about warning, unknown, critical, and recovery events
notification_interval   60          ; Re-notify about service problems every hour
Is there a better way to manage notification emails, or is that (individual hosts) the best way?
Thanks again for the help.
Re: Host shows down but services are ok?
@bmallett, It's not possible to add notification options to contact groups directly. You'd need to define these settings per host or per service.
Here is the list of options you need to add to each host and service:
Host:
Code: Select all
notification_interval   60
notification_options    d,u,r,f,s
notification_period     24x7
contact_groups          admins
Service:
Code: Select all
notification_interval   60
notification_options    w,u,c,r,f,s
notification_period     24x7
contact_groups          admins
But there is a possible shortcut. You can add these options to your templates and have all other hosts and services use these templates.
To have your service use a template you can add this line to each service definition:
Code: Select all
use myTemplate
Here is an example of service and host templates with notification options:
Code: Select all
define service {
    name                    local-service
    use                     generic-service
    max_check_attempts      4
    check_interval          5
    retry_interval          1
    notification_interval   60
    notification_options    w,u,c,r,f,s
    notification_period     24x7
    contact_groups          admins
    register                0
}
define host {
    name                    generic-host
    notification_options    d,u,r,f,s
    notification_period     24x7
    notification_interval   60
    notifications_enabled   1
    contact_groups          admins
    register                0
}
Re: Host shows down but services are ok?
@npolovenko
That's what I thought. I was just hoping for something different. I guess I could just make a template above the main "generic" for each "sub-group" and do it at that level.
Thanks again for all your help.
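A rough sketch of that per-sub-group layering, assuming the 'generic-host' template from the reply above. The template name 'access-point-notify' and the contact group 'wireless-admins' are hypothetical, invented here for illustration:

```cfg
# Intermediate template: inherits everything from generic-host and
# overrides only who gets notified for this sub-group.
define host {
    name                    access-point-notify   ; hypothetical template name
    use                     generic-host
    contact_groups          wireless-admins       ; hypothetical contact group
    register                0                     ; template only, not a real host
}

# Real hosts point at the sub-group template instead of generic-host.
define host {
    use                     access-point-notify
    host_name               PS-122-102
    hostgroups              access-points
}
```

Directives set in the intermediate template take precedence over the ones inherited from 'generic-host', so each sub-group needs only the lines that differ.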
Re: Host shows down but services are ok?
Sounds good. I am closing this topic now. If you have any further questions, please start a new thread.
Be sure to check out our Knowledgebase for helpful articles and solutions!