Host shows down but services are ok?

bmallett
Posts: 16
Joined: Fri Feb 22, 2019 10:28 am

Post by bmallett »

Ok, so what I thought was fixed is not. I was getting undesired results and turned on debugging for plugins/macros.

My `command` and `service` are as suggested above, but it appears the macro/vars aren't being parsed.

Here is the log:

Code:

[1556215708.123001] [2048.1] [pid=5607] **** BEGIN MACRO PROCESSING ***********
[1556215708.123004] [2048.1] [pid=5607] Processing: '$USER1$/check_mac $ARG1$ $ARG2$'
[1556215708.123009] [2048.1] [pid=5607]   Done.  Final output: '/usr/local/nagios/libexec/check_mac $_MACADDRESS$ $_PARENT_DNS$'
[1556215708.123012] [2048.1] [pid=5607] **** END MACRO PROCESSING *************
What am I missing?
lmiltchev
Former Nagios Staff
Posts: 13587
Joined: Mon May 23, 2011 12:15 pm

Post by lmiltchev »

@cdienger was correct that you need to have $ARG1$ and $ARG2$ in your command, but he forgot to prefix the custom variable macros with "HOST". So, you should have:

Code:

define command{
   command_name check_mac
   command_line $USER1$/check_mac $ARG1$ $ARG2$
}
and

Code:

define service {
    use                         generic-service
    hostgroup_name              access-points
    servicegroups               ap-status
    service_description         Get IP from MAC ADDRESS and Ping for AP Status
    check_command               check_mac!$_HOSTMACADDRESS$!$_HOSTPARENT_DNS$
}
Let us know if this worked for you.

Post by bmallett »

That did work. I switched to that just prior to receiving this notice. However, I am still seeing some odd results (see debug).

DEBUG:

Code:

[1556222489.844414] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.844421] [2048.1] [pid=14359] Processing: '$_HOSTMACADDRESS$'
[1556222489.844428] [2048.1] [pid=14359]   Done.  Final output: 'a4:93:4c:c1:27:9f'
[1556222489.844432] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.844437] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.844441] [2048.1] [pid=14359] Processing: '$_HOSTPARENT_DNS$'
[1556222489.844447] [2048.1] [pid=14359]   Done.  Final output: 'HS-402-3560-01-1517'
[1556222489.844470] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.844474] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.844478] [2048.1] [pid=14359] Processing: '$USER1$/check_mac $ARG1$ $ARG2$'
[1556222489.844485] [2048.1] [pid=14359]   Done.  Final output: '/usr/local/nagios/libexec/check_mac a4:93:4c:c1:27:9f HS-402-3560-01-1517'
[1556222489.844490] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.884200] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.884213] [2048.1] [pid=14359] Processing: '$_HOSTMACADDRESS$'
[1556222489.884223] [2048.1] [pid=14359]   Done.  Final output: 'a4:93:4c:b2:57:83'
[1556222489.884227] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.884232] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.884236] [2048.1] [pid=14359] Processing: '$_HOSTPARENT_DNS$'
[1556222489.884242] [2048.1] [pid=14359]   Done.  Final output: 'ES-124B-2960X-01-132'
[1556222489.884246] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.919445] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.919467] [2048.1] [pid=14359] Processing: '$USER1$/check_mac $ARG1$ $ARG2$'
[1556222489.919483] [2048.1] [pid=14359]   Done.  Final output: '/usr/local/nagios/libexec/check_mac  '
[1556222489.919494] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.927713] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.927740] [2048.1] [pid=14359] Processing: '$USER1$/check_mac $ARG1$ $ARG2$'
[1556222489.927752] [2048.1] [pid=14359]   Done.  Final output: '/usr/local/nagios/libexec/check_mac  '
[1556222489.927758] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.927816] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.927823] [2048.1] [pid=14359] Processing: '$USER1$/check_mac $ARG1$ $ARG2$'
[1556222489.927828] [2048.1] [pid=14359]   Done.  Final output: '/usr/local/nagios/libexec/check_mac  '
[1556222489.927831] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.939980] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.940013] [2048.1] [pid=14359] Processing: '$USER1$/check_mac $ARG1$ $ARG2$'
[1556222489.940026] [2048.1] [pid=14359]   Done.  Final output: '/usr/local/nagios/libexec/check_mac  '
[1556222489.940032] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
It appears to work sometimes and fail at other times. What would cause the inconsistency? I have verified that the host definitions are complete for the ones missing data.

Lastly, these all reset fine from a HARD down state in the service section, but I am also using this for the host check command. They are all currently locked in a 10/10 HARD state. I can run them via CLI and, as mentioned, the service is running without error. How do I reset the host 10/10 HARD state so that they can show OK again?
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Post by cdienger »

I'd be curious to see what's in /usr/local/nagios/var/objects.cache for the hosts and services with these checks, to make sure that all the variables are getting set correctly. I'd also try removing the hosts and services and verifying that they're removed in the web UI before adding them back in.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
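
One way to act on the objects.cache suggestion above is to scan the file for hosts whose check_command is the bare command name with no `!` arguments. The following is only a rough sketch: the helper names `parse_object_cache` and `hosts_missing_args` are made up for illustration, the parser handles just the simple `define ... { ... }` layout shown in this thread, and the sample text is a trimmed stand-in for a real cache file.

```python
import re

def parse_object_cache(text):
    """Return a list of (object_type, {directive: value}) tuples.

    Hypothetical helper: handles only simple define blocks, one directive per line.
    """
    objects = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        start = re.match(r"define\s+(\w+)\s*\{", line)
        if start:
            current = (start.group(1), {})
        elif line.startswith("}") and current is not None:
            objects.append(current)
            current = None
        elif current is not None and line:
            # Directive name and value are separated by whitespace
            parts = line.split(None, 1)
            current[1][parts[0]] = parts[1] if len(parts) > 1 else ""
    return objects

def hosts_missing_args(text, command="check_mac"):
    """Names of hosts whose check_command is the bare command, with no !arguments."""
    return [
        fields.get("host_name", "?")
        for kind, fields in parse_object_cache(text)
        if kind == "host" and fields.get("check_command") == command
    ]

# Trimmed sample in the same shape as the definitions posted in this thread
SAMPLE = """
define host {
    host_name       AD-02-rm02-storage
    check_command   check_mac
    _MACADDRESS     a4:93:4c:43:58:39
    _PARENT_DNS     AD-02-3560-00-102
    }
define service {
    host_name       AD-02-rm02-storage
    check_command   check_mac!$_HOSTMACADDRESS$!$_HOSTPARENT_DNS$
    }
"""

print(hosts_missing_args(SAMPLE))
```

Against a live install, the text would come from reading /usr/local/nagios/var/objects.cache instead of `SAMPLE`. Any host it lists runs the check without a MAC address or parent name.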

Post by bmallett »

The HOST check shows failed but the service shows OK.

Here is a host:

Code:

define host {
	host_name	AD-02-rm02-storage
	display_name	Admin Storage Room AP
	alias	Access Point in Administration Storage
	address	AD-02-rm02-storage
	parents	AD-02-3560-00-102
	check_period	24x7
	check_command	check_mac
	notification_period	24x7
	initial_state	o
	importance	0
	check_interval	2.000000
	retry_interval	1.000000
	max_check_attempts	10
	active_checks_enabled	1
	passive_checks_enabled	1
	obsess	1
	event_handler_enabled	1
	low_flap_threshold	0.000000
	high_flap_threshold	0.000000
	flap_detection_enabled	1
	flap_detection_options	a
	freshness_threshold	0
	check_freshness	0
	notification_options	a
	notifications_enabled	1
	notification_interval	30.000000
	first_notification_delay	0.000000
	stalking_options	n
	process_perf_data	1
	icon_image        Network-Access-Point.png
	icon_image_alt	Access Point
	vrml_image	 Network-Access-Point.gd2
	retain_status_information	1
	retain_nonstatus_information	1
	_MACADDRESS	a4:93:4c:43:58:39
	_PARENT_DNS	AD-02-3560-00-102
	}
Here is that service:

Code:

define service {
	host_name	AD-02-rm02-storage
	service_description	Get IP from MAC ADDRESS and Ping for AP Status
	check_period	24x7
	check_command	check_mac!$_HOSTMACADDRESS$!$_HOSTPARENT_DNS$
	notification_period	24x7
	initial_state	o
	importance	0
	check_interval	10.000000
	retry_interval	2.000000
	max_check_attempts	3
	is_volatile	1
	parallelize_check	1
	active_checks_enabled	1
	passive_checks_enabled	1
	obsess	1
	event_handler_enabled	1
	low_flap_threshold	0.000000
	high_flap_threshold	0.000000
	flap_detection_enabled	1
	flap_detection_options	a
	freshness_threshold	0
	check_freshness	0
	notification_options	a
	notifications_enabled	1
	notification_interval	30.000000
	first_notification_delay	0.000000
	stalking_options	n
	process_perf_data	1
	retain_status_information	1
	retain_nonstatus_information	1
	}

https://ibb.co/Y3zH6tB

Post by bmallett »

I did add some "helpful" messages to see where/what was failing. This one says the following:
No MAC Address supplied from HOST.
https://ibb.co/vwVvV0g

SERVICE:

https://ibb.co/kXFrRTs

Post by cdienger »

Can you PM me a copy of the config (/usr/local/nagios/etc/)? We haven't been able to reproduce the problem and I'd like to lab it up with your config.

Post by cdienger »

Received the data, and I think the problem here is that the host template calls check_mac without any arguments, so $ARG1$ and $ARG2$ are empty when the host check runs. Edit /usr/local/nagios/etc/objects/templates.cfg and change:

Code:

define host {
        name                            generic-access-point    ; The name of this host template
        use                             generic-host            ; Inherit default values from the generic-host template
        check_period                    24x7                    ; By default, access points are monitored round the clock
        check_interval                  2                       ; Access points are checked every 2 minutes
        retry_interval                  1                       ; Schedule host check retries at 1 minute intervals
        max_check_attempts              10                      ; Check each access point 10 times (max)
        check_command                   check_mac               ; Default command to check if access points are "alive"
        notification_period             24x7                    ; Send notifications at any time
#        notification_interval           30                      ; Resend notifications every 30 minutes
#        notification_options            d,r                     ; Only send notifications for specific host states
#        contact_groups                  admins                  ; Notifications get sent to the admins by default
        register                        0                       ; DON'T REGISTER THIS - ITS JUST A TEMPLATE
}
to:

Code:

define host {
        name                            generic-access-point    ; The name of this host template
        use                             generic-host            ; Inherit default values from the generic-host template
        check_period                    24x7                    ; By default, access points are monitored round the clock
        check_interval                  2                       ; Access points are checked every 2 minutes
        retry_interval                  1                       ; Schedule host check retries at 1 minute intervals
        max_check_attempts              10                      ; Check each access point 10 times (max)
        check_command                   check_mac!$_HOSTMACADDRESS$!$_HOSTPARENT_DNS$    ; Default command to check if access points are "alive"
        notification_period             24x7                    ; Send notifications at any time
#        notification_interval           30                      ; Resend notifications every 30 minutes
#        notification_options            d,r                     ; Only send notifications for specific host states
#        contact_groups                  admins                  ; Notifications get sent to the admins by default
        register                        0                       ; DON'T REGISTER THIS - ITS JUST A TEMPLATE
}
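
To see why the bare `check_mac` in the template produced an empty command line while the service check worked, here is a deliberately simplified model of the macro expansion, written to match what the debug logs in this thread show. It is not the Nagios implementation: the `expand` function and its parameters are invented for illustration, and the two behaviors it assumes (unset `$ARGn$` macros become empty strings, unknown macros are left untouched) are inferred from the logged output.

```python
import re

MACRO = re.compile(r"\$([A-Z0-9_]+)\$")

def expand(check_command, command_line, host_vars, user_macros):
    """Toy model of macro expansion; 'expand' is a hypothetical name."""
    # Everything after the command name, split on "!", becomes $ARG1$, $ARG2$, ...
    _name, *args = check_command.split("!")
    macros = dict(user_macros)
    # Assumption drawn from the logs: unset $ARGn$ macros expand to an empty string
    macros.update({f"ARG{i}": "" for i in range(1, 33)})
    # A custom host variable _MACADDRESS is referenced as $_HOSTMACADDRESS$
    for key, value in host_vars.items():
        macros["_HOST" + key.lstrip("_").upper()] = value

    def substitute(text):
        # Assumption drawn from the first log: unknown macros are left as-is
        return MACRO.sub(lambda m: macros.get(m.group(1), m.group(0)), text)

    # Arguments can contain macros themselves, so expand them first
    for i, arg in enumerate(args, start=1):
        macros[f"ARG{i}"] = substitute(arg)
    return substitute(command_line)

HOST_VARS = {"_MACADDRESS": "a4:93:4c:43:58:39", "_PARENT_DNS": "AD-02-3560-00-102"}
USER_MACROS = {"USER1": "/usr/local/nagios/libexec"}
COMMAND_LINE = "$USER1$/check_mac $ARG1$ $ARG2$"

# Bare command, as in the original host template: no arguments reach the plugin
print(repr(expand("check_mac", COMMAND_LINE, HOST_VARS, USER_MACROS)))
# Missing HOST prefix, as in the first debug log: macros stay unexpanded
print(repr(expand("check_mac!$_MACADDRESS$!$_PARENT_DNS$", COMMAND_LINE, HOST_VARS, USER_MACROS)))
# Corrected form, as in the updated template and the service definition
print(repr(expand("check_mac!$_HOSTMACADDRESS$!$_HOSTPARENT_DNS$", COMMAND_LINE, HOST_VARS, USER_MACROS)))
```

The three calls correspond to the three kinds of "Final output" lines seen in the debug logs: a command with nothing after `check_mac`, a command with literal `$_MACADDRESS$` text, and a fully expanded command.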

Post by bmallett »

@cdienger

Correct. That resolved the issue. Thanks a bunch. I verified this this morning first thing.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Post by scottwilkerson »

bmallett wrote: @cdienger

Correct. That resolved the issue. Thanks a bunch. I verified this this morning first thing.

Great!

Locking thread