Re: Host shows down but services are ok?
Posted: Thu Apr 25, 2019 1:23 pm
by bmallett
Ok, so what I thought was fixed is not. I was getting undesired results and turned on debugging for plugins/macros.
My `command` and `service` are as suggested above, but it appears the macro/vars aren't being parsed.
Here is the log:
Code:
[1556215708.123001] [2048.1] [pid=5607] **** BEGIN MACRO PROCESSING ***********
[1556215708.123004] [2048.1] [pid=5607] Processing: '$USER1$/check_mac $ARG1$ $ARG2$'
[1556215708.123009] [2048.1] [pid=5607] Done. Final output: '/usr/local/nagios/libexec/check_mac $_MACADDRESS$ $_PARENT_DNS$'
[1556215708.123012] [2048.1] [pid=5607] **** END MACRO PROCESSING *************
What am I missing?
Re: Host shows down but services are ok?
Posted: Thu Apr 25, 2019 3:01 pm
by lmiltchev
@cdienger was correct that you need to have $ARG1$ and $ARG2$ in your command, but he forgot to prefix the custom macro names with "HOST". So, you should have:
Code:
define command{
command_name check_mac
command_line $USER1$/check_mac $ARG1$ $ARG2$
}
and
Code:
define service {
use generic-service
hostgroup_name access-points
servicegroups ap-status
service_description Get IP from MAC ADDRESS and Ping for AP Status
check_command check_mac!$_HOSTMACADDRESS$!$_HOSTPARENT_DNS$
}
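For anyone wondering why the "HOST" prefix matters: a custom variable defined on a host as _MACADDRESS is exposed as the macro $_HOSTMACADDRESS$, and a token that matches no known macro is left as-is, which is what the first debug log above shows. The Python below is only a toy model of that naming rule, not Nagios's own code; `custom_macro_name` and `expand` are hypothetical helpers made up for the illustration.

```python
import re

def custom_macro_name(object_type, var_name):
    """Build the macro name for a custom variable such as ('host', '_MACADDRESS')."""
    prefix = {"host": "_HOST", "service": "_SERVICE", "contact": "_CONTACT"}[object_type]
    if not var_name.startswith("_"):
        raise ValueError("custom variables start with an underscore")
    # Drop the leading underscore and upper-case the rest of the name.
    return prefix + var_name[1:].upper()

def expand(text, macros):
    """Replace $NAME$ tokens found in `macros`; leave unknown tokens untouched."""
    return re.sub(r"\$(\w+)\$", lambda m: macros.get(m.group(1), m.group(0)), text)

# Custom variables as they would appear in a host definition (values from this thread).
host_vars = {"_MACADDRESS": "a4:93:4c:c1:27:9f", "_PARENT_DNS": "HS-402-3560-01-1517"}
macros = {custom_macro_name("host", k): v for k, v in host_vars.items()}

print(expand("$_MACADDRESS$ $_PARENT_DNS$", macros))          # names without HOST: unchanged
print(expand("$_HOSTMACADDRESS$ $_HOSTPARENT_DNS$", macros))  # names with HOST: resolved
```

The first call prints the tokens unchanged because no macro with that name exists in the model; the second resolves to the MAC address and parent DNS name.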
Let us know if this worked for you.
Re: Host shows down but services are ok?
Posted: Thu Apr 25, 2019 3:13 pm
by bmallett
That did work. I switched to that just prior to receiving this notice. However, I am still seeing some odd results (see debug).
DEBUG:
Code:
[1556222489.844414] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.844421] [2048.1] [pid=14359] Processing: '$_HOSTMACADDRESS$'
[1556222489.844428] [2048.1] [pid=14359] Done. Final output: 'a4:93:4c:c1:27:9f'
[1556222489.844432] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.844437] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.844441] [2048.1] [pid=14359] Processing: '$_HOSTPARENT_DNS$'
[1556222489.844447] [2048.1] [pid=14359] Done. Final output: 'HS-402-3560-01-1517'
[1556222489.844470] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.844474] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.844478] [2048.1] [pid=14359] Processing: '$USER1$/check_mac $ARG1$ $ARG2$'
[1556222489.844485] [2048.1] [pid=14359] Done. Final output: '/usr/local/nagios/libexec/check_mac a4:93:4c:c1:27:9f HS-402-3560-01-1517'
[1556222489.844490] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.884200] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.884213] [2048.1] [pid=14359] Processing: '$_HOSTMACADDRESS$'
[1556222489.884223] [2048.1] [pid=14359] Done. Final output: 'a4:93:4c:b2:57:83'
[1556222489.884227] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.884232] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.884236] [2048.1] [pid=14359] Processing: '$_HOSTPARENT_DNS$'
[1556222489.884242] [2048.1] [pid=14359] Done. Final output: 'ES-124B-2960X-01-132'
[1556222489.884246] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.919445] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.919467] [2048.1] [pid=14359] Processing: '$USER1$/check_mac $ARG1$ $ARG2$'
[1556222489.919483] [2048.1] [pid=14359] Done. Final output: '/usr/local/nagios/libexec/check_mac '
[1556222489.919494] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.927713] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.927740] [2048.1] [pid=14359] Processing: '$USER1$/check_mac $ARG1$ $ARG2$'
[1556222489.927752] [2048.1] [pid=14359] Done. Final output: '/usr/local/nagios/libexec/check_mac '
[1556222489.927758] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.927816] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.927823] [2048.1] [pid=14359] Processing: '$USER1$/check_mac $ARG1$ $ARG2$'
[1556222489.927828] [2048.1] [pid=14359] Done. Final output: '/usr/local/nagios/libexec/check_mac '
[1556222489.927831] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.939980] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.940013] [2048.1] [pid=14359] Processing: '$USER1$/check_mac $ARG1$ $ARG2$'
[1556222489.940026] [2048.1] [pid=14359] Done. Final output: '/usr/local/nagios/libexec/check_mac '
[1556222489.940032] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
It appears to work for some checks and not for others. What would cause the inconsistency? I have verified that the host definitions are complete for the ones missing data.
Lastly, these all reset fine from a HARD down state in the service section, but I am also using this for the host check command. They are all currently locked in a 10/10 HARD state. I can run them via CLI and, as mentioned, the service is running without error. How do I reset the host 10/10 HARD state so that they can show OK again?
Re: Host shows down but services are ok?
Posted: Fri Apr 26, 2019 2:32 pm
by cdienger
I'd be curious to see what's in /usr/local/nagios/var/objects.cache for the hosts and services with these checks, to make sure that all the variables are getting set correctly. I'd also try removing the hosts and services and verifying they're removed in the web UI before adding them back in.
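If it helps, here is a rough Python sketch for pulling the custom variables out of that file. It assumes objects.cache uses the same `define host { ... }` layout with one "key value" pair per line that the definitions posted in this thread use; `parse_objects` and `custom_vars` are hypothetical helper names, and the sample text stands in for the real file.

```python
def parse_objects(text):
    """Parse 'define <type> { key value ... }' blocks into (type, attrs) pairs."""
    objects, current, kind = [], None, None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("define ") and line.endswith("{"):
            kind = line[len("define "):-1].strip()
            current = {}
        elif line == "}" and current is not None:
            objects.append((kind, current))
            current = None
        elif current is not None and line:
            # Split on the first run of whitespace: key, then the rest as value.
            parts = line.split(None, 1)
            current[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return objects

def custom_vars(objects, kind="host"):
    """Map each host_name to its custom (underscore-prefixed) variables."""
    return {
        attrs.get("host_name", "?"): {k: v for k, v in attrs.items() if k.startswith("_")}
        for obj_kind, attrs in objects
        if obj_kind == kind
    }

# Stand-in for the contents of objects.cache.
sample = """\
define host {
    host_name       AD-02-rm02-storage
    check_command   check_mac
    _MACADDRESS     a4:93:4c:43:58:39
    _PARENT_DNS     AD-02-3560-00-102
    }
"""

for name, variables in custom_vars(parse_objects(sample)).items():
    print(name, variables or "NO CUSTOM VARIABLES")
```

To run it against the live file, replace `sample` with the text read from /usr/local/nagios/var/objects.cache. Any host that prints "NO CUSTOM VARIABLES" is missing _MACADDRESS and _PARENT_DNS.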
Re: Host shows down but services are ok?
Posted: Fri Apr 26, 2019 2:45 pm
by bmallett
The HOST check shows failed but the service shows OK.
Here is a host:
Code:
define host {
host_name AD-02-rm02-storage
display_name Admin Storage Room AP
alias Access Point in Administration Storage
address AD-02-rm02-storage
parents AD-02-3560-00-102
check_period 24x7
check_command check_mac
notification_period 24x7
initial_state o
importance 0
check_interval 2.000000
retry_interval 1.000000
max_check_attempts 10
active_checks_enabled 1
passive_checks_enabled 1
obsess 1
event_handler_enabled 1
low_flap_threshold 0.000000
high_flap_threshold 0.000000
flap_detection_enabled 1
flap_detection_options a
freshness_threshold 0
check_freshness 0
notification_options a
notifications_enabled 1
notification_interval 30.000000
first_notification_delay 0.000000
stalking_options n
process_perf_data 1
icon_image Network-Access-Point.png
icon_image_alt Access Point
vrml_image Network-Access-Point.gd2
retain_status_information 1
retain_nonstatus_information 1
_MACADDRESS a4:93:4c:43:58:39
_PARENT_DNS AD-02-3560-00-102
}
Here is that service:
Code:
define service {
host_name AD-02-rm02-storage
service_description Get IP from MAC ADDRESS and Ping for AP Status
check_period 24x7
check_command check_mac!$_HOSTMACADDRESS$!$_HOSTPARENT_DNS$
notification_period 24x7
initial_state o
importance 0
check_interval 10.000000
retry_interval 2.000000
max_check_attempts 3
is_volatile 1
parallelize_check 1
active_checks_enabled 1
passive_checks_enabled 1
obsess 1
event_handler_enabled 1
low_flap_threshold 0.000000
high_flap_threshold 0.000000
flap_detection_enabled 1
flap_detection_options a
freshness_threshold 0
check_freshness 0
notification_options a
notifications_enabled 1
notification_interval 30.000000
first_notification_delay 0.000000
stalking_options n
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
}
https://ibb.co/Y3zH6tB
Re: Host shows down but services are ok?
Posted: Fri Apr 26, 2019 3:00 pm
by bmallett
I did cut in some "helpful" messages to see where/what was failing. This one says the following:
No MAC Address supplied from HOST.
https://ibb.co/vwVvV0g
SERVICE:
https://ibb.co/kXFrRTs
Re: Host shows down but services are ok?
Posted: Fri Apr 26, 2019 3:40 pm
by cdienger
Can you PM me a copy of the config (/usr/local/nagios/etc/)? We haven't been able to reproduce the problem and I'd like to lab it up with your config.
Re: Host shows down but services are ok?
Posted: Tue Apr 30, 2019 3:29 pm
by cdienger
Received the data, and I think the problem here is that the host template's check_command doesn't pass any arguments, so $ARG1$ and $ARG2$ expand to nothing when the command runs as a host check. Edit /usr/local/nagios/etc/objects/templates.cfg and change:
Code:
define host {
name generic-access-point ; The name of this host template
use generic-host ; Inherit default values from the generic-host template
check_period 24x7 ; By default, switches are monitored round the clock
check_interval 2 ; Switches are checked every 2 minutes
retry_interval 1 ; Schedule host check retries at 1 minute intervals
max_check_attempts 10 ; Check each switch 10 times (max)
check_command check_mac ; Default command to check if access points are "alive"
notification_period 24x7 ; Send notifications at any time
# notification_interval 30 ; Resend notifications every 30 minutes
# notification_options d,r ; Only send notifications for specific host states
# contact_groups admins ; Notifications get sent to the admins by default
register 0 ; DON'T REGISTER THIS - ITS JUST A TEMPLATE
}
to:
Code:
define host {
name generic-access-point ; The name of this host template
use generic-host ; Inherit default values from the generic-host template
check_period 24x7 ; By default, switches are monitored round the clock
check_interval 2 ; Switches are checked every 2 minutes
retry_interval 1 ; Schedule host check retries at 1 minute intervals
max_check_attempts 10 ; Check each switch 10 times (max)
check_command check_mac!$_HOSTMACADDRESS$!$_HOSTPARENT_DNS$ ; Default command to check if access points are "alive"
notification_period 24x7 ; Send notifications at any time
# notification_interval 30 ; Resend notifications every 30 minutes
# notification_options d,r ; Only send notifications for specific host states
# contact_groups admins ; Notifications get sent to the admins by default
register 0 ; DON'T REGISTER THIS - ITS JUST A TEMPLATE
}
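To illustrate why the bare `check_mac` produced an empty command line in the debug log: the text after each "!" in check_command fills $ARG1$, $ARG2$, and so on, and with no "!" there is nothing to fill them with. The Python below is a simplified model of that behaviour, not the Nagios implementation; `substitute` and `build_command` are hypothetical names, and treating an unset $ARGn$ as an empty string is an assumption taken from the log output earlier in this thread.

```python
import re

MACRO = re.compile(r"\$(\w+)\$")

def substitute(text, values):
    """Replace $NAME$ tokens; an unset $ARGn$ becomes an empty string (assumption)."""
    def repl(match):
        name = match.group(1)
        if name in values:
            return values[name]
        return "" if name.startswith("ARG") else match.group(0)
    return MACRO.sub(repl, text)

def build_command(check_command, command_line, macros):
    """Split 'name!arg1!arg2' on '!' and fill $ARG1$, $ARG2$, ... in command_line."""
    _name, *args = check_command.split("!")
    values = dict(macros)
    for i, arg in enumerate(args, start=1):
        # Arguments may themselves contain macros such as $_HOSTMACADDRESS$.
        values[f"ARG{i}"] = substitute(arg, macros)
    return substitute(command_line, values)

macros = {
    "USER1": "/usr/local/nagios/libexec",
    "_HOSTMACADDRESS": "a4:93:4c:43:58:39",
    "_HOSTPARENT_DNS": "AD-02-3560-00-102",
}
line = "$USER1$/check_mac $ARG1$ $ARG2$"

# Old template: no arguments, so the plugin is called with nothing after its name.
print(repr(build_command("check_mac", line, macros)))
# Fixed template: both custom host macros are passed through as arguments.
print(repr(build_command("check_mac!$_HOSTMACADDRESS$!$_HOSTPARENT_DNS$", line, macros)))
```

In this model the first call yields only the plugin path followed by whitespace, which matches the "No MAC Address supplied from HOST" symptom, while the second yields the full command with both values.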
Re: Host shows down but services are ok?
Posted: Wed May 01, 2019 7:51 am
by bmallett
@cdienger
Correct. That resolved the issue. Thanks a bunch. I verified this this morning first thing.
Re: Host shows down but services are ok?
Posted: Wed May 01, 2019 8:43 am
by scottwilkerson
bmallett wrote:@cdienger
Correct. That resolved the issue. Thanks a bunch. I verified this this morning first thing.
great!
Locking thread