Host shows down but services are ok?

Re: Host shows down but services are ok?

Postby bmallett » Thu Apr 25, 2019 1:23 pm

Ok, so what I thought was fixed is not. I was getting undesired results and turned on debugging for plugins/macros.

My `command` and `service` are as suggested above, but it appears the macro/vars aren't being parsed.

Here is the log:

Code: Select all
[1556215708.123001] [2048.1] [pid=5607] **** BEGIN MACRO PROCESSING ***********
[1556215708.123004] [2048.1] [pid=5607] Processing: '$USER1$/check_mac $ARG1$ $ARG2$'
[1556215708.123009] [2048.1] [pid=5607]   Done.  Final output: '/usr/local/nagios/libexec/check_mac $_MACADDRESS$ $_PARENT_DNS$'
[1556215708.123012] [2048.1] [pid=5607] **** END MACRO PROCESSING *************


What am I missing?
bmallett
 
Posts: 16
Joined: Fri Feb 22, 2019 10:28 am

Re: Host shows down but services are ok?

Postby lmiltchev » Thu Apr 25, 2019 3:01 pm

@cdienger was correct that you need $ARG1$ and $ARG2$ in your command, but the custom variable macros also need the "HOST" prefix: a custom host variable such as _MACADDRESS is referenced as $_HOSTMACADDRESS$. So, you should have:

Code: Select all
define command{
   command_name check_mac
   command_line $USER1$/check_mac $ARG1$ $ARG2$
}


and

Code: Select all
define service {
    use                         generic-service
    hostgroup_name              access-points
    servicegroups               ap-status
    service_description         Get IP from MAC ADDRESS and Ping for AP Status
    check_command               check_mac!$_HOSTMACADDRESS$!$_HOSTPARENT_DNS$
}


Let us know if this worked for you.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

lmiltchev
QA Manager
 
Posts: 12411
Joined: Mon May 23, 2011 12:15 pm

Re: Host shows down but services are ok?

Postby bmallett » Thu Apr 25, 2019 3:13 pm

That did work. I had switched to that just before receiving this notice. However, I am still seeing some odd results (see the debug output below).

DEBUG:
Code: Select all
[1556222489.844414] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.844421] [2048.1] [pid=14359] Processing: '$_HOSTMACADDRESS$'
[1556222489.844428] [2048.1] [pid=14359]   Done.  Final output: 'a4:93:4c:c1:27:9f'
[1556222489.844432] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.844437] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.844441] [2048.1] [pid=14359] Processing: '$_HOSTPARENT_DNS$'
[1556222489.844447] [2048.1] [pid=14359]   Done.  Final output: 'HS-402-3560-01-1517'
[1556222489.844470] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.844474] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.844478] [2048.1] [pid=14359] Processing: '$USER1$/check_mac $ARG1$ $ARG2$'
[1556222489.844485] [2048.1] [pid=14359]   Done.  Final output: '/usr/local/nagios/libexec/check_mac a4:93:4c:c1:27:9f HS-402-3560-01-1517'
[1556222489.844490] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.884200] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.884213] [2048.1] [pid=14359] Processing: '$_HOSTMACADDRESS$'
[1556222489.884223] [2048.1] [pid=14359]   Done.  Final output: 'a4:93:4c:b2:57:83'
[1556222489.884227] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.884232] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.884236] [2048.1] [pid=14359] Processing: '$_HOSTPARENT_DNS$'
[1556222489.884242] [2048.1] [pid=14359]   Done.  Final output: 'ES-124B-2960X-01-132'
[1556222489.884246] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.919445] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.919467] [2048.1] [pid=14359] Processing: '$USER1$/check_mac $ARG1$ $ARG2$'
[1556222489.919483] [2048.1] [pid=14359]   Done.  Final output: '/usr/local/nagios/libexec/check_mac  '
[1556222489.919494] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.927713] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.927740] [2048.1] [pid=14359] Processing: '$USER1$/check_mac $ARG1$ $ARG2$'
[1556222489.927752] [2048.1] [pid=14359]   Done.  Final output: '/usr/local/nagios/libexec/check_mac  '
[1556222489.927758] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.927816] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.927823] [2048.1] [pid=14359] Processing: '$USER1$/check_mac $ARG1$ $ARG2$'
[1556222489.927828] [2048.1] [pid=14359]   Done.  Final output: '/usr/local/nagios/libexec/check_mac  '
[1556222489.927831] [2048.1] [pid=14359] **** END MACRO PROCESSING *************
[1556222489.939980] [2048.1] [pid=14359] **** BEGIN MACRO PROCESSING ***********
[1556222489.940013] [2048.1] [pid=14359] Processing: '$USER1$/check_mac $ARG1$ $ARG2$'
[1556222489.940026] [2048.1] [pid=14359]   Done.  Final output: '/usr/local/nagios/libexec/check_mac  '
[1556222489.940032] [2048.1] [pid=14359] **** END MACRO PROCESSING *************


It appears that it works sometimes and fails at other times. What would cause the inconsistency? I have verified that the host definitions are complete for the ones missing data.

Lastly, the services all recover fine from a HARD down state, but I am also using this as the host check command, and the hosts are all currently stuck in a 10/10 HARD state. I can run the checks from the CLI and, as mentioned, the service check runs without error. How do I reset the hosts' 10/10 HARD state so that they show OK again?
bmallett
 
Posts: 16
Joined: Fri Feb 22, 2019 10:28 am

Re: Host shows down but services are ok?

Postby cdienger » Fri Apr 26, 2019 2:32 pm

I'd be curious to see what's in /usr/local/nagios/var/objects.cache for the hosts and services with these checks, to make sure that all the variables are getting set correctly. I'd also try removing the hosts and services, and verify they're removed in the web UI before adding them back in.
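
For anyone wanting to script that check, here is a rough Python sketch that scans objects.cache-style text and lists hosts whose check_command is the bare command name with no `!` arguments. The function names (`parse_objects`, `hosts_without_args`) are made up for illustration, and the parser assumes the simple "directive value" layout shown elsewhere in this thread; it is not a complete Nagios config parser.

```python
import re

def parse_objects(text):
    """Return a list of (object_type, {directive: value}) pairs.

    Assumes the plain 'define <type> {' ... '}' layout of objects.cache,
    with one 'directive value' pair per line.
    """
    objects = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        start = re.match(r"define\s+(\w+)\s*\{", line)
        if start:
            current = (start.group(1), {})
            continue
        if line == "}":
            if current is not None:
                objects.append(current)
            current = None
            continue
        if current is not None:
            parts = line.split(None, 1)
            current[1][parts[0]] = parts[1] if len(parts) > 1 else ""
    return objects

def hosts_without_args(text, command="check_mac"):
    """Host names whose check_command is exactly `command` (no '!' arguments)."""
    return [
        directives.get("host_name")
        for obj_type, directives in parse_objects(text)
        if obj_type == "host" and directives.get("check_command", "") == command
    ]

# Small inline sample modelled on the definitions posted in this thread.
sample = """
define host {
   host_name   AD-02-rm02-storage
   check_command   check_mac
   }
define service {
   host_name   AD-02-rm02-storage
   check_command   check_mac!$_HOSTMACADDRESS$!$_HOSTPARENT_DNS$
   }
"""
print(hosts_without_args(sample))  # ['AD-02-rm02-storage']
```

To run it against a real file, read the contents of /usr/local/nagios/var/objects.cache into a string and pass that to `hosts_without_args`.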
cdienger
Support Tech
 
Posts: 3454
Joined: Tue Feb 07, 2017 11:26 am

Re: Host shows down but services are ok?

Postby bmallett » Fri Apr 26, 2019 2:45 pm

The HOST check shows failed but the service shows OK.

Here is a host:

Code: Select all
define host {
   host_name   AD-02-rm02-storage
   display_name   Admin Storage Room AP
   alias   Access Point in Administration Storage
   address   AD-02-rm02-storage
   parents   AD-02-3560-00-102
   check_period   24x7
   check_command   check_mac
   notification_period   24x7
   initial_state   o
   importance   0
   check_interval   2.000000
   retry_interval   1.000000
   max_check_attempts   10
   active_checks_enabled   1
   passive_checks_enabled   1
   obsess   1
   event_handler_enabled   1
   low_flap_threshold   0.000000
   high_flap_threshold   0.000000
   flap_detection_enabled   1
   flap_detection_options   a
   freshness_threshold   0
   check_freshness   0
   notification_options   a
   notifications_enabled   1
   notification_interval   30.000000
   first_notification_delay   0.000000
   stalking_options   n
   process_perf_data   1
   icon_image        Network-Access-Point.png
   icon_image_alt   Access Point
   vrml_image    Network-Access-Point.gd2
   retain_status_information   1
   retain_nonstatus_information   1
   _MACADDRESS   a4:93:4c:43:58:39
   _PARENT_DNS   AD-02-3560-00-102
   }


Here is that service:

Code: Select all
define service {
   host_name   AD-02-rm02-storage
   service_description   Get IP from MAC ADDRESS and Ping for AP Status
   check_period   24x7
   check_command   check_mac!$_HOSTMACADDRESS$!$_HOSTPARENT_DNS$
   notification_period   24x7
   initial_state   o
   importance   0
   check_interval   10.000000
   retry_interval   2.000000
   max_check_attempts   3
   is_volatile   1
   parallelize_check   1
   active_checks_enabled   1
   passive_checks_enabled   1
   obsess   1
   event_handler_enabled   1
   low_flap_threshold   0.000000
   high_flap_threshold   0.000000
   flap_detection_enabled   1
   flap_detection_options   a
   freshness_threshold   0
   check_freshness   0
   notification_options   a
   notifications_enabled   1
   notification_interval   30.000000
   first_notification_delay   0.000000
   stalking_options   n
   process_perf_data   1
   retain_status_information   1
   retain_nonstatus_information   1
   }



https://ibb.co/Y3zH6tB
bmallett
 
Posts: 16
Joined: Fri Feb 22, 2019 10:28 am

Re: Host shows down but services are ok?

Postby bmallett » Fri Apr 26, 2019 3:00 pm

I did add some "helpful" messages to the plugin to see where and what was failing. For the host check, it says the following:

No MAC Address supplied from HOST.


https://ibb.co/vwVvV0g



SERVICE:
https://ibb.co/kXFrRTs
bmallett
 
Posts: 16
Joined: Fri Feb 22, 2019 10:28 am

Re: Host shows down but services are ok?

Postby cdienger » Fri Apr 26, 2019 3:40 pm

Can you PM me a copy of the config (/usr/local/nagios/etc/)? We haven't been able to reproduce the problem, and I'd like to lab it up with your config.
cdienger
Support Tech
 
Posts: 3454
Joined: Tue Feb 07, 2017 11:26 am

Re: Host shows down but services are ok?

Postby cdienger » Tue Apr 30, 2019 3:29 pm

Received the data, and I think the problem here is that the host template's check_command passes no arguments, so $ARG1$ and $ARG2$ are empty when the host check runs. Edit /usr/local/nagios/etc/objects/templates.cfg and change:

Code: Select all
define host {
        name                            generic-access-point    ; The name of this host template
        use                             generic-host            ; Inherit default values from the generic-host template
        check_period                    24x7                    ; By default, switches are monitored round the clock
        check_interval                  2                       ; Switches are checked every 5 minutes
        retry_interval                  1                       ; Schedule host check retries at 1 minute intervals
        max_check_attempts              10                      ; Check each switch 10 times (max)
        check_command                   check_mac               ; Default command to check if access points are "alive"
        notification_period             24x7                    ; Send notifications at any time
#        notification_interval           30                      ; Resend notifications every 30 minutes
#        notification_options            d,r                     ; Only send notifications for specific host states
#        contact_groups                  admins                  ; Notifications get sent to the admins by default
        register                        0                       ; DON'T REGISTER THIS - ITS JUST A TEMPLATE
}


to:

Code: Select all
define host {
        name                            generic-access-point    ; The name of this host template
        use                             generic-host            ; Inherit default values from the generic-host template
        check_period                    24x7                    ; By default, switches are monitored round the clock
        check_interval                  2                       ; Switches are checked every 5 minutes
        retry_interval                  1                       ; Schedule host check retries at 1 minute intervals
        max_check_attempts              10                      ; Check each switch 10 times (max)
        check_command                   check_mac!$_HOSTMACADDRESS$!$_HOSTPARENT_DNS$               ; Default command to check if access points are "alive"
        notification_period             24x7                    ; Send notifications at any time
#        notification_interval           30                      ; Resend notifications every 30 minutes
#        notification_options            d,r                     ; Only send notifications for specific host states
#        contact_groups                  admins                  ; Notifications get sent to the admins by default
        register                        0                       ; DON'T REGISTER THIS - ITS JUST A TEMPLATE
}
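
To illustrate why the bare `check_mac` produced an empty command line, here is a simplified Python model of the expansion. It is only a sketch, not the actual Nagios Core code: `expand_command` is a hypothetical helper, and the rules it applies (an unset $ARGn$ becomes empty, an unrecognized macro is left as typed) are inferred from the debug logs earlier in this thread.

```python
import re

MACRO = re.compile(r"\$(\w+)\$")

def substitute(text, table):
    """Replace $NAME$ macros using `table`.

    Unset $ARGn$ macros become empty strings; any other unknown macro
    is left as typed (both behaviours inferred from the thread's logs).
    """
    def repl(match):
        name = match.group(1)
        if name in table:
            return table[name]
        if re.fullmatch(r"ARG\d+", name):
            return ""
        return match.group(0)
    return MACRO.sub(repl, text)

def expand_command(check_command, command_line, host_vars, user_macros):
    """Expand a check_command such as 'check_mac!a!b' into a command line."""
    _name, *args = check_command.split("!")
    macros = dict(user_macros)
    # Custom host variable _FOO is exposed as the macro $_HOSTFOO$.
    for key, value in host_vars.items():
        macros["_HOST" + key.lstrip("_").upper()] = value
    # Resolve macros inside each argument, then bind them to $ARG1$, $ARG2$, ...
    for index, arg in enumerate(args, start=1):
        macros["ARG%d" % index] = substitute(arg, macros)
    return substitute(command_line, macros)

host_vars = {"_MACADDRESS": "a4:93:4c:c1:27:9f", "_PARENT_DNS": "HS-402-3560-01-1517"}
user_macros = {"USER1": "/usr/local/nagios/libexec"}
line = "$USER1$/check_mac $ARG1$ $ARG2$"

# Host template before the fix: no arguments, so both ARG macros are empty.
print(repr(expand_command("check_mac", line, host_vars, user_macros)))
# After the fix: the custom host variables are passed through as arguments.
print(repr(expand_command("check_mac!$_HOSTMACADDRESS$!$_HOSTPARENT_DNS$",
                          line, host_vars, user_macros)))
```

The first call reproduces the `check_mac` followed by two spaces seen in the failing log entries, and the second reproduces the fully populated command line from the working service check.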
cdienger
Support Tech
 
Posts: 3454
Joined: Tue Feb 07, 2017 11:26 am

Re: Host shows down but services are ok?

Postby bmallett » Wed May 01, 2019 7:51 am

@cdienger

Correct. That resolved the issue. Thanks a bunch. I verified this this morning first thing.
bmallett
 
Posts: 16
Joined: Fri Feb 22, 2019 10:28 am

Re: Host shows down but services are ok?

Postby scottwilkerson » Wed May 01, 2019 8:43 am

great!

Locking thread
scottwilkerson
DevOps Engineer
 
Posts: 15398
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
