Nagios response time

billperrotta
Posts: 115
Joined: Fri Feb 21, 2014 11:44 am

Re: Nagios response time

Post by billperrotta »

Code: Select all

# Last Modified: 10-03-2007
#
# NOTES: This config file provides you with some example object definition
#        templates that are refered by other host, service, contact, etc.
#        definitions in other config files.
#
#        You don't need to keep these definitions in a separate file from your
#        other object definitions.  This has been done just to make things
#        easier to understand.
#
###############################################################################



###############################################################################
###############################################################################
#
# CONTACT TEMPLATES
#
###############################################################################
###############################################################################

# Generic contact definition template - This is NOT a real contact, just a template!

define contact{
        name                            generic-contact         ; The name of this contact template
        service_notification_period     24x7                    ; service notifications can be sent anytime
        host_notification_period        24x7                    ; host notifications can be sent anytime
        service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email ; send service notifications via email
        host_notification_commands      notify-host-by-email    ; send host notifications via email
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
        }




###############################################################################
###############################################################################
#
# HOST TEMPLATES
#
###############################################################################
###############################################################################

# Generic host definition template - This is NOT a real host, just a template!

define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
"templates.cfg" 211L, 11826C                                                                                                                                   54,1           1%
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Nagios response time

Post by abrist »

Hmm, either the output is truncated, or you have another file with the generic-service declaration. Can you send us a listing of your Nagios object directory?
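
One quick way to see which file declares a template is to search the object directory for its `name` line. A minimal sketch, assuming the object configs live under /usr/local/nagios/etc/objects (a common source-install path, not confirmed for this system); `find_template` is a hypothetical helper name:

```shell
# Hypothetical helper: list every file and line that declares a template.
# $1 = template name, $2 = directory to search recursively.
find_template() {
    grep -rnE "^[[:space:]]*name[[:space:]]+$1([[:space:];]|\$)" "$2"
}

# Assumed default path; override OBJECTS_DIR if your layout differs.
OBJECTS_DIR="${OBJECTS_DIR:-/usr/local/nagios/etc/objects}"
# Example: find_template generic-service "$OBJECTS_DIR"
```

Matching on the `name` directive rather than the bare string skips the many `use generic-service` lines, which only reference the template.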
billperrotta
Posts: 115
Joined: Fri Feb 21, 2014 11:44 am

Re: Nagios response time

Post by billperrotta »

You mean an ls listing inside the directory?

commands.cfg contacts.cfg groups.cfg nrpe_check_control.cfg sonicwalls templates.cfg windows.cfg
computers devices localhost.cfg printer.cfg switch.cfg timeperiods.cfg
lmiltchev
Former Nagios Staff
Posts: 13587
Joined: Mon May 23, 2011 12:15 pm

Re: Nagios response time

Post by lmiltchev »

Can you double-check - is this ALL you have for the "generic-host" definition in the "templates.cfg"?

Code: Select all

define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
"templates.cfg" 211L, 11826C                                                                                                                                   54,1           1%
billperrotta
Posts: 115
Joined: Fri Feb 21, 2014 11:44 am

Re: Nagios response time

Post by billperrotta »

Looks like it.

Code: Select all

vi templates.cfg
  # Last Modified: 10-03-2007
  #
  # NOTES: This config file provides you with some example object definition
  #        templates that are refered by other host, service, contact, etc.
  #        definitions in other config files.
  #
  #        You don't need to keep these definitions in a separate file from your
  #        other object definitions.  This has been done just to make things
  #        easier to understand.
  #
  ###############################################################################



  ###############################################################################
9>###############################################################################
  #
  # CONTACT TEMPLATES
2>#
  ###############################################################################
  ###############################################################################

  # Generic contact definition template - This is NOT a real contact, just a template!

  define contact{
          name                            generic-contact         ; The name of this contact template
          service_notification_period     24x7                    ; service notifications can be sent anytime
          host_notification_period        24x7                    ; host notifications can be sent anytime
          service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
          host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
          service_notification_commands   notify-service-by-email ; send service notifications via email
          host_notification_commands      notify-host-by-email    ; send host notifications via email
          register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
          }




  ###############################################################################
  ###############################################################################
  #
  # HOST TEMPLATES
  #
  ###############################################################################
  ###############################################################################

  # Generic host definition template - This is NOT a real host, just a template!
1>
(>define host{
          name                            generic-host    ; The name of this host template
0>        notifications_enabled           1               ; Host notifications are enabled
"templates.cfg" 211L, 11826C         
I inherited this installation, so your guess is as good as mine. I can open whatever else you would like to view in vi.

Is there a default interval if none are defined?
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Nagios response time

Post by Box293 »

It looks like the templates.cfg file is corrupted: I am seeing characters that should not be there, and I cannot see the closing brace } that ends the generic-host definition.
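
A truncated definition can be spotted by comparing the number of opening and closing braces in the file on disk (as opposed to what the editor shows on screen). A rough sketch, assuming one brace per line as in the stock Nagios sample files; `check_braces` is a hypothetical helper:

```shell
# Hypothetical helper: compare counts of lines with '{' and '}'.
# $1 = path to a Nagios object config file.
check_braces() {
    # Drop everything from ';' or '#' onward so braces in comments are ignored.
    stripped=$(sed -e 's/[;#].*$//' "$1")
    opens=$(printf '%s\n' "$stripped" | grep -c '{' || true)
    closes=$(printf '%s\n' "$stripped" | grep -c '}' || true)
    echo "open=$opens close=$closes"
    # Succeed only when every opening brace has a closing one.
    [ "$opens" -eq "$closes" ]
}

# Example: check_braces templates.cfg
```

If the counts match, the file itself is probably intact and the oddities are more likely an artifact of how the screen contents were copied.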

Attached is a copy of the templates.cfg file from my Nagios 4.0.x server.
templates.cfg
(10.37 KiB) Downloaded 273 times
billperrotta wrote: Is there a default interval if none are defined?
If no interval is found for a host or service definition, Nagios will complain when it verifies the configuration. For example:

I created this host:

Code: Select all

define host{
        use             generic-host  ; Inherit default values from a template
        host_name       test-generic   ; The name we're giving to this host
        alias           test-generic   ; A longer name associated with the host
        address         192.168.1.1   ; IP address of the host
        }
And then I verified the configuration:

Code: Select all

[nagios@nagioscore4-0-x root]$ /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Nagios Core 4.0.7
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 06-03-2014
License: GPL

Website: http://www.nagios.org
Reading configuration data...
   Read main config file okay...
Error: Invalid max_check_attempts value for host 'test-generic'
Error: Could not register host (config file '/usr/local/nagios/etc/objects/windows.cfg', starting on line 31)
   Error processing object config files!


***> One or more problems was encountered while processing the config files...

     Check your configuration file(s) to ensure that they contain valid
     directives and data defintions.  If you are upgrading from a previous
     version of Nagios, you should be aware that some variables/definitions
     may have been removed or modified in this version.  Make sure to read
     the HTML documentation regarding the config files, as well as the
     'Whats New' section to find out what has changed.
billperrotta
Posts: 115
Joined: Fri Feb 21, 2014 11:44 am

Re: Nagios response time

Post by billperrotta »

I'm lost. What service would I define to make Nagios check for downed servers more frequently?

If it is corrupted, how do I go about recreating it? Can you give me a sample file I can edit?

I have problems copying the full contents of a file through PuTTY. I seem to be able to copy only what is on the screen.

I wish there was an easier way to copy a file to my Windows box than logging into the Linux box directly with a GUI.

see example below

Code: Select all

define service{
	name				local-service 		; The name of this service template
	use				generic-service		; Inherit default values from the generic-service definition
        max_check_attempts              4			; Re-check the service up to 4 times in order to determine its final (hard) state
        normal_check_interval           5			; Check the service every 5 minutes under normal conditions
        retry_check_interval            1			; Re-check the service every minute until a hard state can be determined
        register                        0       		; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
	}
billperrotta
Posts: 115
Joined: Fri Feb 21, 2014 11:44 am

Re: Nagios response time

Post by billperrotta »

Here I went from the box directly to the internet so you can download my templates.cfg directly and analyze it. Maybe then you can help me change my check interval?
Attachments
templates.cfg
(11.55 KiB) Downloaded 247 times
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: Nagios response time

Post by sreinhardt »

That is exactly what we needed. So here is what I see.

Templates.cfg defines:

Code: Select all

define service{
        name                            generic-service 	; The 'name' of this service template
        active_checks_enabled           1       		; Active service checks are enabled
        passive_checks_enabled          1    		   	; Passive service checks are enabled/accepted
        parallelize_check               1       		; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1       		; We should obsess over this service (if necessary)
        check_freshness                 0       		; Default is to NOT check service 'freshness'
        notifications_enabled           1       		; Service notifications are enabled
        event_handler_enabled           1       		; Service event handler is enabled
        flap_detection_enabled          1       		; Flap detection is enabled
        failure_prediction_enabled      1       		; Failure prediction is enabled
        process_perf_data               1       		; Process performance data
        retain_status_information       1       		; Retain status information across program restarts
        retain_nonstatus_information    1       		; Retain non-status information across program restarts
        is_volatile                     0       		; The service is not volatile
        check_period                    24x7			; The service can be checked at any time of the day
        max_check_attempts              3			; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10			; Check the service every 10 minutes under normal conditions
        retry_check_interval            2			; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins			; Notifications get sent out to everyone in the 'admins' group
	notification_options		w,c,r			; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60			; Re-notify about service problems every hour
        notification_period             24x7			; Notifications can be sent out at any time
         register                        0       		; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }
This is the template that your initial example, check_http on an Exchange server, uses to fill in any config options that are not configured directly there. The main settings for your present question are normal_check_interval (which really should be check_interval), retry_check_interval, and max_check_attempts. Between these three settings, a service can take a maximum of about 16 minutes and a minimum of about 6 minutes before it is set to a hard warning or critical state. Here is why:

normal_check_interval = 10 - while in a hard state, Nagios checks only once every 10 minutes
max_check_attempts = 3 - when an issue is detected, after the initial check Nagios will check 3 more times before declaring a hard state
retry_check_interval = 2 - this changes the check interval (normally 10 minutes) to 2 minutes when a soft state change is detected

With all that, 10 minute normal check interval + (3 retries * 2 minutes per retry) = 16 minutes of possible downtime before a hard state is reached and a notification is sent.
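
The arithmetic above can be reproduced with a few lines of shell. This is only a sketch using the values from the quoted generic-service template and the retry count as described in this post; the variable names are illustrative. If your Nagios version counts the initial check as one of the max_check_attempts, set `retries` to one less than that value, which shortens both figures.

```shell
# Values taken from the generic-service template quoted above.
normal_check_interval=10   # minutes between checks while in a hard state
retry_check_interval=2     # minutes between rechecks while in a soft state
retries=3                  # rechecks before a hard state, as counted in this post

# Worst case: the outage begins right after a scheduled check has passed.
worst_case=$(( normal_check_interval + retries * retry_check_interval ))
# Best case: the outage begins just before a scheduled check runs.
best_case=$(( retries * retry_check_interval ))

echo "worst case: ${worst_case} min, best case: ${best_case} min"
```

Changing the three variables is a quick way to see how a proposed check_interval or max_check_attempts would affect the delay before a notification.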

You can simply change this in your templates.cfg; my suggestion would be to reduce the normal check interval to 5 minutes or so. However, if you change it there, it will likely be a global modification for all services. I say this based on other configs I have seen throughout this and other posts. If you do not want the change to be global, you can add any or all of these config options to individual service definitions, such as the HTTP service for the exchange-servers host group, like so:

Original:

Code: Select all

define service{
        use                     generic-service
        hostgroup_name          exchange-servers
        service_description     HTTP
        check_command           check_http
        notification_interval   30
}
Modified:

Code: Select all

define service{
        use                     generic-service
        hostgroup_name          exchange-servers
        service_description     HTTP
        check_command           check_http
        notification_interval   30
        check_interval          5
        max_check_attempts      1
}
These changes would override the generic-service template and allow the HTTP checks to run with a 5 minute check interval and a single max check attempt, bringing your max "down-before-notified" time to 6-7 minutes. This would also affect only the HTTP services within the exchange-servers group, giving you much more fine-grained control over what is checked when.
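
After editing a service definition, it is worth re-running the verification shown earlier in the thread before restarting Nagios. A sketch, assuming the binary and config paths from that earlier output; `verify_then` is a hypothetical wrapper, and the restart command in the usage comment is an assumption that varies by distribution:

```shell
# Paths as they appear in the verification output earlier in the thread;
# override them if your install differs.
NAGIOS_BIN="${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}"
NAGIOS_CFG="${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}"

# Hypothetical wrapper: run the given command only if the config verifies.
verify_then() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null; then
        "$@"
    else
        echo "verification failed; not running: $*" >&2
        return 1
    fi
}

# Usage (restart command is an assumption for your system):
# verify_then service nagios restart
```

Running the verification first avoids restarting into a configuration that Nagios will refuse to load.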