Nagios Core 4.3.4 Host frequently up/down

sandeepatil
Posts: 211
Joined: Tue Dec 27, 2016 3:12 am

Nagios Core 4.3.4 Host frequently up/down

Post by sandeepatil »

We installed Nagios Core 4.3.4 and configured 380 hosts with a passive agent.

We observe that hosts automatically go down after 10 minutes and come back up after 10 seconds.

Please help us solve this.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Nagios Core 4.3.4 Host frequently up/down

Post by scottwilkerson »

Can you share your hosts config?

Are you sending host results?

How frequently are you sending host results?

How are you sending them?
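
For reference when answering the last two questions: one common way to send passive host results is the PROCESS_HOST_CHECK_RESULT external command, written to the Nagios command file. The following is a minimal Python sketch of building such a line, not a description of this particular setup. The helper names `format_host_result` and `submit_host_result` are hypothetical, and the command file path is an assumption (it is the usual default for a source install and may differ on your system).

```python
import time

# Assumed default command file location for a source install; adjust it to
# the command_file value in your nagios.cfg.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def format_host_result(host_name, status_code, plugin_output, timestamp=None):
    """Build one PROCESS_HOST_CHECK_RESULT external command line.

    status_code: 0 = UP, 1 = DOWN, 2 = UNREACHABLE.
    """
    if timestamp is None:
        timestamp = int(time.time())
    return "[%d] PROCESS_HOST_CHECK_RESULT;%s;%d;%s\n" % (
        timestamp, host_name, status_code, plugin_output)


def submit_host_result(line, command_file=COMMAND_FILE):
    """Write the command line to the Nagios command file (a named pipe)."""
    with open(command_file, "w") as pipe:
        pipe.write(line)
```

Knowing how often a line like this reaches Nagios for each host is what the "how frequently" question is about.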
Former Nagios employee
Creator:
ahumandesign.com
enneagrams.com

Post by sandeepatil »

In the host alert history we found this error:

Caught SIGTERM, shutting down.

Please explain what this means and how to resolve it.

Post by scottwilkerson »

Can you answer any of these questions?
scottwilkerson wrote:Can you share your hosts config?

Are you sending host results?

How frequently are you sending host results?

How are you sending them?
sandeepatil wrote:Caught SIGTERM, shutting down.
A line like this in the nagios.log or system log is normal to see every time Nagios restarts:

[1541012814] Caught SIGTERM, shutting down...

Post by sandeepatil »

hosts config:
define host{
    host_name       core_abc.com
    use             passive_host
    address         core_abc.com
    event_handler   Trigger
    register        1
}

templates.cfg:
define contact{
    name                            generic-contact  ; The name of this contact template
    service_notification_period     24x7  ; service notifications can be sent anytime
    host_notification_period        24x7  ; host notifications can be sent anytime
    service_notification_options    w,u,c,r  ; send notifications for all service states
    host_notification_options       d,u  ; send notifications for down and unreachable host states
    host_notification_commands      host_trap_command,host_trap_command_2  ; commands used to send host notifications
    service_notification_commands   service_trap_command,service_trap_command_2  ; commands used to send service notifications
    contact_groups                  admins
    register                        0  ; JUST A TEMPLATE!
}

###############################################################################
###############################################################################
#
# HOST TEMPLATES
#
###############################################################################
###############################################################################

# Generic host definition template - This is NOT a real host, just a template!

define host{
    name                            generic-host  ; The name of this host template
    use                             host-pnp
    notifications_enabled           1  ; Host notifications are enabled
    event_handler_enabled           1  ; Host event handler is enabled
    flap_detection_enabled          0  ; Flap detection is disabled
    process_perf_data               1  ; Process performance data
    retain_status_information       1  ; Retain status information across program restarts
    retain_nonstatus_information    1  ; Retain non-status information across program restarts
    notification_period             24x7  ; Send host notifications at any time
    contact_groups                  admins
    register                        0  ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}


###############################################################################

define host{
    name                    linux-server  ; The name of this host template
    use                     generic-host  ; This template inherits other values from the generic-host template
    check_period            24x7  ; By default, Linux hosts are checked round the clock
    check_interval          5  ; Actively check the host every 5 minutes
    retry_interval          1  ; Schedule host check retries at 1 minute intervals
    max_check_attempts      10  ; Check each Linux host 10 times (max)
    check_command           check-host-alive  ; Default command to check Linux hosts
    notification_period     workhours  ; Linux admins hate to be woken up, so we only notify during the day
                                       ; Note that the notification_period variable is being overridden from
                                       ; the value that is inherited from the generic-host template!
    notification_interval   120  ; Resend notifications every 2 hours
    notification_options    d,u,r  ; Only send notifications for specific host states
    contact_groups          admins  ; Notifications get sent to the admins by default
    register                0  ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}






###############################################################################
#
# CORE PASSIVE HOST TEMPLATE
#
###############################################################################


define host{
    name                    core_passive_host
    check_command           core_check_dummy!2!"Missing NagiosAgent Heartbeat from $HOSTNAME$ | AVAILABILITY DOWN=2"!!!!!!
    use                     generic-host
    max_check_attempts      999999
    check_interval          30
    retry_interval          30
    active_checks_enabled   1
    passive_checks_enabled  1
    check_freshness         1
    freshness_threshold     600
    register                0
}






##########################################################################
# SERVICE TEMPLATE
#########################################################################
define service{
    name                local-service  ; The name of this service template
    use                 core_generic_service  ; Inherit default values from the generic-service definition
    max_check_attempts  4  ; Re-check the service up to 4 times in order to determine its final (hard) state
    check_interval      5  ; Check the service every 5 minutes under normal conditions
    retry_interval      1  ; Re-check the service every minute until a hard state can be determined
    register            0  ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
}





##########################################################################

define service {
    name                            core_generic_service
    use                             service-pnp
    is_volatile                     0
    max_check_attempts              3
    check_interval                  10
    retry_interval                  2
    active_checks_enabled           0
    passive_checks_enabled          1
    check_period                    24x7
    parallelize_check               1
    obsess_over_service             1
    check_freshness                 0
    event_handler_enabled           1
    flap_detection_enabled          0
    process_perf_data               1
    retain_status_information       1
    retain_nonstatus_information    1
    notification_interval           60
    notification_period             24x7
    notification_options            w,u,c,r
    notifications_enabled           1
    contact_groups                  admins
    register                        0
}





define service {
    name                    core_passive_service
    service_description     Passive Service
    use                     core_generic_service
    check_command           core_check_dummy!0!"No data received yet."!!!!!!
    is_volatile             1
    initial_state           o
    max_check_attempts      1
    active_checks_enabled   0
    passive_checks_enabled  1
    flap_detection_enabled  0
    stalking_options        o,w,c,u,
    register                0
}
We have configured the hosts with passive checks.

This line appears in nagios.log and in the alert history even though we are not restarting the Nagios service:
Caught SIGTERM, shutting down...
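
One possible reading of the core_passive_host template, offered as a hedged sketch rather than a diagnosis: with check_freshness 1 and freshness_threshold 600, a host whose last result is older than 600 seconds is treated as stale, and Nagios then runs the host's check_command. Assuming core_check_dummy simply returns the state it is given (2 here), a stale host would be reported down, which would line up with hosts going down after about 10 minutes and recovering when the next passive result arrives. The function names `is_stale` and `stale_gaps` are hypothetical, and the model ignores details such as extra freshness latency and restart handling.

```python
FRESHNESS_THRESHOLD = 600  # seconds, taken from the core_passive_host template


def is_stale(last_result_time, now, threshold=FRESHNESS_THRESHOLD):
    """Return True when the newest result is older than the threshold."""
    return (now - last_result_time) > threshold


def stale_gaps(result_times, threshold=FRESHNESS_THRESHOLD):
    """Return (previous, next) pairs of result times spaced further apart
    than the threshold, i.e. windows in which the host could go stale."""
    ordered = sorted(result_times)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > threshold]
```

Feeding `stale_gaps` the arrival times of passive results for one host would show whether the agent's reporting interval ever exceeds the threshold.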

Post by scottwilkerson »

If you are seeing this in the nagios.log

Caught SIGTERM, shutting down...

something or someone is in fact restarting the nagios service.

You can test this yourself: open two terminals, and in the first run the following command

tail -f /usr/local/nagios/var/nagios.log | grep SIGTERM

In the second, restart the nagios service

service nagios restart

As soon as you run the command in the second terminal you will see the logged line in the first terminal.

You would see the same message if someone or something just stopped the service

service nagios stop
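
To see how often this happens, the same log can also be scanned after the fact. The sketch below assumes the `[epoch] message` line format shown earlier in this thread; the function names `sigterm_times`, `restart_intervals` and `to_utc` are hypothetical.

```python
import re
from datetime import datetime, timezone

# Matches lines such as "[1541012814] Caught SIGTERM, shutting down..."
SIGTERM_LINE = re.compile(r"^\[(\d+)\] Caught SIGTERM, shutting down")


def sigterm_times(lines):
    """Return the epoch timestamp of every SIGTERM shutdown line."""
    stamps = []
    for line in lines:
        match = SIGTERM_LINE.match(line)
        if match:
            stamps.append(int(match.group(1)))
    return stamps


def restart_intervals(stamps):
    """Return the number of seconds between consecutive shutdowns."""
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


def to_utc(stamp):
    """Convert an epoch timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(stamp, tz=timezone.utc)
```

Passing the open log file to `sigterm_times` and then the result to `restart_intervals` shows whether the shutdowns follow a regular schedule, which usually points to a cron job or script.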

Post by sandeepatil »

OK, we found a script that was restarting the nagios service.

What about the hosts going up/down?

Post by scottwilkerson »

scottwilkerson wrote:Can you share your hosts config?

Are you sending host results?

How frequently are you sending host results?

How are you sending them?
We would need answers to the above questions

Post by sandeepatil »

We made a change in templates.cfg that resolved this issue.

Please close this thread.

Post by scottwilkerson »

sandeepatil wrote:We made a change in templates.cfg that resolved this issue.

Please close this thread.
great, glad to hear it is resolved.

Locking