Nagios Core 4.3.4 Host frequently up/down
- sandeepatil
- Posts: 211
- Joined: Tue Dec 27, 2016 3:12 am
We installed Nagios Core 4.3.4 and configured 380 hosts with a passive agent.
We observe that hosts automatically go down after 10 minutes and come back up after 10 seconds.
Please help us solve this.
- scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Can you share your hosts config?
Are you sending host results?
How frequently are you sending host results?
How are you sending them?
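For context on the last two questions: one common way to send a passive host result is to write a PROCESS_HOST_CHECK_RESULT external command to the Nagios command file. The sketch below only builds that command line. The function name format_host_result, the sample output text, and the command file path (the usual location for a source install) are assumptions for illustration, not details taken from this thread.

```shell
#!/bin/sh
# Hypothetical helper: build a PROCESS_HOST_CHECK_RESULT external command line.
# Arguments: timestamp (epoch seconds), host name, status code, plugin output.
# Host status codes: 0 = UP, 1 = DOWN, 2 = UNREACHABLE.
format_host_result() {
    printf '[%s] PROCESS_HOST_CHECK_RESULT;%s;%s;%s\n' "$1" "$2" "$3" "$4"
}

# Assumed default command file for a source install; adjust it to match the
# command_file setting in nagios.cfg.
CMD_FILE="${CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"

# Only write when the command file (a named pipe) actually exists.
if [ -p "$CMD_FILE" ]; then
    format_host_result "$(date +%s)" "core_abc.com" 0 "Heartbeat received" > "$CMD_FILE"
fi
```

How often an agent sends a line like this matters, because it has to arrive more often than any freshness threshold set on the host.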
- sandeepatil
In the host alert history we found this error:
Caught SIGTERM, shutting down.
Please explain how to resolve this.
- scottwilkerson
Can you answer any of these questions?
scottwilkerson wrote: Can you share your hosts config?
Are you sending host results?
How frequently are you sending host results?
How are you sending them?
sandeepatil wrote: Caught SIGTERM, shutting down.
A line like this in the nagios.log or system log is normal to see every time Nagios restarts:
[1541012814] Caught SIGTERM, shutting down...
- sandeepatil
Hosts config:
define host{
host_name core_abc.com
use passive_host
address core_abc.com
event_handler Trigger
register 1
}
We have configured the hosts with passive checks.
templates.cfg:
define contact{
name generic-contact ; The name of this contact template
service_notification_period 24x7 ; service notifications can be sent anytime
host_notification_period 24x7 ; host notifications can be sent anytime
service_notification_options w,u,c,r ; send notifications for all service states
host_notification_options d,u ; send notifications for DOWN and UNREACHABLE host states
host_notification_commands host_trap_command,host_trap_command_2 ; send host notifications via email
service_notification_commands service_trap_command,service_trap_command_2 ; send service notifications via email
contact_groups admins
register 0 ; JUST A TEMPLATE!
}
###############################################################################
###############################################################################
#
# HOST TEMPLATES
#
###############################################################################
###############################################################################
# Generic host definition template - This is NOT a real host, just a template!
define host{
name generic-host ; The name of this host template
use host-pnp
notifications_enabled 1 ; Host notifications are enabled
event_handler_enabled 1 ; Host event handler is enabled
flap_detection_enabled 0 ; Flap detection is disabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
notification_period 24x7 ; Send host notifications at any time
contact_groups admins
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}
###############################################################################
define host{
name linux-server ; The name of this host template
use generic-host ; This template inherits other values from the generic-host template
check_period 24x7 ; By default, Linux hosts are checked round the clock
check_interval 5 ; Actively check the host every 5 minutes
retry_interval 1 ; Schedule host check retries at 1 minute intervals
max_check_attempts 10 ; Check each Linux host 10 times (max)
check_command check-host-alive ; Default command to check Linux hosts
notification_period workhours ; Linux admins hate to be woken up, so we only notify during the day
; Note that the notification_period variable is being overridden from
; the value that is inherited from the generic-host template!
notification_interval 120 ; Resend notifications every 2 hours
notification_options d,u,r ; Only send notifications for specific host states
contact_groups admins ; Notifications get sent to the admins by default
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}
###############################################################################
#
# CORE PASSIVE HOST TEMPLATE
#
###############################################################################
define host{
name core_passive_host
check_command core_check_dummy!2!"Missing NagiosAgent Heartbeat from $HOSTNAME$ | AVAILABILITY DOWN=2"!!!!!!
use generic-host
max_check_attempts 999999
check_interval 30
retry_interval 30
active_checks_enabled 1
passive_checks_enabled 1
check_freshness 1
freshness_threshold 600
register 0
}
##########################################################################
# SERVICE TEMPLATE
#########################################################################
define service{
name local-service ; The name of this service template
use core_generic_service ; Inherit default values from the generic-service definition
max_check_attempts 4 ; Re-check the service up to 4 times in order to determine its final (hard) state
check_interval 5 ; Check the service every 5 minutes under normal conditions
retry_interval 1 ; Re-check the service every minute until a hard state can be determined
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
}
##########################################################################
define service {
name core_generic_service
use service-pnp
is_volatile 0
max_check_attempts 3
check_interval 10
retry_interval 2
active_checks_enabled 0
passive_checks_enabled 1
check_period 24x7
parallelize_check 1
obsess_over_service 1
check_freshness 0
event_handler_enabled 1
flap_detection_enabled 0
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
notification_interval 60
notification_period 24x7
notification_options w,u,c,r
notifications_enabled 1
contact_groups admins
register 0
}
define service {
name core_passive_service
service_description Passive Service
use core_generic_service
check_command core_check_dummy!0!"No data received yet."!!!!!!
is_volatile 1
initial_state o
max_check_attempts 1
active_checks_enabled 0
passive_checks_enabled 1
flap_detection_enabled 0
stalking_options o,w,c,u,
register 0
}
This is from nagios.log. It shows up in the alert history without anyone restarting the Nagios service:
Caught SIGTERM, shutting down...
- scottwilkerson
If you are seeing this in the nagios.log
Caught SIGTERM, shutting down...
then something or someone is in fact restarting the Nagios service.
You can test this yourself: open two terminals, and in one run the following command
tail -f /usr/local/nagios/var/nagios.log|grep SIGTERM
In the other, restart the Nagios service
service nagios restart
As soon as you run the command in the second terminal, you will see the logged line in the first terminal.
You would see the same message if someone or something just stopped the service
service nagios stop
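To see whether past shutdowns follow a pattern, the SIGTERM lines already in the log can be pulled out and the gaps between them computed. This is a minimal sketch that assumes the log line format shown above, an epoch timestamp in square brackets followed by the message. The function names sigterm_times and sigterm_gaps are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: read a Nagios log on stdin and print the epoch
# timestamp of every "Caught SIGTERM" line, one per line.
sigterm_times() {
    sed -n 's/^\[\([0-9][0-9]*\)\] Caught SIGTERM.*/\1/p'
}

# Hypothetical helper: print the number of seconds between consecutive
# SIGTERM entries, one gap per line.
sigterm_gaps() {
    sigterm_times | awk 'NR > 1 { print $1 - prev } { prev = $1 }'
}

# Log path as used in the tail command above; skipped if not readable.
LOG="${LOG:-/usr/local/nagios/var/nagios.log}"
if [ -r "$LOG" ]; then
    sigterm_gaps < "$LOG"
fi
```

Evenly spaced gaps would point toward a cron job or a script rather than someone restarting the service by hand.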
- sandeepatil
OK, we found a script that was restarting the Nagios service.
What about the hosts going up/down?
- scottwilkerson
scottwilkerson wrote: Can you share your hosts config?
Are you sending host results?
How frequently are you sending host results?
How are you sending them?
We would need answers to the above questions.
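The frequency matters because of the freshness settings in the posted core_passive_host template: check_freshness is 1 and freshness_threshold is 600. When no passive result arrives within that window, Nagios runs the check_command itself, and core_check_dummy!2 in that template presumably returns a DOWN result. That would be consistent with hosts dropping after about 10 minutes and recovering on the next heartbeat, although nothing in this thread confirms it as the cause. The sketch below is a simplification of the idea, with a hypothetical function name is_stale and made-up timestamps. Nagios's own calculation includes extra allowances.

```shell
#!/bin/sh
# Hypothetical helper mirroring the idea of a freshness check:
# a result is stale once its age exceeds the threshold.
# Arguments: epoch of last passive result, current epoch, threshold in seconds.
is_stale() {
    age=$(( $2 - $1 ))
    [ "$age" -gt "$3" ]
}

# Example using the 600-second threshold from the posted template and an
# assumed heartbeat that last arrived 610 seconds before "now".
if is_stale 1541012204 1541012814 600; then
    echo "stale: Nagios would run the host check_command"
else
    echo "fresh"
fi
```

If the agent's send interval is close to or longer than the threshold, either sending more often or raising freshness_threshold would be the settings to look at.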
- sandeepatil
We made a change in templates.cfg that resolved this issue.
Please close this thread.
- scottwilkerson
sandeepatil wrote: We made a change in templates.cfg that resolved this issue.
Please close this thread.
Great, glad to hear it is resolved.
Locking