Nagios reporting host down but it's not?
Posted: Thu Feb 13, 2020 11:04 am
I have been getting a few issues with Nagios sending an email saying a server is down when it actually is not; I am able to log in to it. I am thinking this may have to do with my configuration, but I am not sure. I am using ncpa.cfg and I have a Ping service set up there. I am pretty sure it is just the default, which is:
I did find a forum post where someone suggested looking at the command.cfg file and said it should look like this:
This is exactly how mine looks. I don't know if I need to change any values in one or both of these spots.
I have also been seeing some weird behavior in the Nagios UI, where the host status shows as down but all of its services are OK. I have seen this several times and tried reloading the Nagios service and the httpd service, but that does not seem to fix it. It does eventually go away on its own and the status changes to up, but that sometimes takes a day or so.
Code (Ping service definition):
define service {
    host_name               Svr-Data
    service_description     Ping
    check_command           check_ping!60.0,5%!100.0,10%
    max_check_attempts      5
    check_interval          5
    retry_interval          1
    check_period            24x7
    notification_interval   60
    notification_period     24x7
    contact_groups          admins2
    register                1
}
Code (check-host-alive command definition):
define command {
    command_name    check-host-alive
    command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
}
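To check that I am reading the two sets of thresholds correctly, here is a rough Python sketch of how I understand check_ping to treat them. Each threshold is a pair of round-trip time in milliseconds and packet loss in percent. The function names, the sample measurement, and the "at or above the threshold" comparison are my own assumptions for illustration, not something taken from the plugin itself.

```python
def parse_threshold(text):
    """Split a check_ping threshold such as '3000.0,80%' into (rta_ms, loss_pct)."""
    rta, loss = text.split(",")
    return float(rta), float(loss.rstrip("%"))


def ping_state(rta_ms, loss_pct, warn, crit):
    """Return OK, WARNING or CRITICAL for one measurement.

    Assumption: a threshold trips when the measured value is at or above it.
    """
    warn_rta, warn_loss = parse_threshold(warn)
    crit_rta, crit_loss = parse_threshold(crit)
    if rta_ms >= crit_rta or loss_pct >= crit_loss:
        return "CRITICAL"
    if rta_ms >= warn_rta or loss_pct >= warn_loss:
        return "WARNING"
    return "OK"


# Hypothetical measurement: 120 ms round trip, 20% packet loss.
sample = (120.0, 20.0)

# Thresholds copied from the Ping service and check-host-alive definitions.
service_state = ping_state(*sample, "60.0,5%", "100.0,10%")
host_state = ping_state(*sample, "3000.0,80%", "5000.0,100%")
print(service_state, host_state)
```

If that reading is right, the Ping service thresholds are far stricter than the check-host-alive ones, so the same measurement can put the service in a bad state while the host check still passes.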