Nagios reporting host down but its not?

Posted: Thu Feb 13, 2020 11:04 am
by Alan
I have been getting emails from Nagios saying a server is down when it actually is not; I am able to log in to it. I think this may have to do with my configuration, but I am not sure. I am using ncpa.cfg and have a Ping service set up there. I am pretty sure it is just the default, which is:

Code:

define service {
    host_name               Svr-Data
    service_description     Ping
    check_command           check_ping!60.0,5%!100.0,10%
    max_check_attempts      5
    check_interval          5
    retry_interval          1
    check_period            24x7
    notification_interval   60
    notification_period     24x7
    contact_groups          admins2
    register                1
}
I did find a forum post where someone suggested looking at the command.cfg file and said it should look like this:

Code:

define command{
command_name check-host-alive
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
}
This is exactly how mine looks. I don't know if I need to change any values in one or both of these places.

I have also been seeing a weird issue in the Nagios UI where the host status is down but all of its services are OK. I have seen this several times and tried reloading the Nagios service and the httpd service, but that does not seem to fix it. It does eventually go away on its own and change to status up, but it sometimes takes a day or so.

Re: Nagios reporting host down but its not?

Posted: Thu Feb 13, 2020 1:28 pm
by scottwilkerson
You show the configuration for a service, but what is the configuration for the host? It is the check_command in the host definition that determines whether the host is marked down.

Re: Nagios reporting host down but its not?

Posted: Fri Feb 14, 2020 12:08 pm
by Alan
Here is the host setting.

Code:

define host {
    host_name               Svr-Data
    address                 172.16.10.4
    hostgroups              VMs, physical_VMs
    check_command           check_ncpa!-t 'PublicS' -P 5693 -M system/agent_version
    max_check_attempts      5
    check_interval          5
    retry_interval          1
    check_period            24x7
    contact_groups          admins2, calls
    notification_interval   60
    notification_period     24x7
    notifications_enabled   1
    notification_options    d,u,r
    icon_image              ncpa.png
    statusmap_image         ncpa.png
    register                1
}
Is this what you are wanting?

Re: Nagios reporting host down but its not?

Posted: Fri Feb 14, 2020 12:21 pm
by scottwilkerson
Alan wrote: Is this what you are wanting?
Yes, and the Svr-Data shows down in the UI?

What is the output shown on the host status page?

Re: Nagios reporting host down but its not?

Posted: Fri Feb 14, 2020 1:28 pm
by Alan
This is the Svr-Data Nagios UI.

Re: Nagios reporting host down but its not?

Posted: Fri Feb 14, 2020 2:20 pm
by scottwilkerson
Click on Svr-Data under the Host column to see the host status detail page.

Re: Nagios reporting host down but its not?

Posted: Tue Feb 18, 2020 12:54 pm
by Alan
Sorry, here is the Host State Information for Svr-Data.

Re: Nagios reporting host down but its not?

Posted: Tue Feb 18, 2020 1:11 pm
by scottwilkerson
That host is reporting UP

Re: Nagios reporting host down but its not?

Posted: Tue Feb 18, 2020 3:10 pm
by Alan
Yes, sorry for any misunderstanding. It shows up now; there have just been a few instances where I got an email from Nagios saying it was down. I would then log in to the server and find that it was not down. So I am just trying to find out why Nagios said it was down when it was not. The same thing has also happened on a few other servers.

Re: Nagios reporting host down but its not?

Posted: Tue Feb 18, 2020 3:12 pm
by scottwilkerson
Are you sure it didn't go down (or lose connectivity) and then recover before you could check?
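
One way to check this is to look at the host state changes that Nagios recorded around the time of the email. The following is a minimal sketch, not an official tool: it assumes the standard Nagios Core log line format for host alerts, and the helper names (`host_alerts`, `print_host_alerts`) and the log path mentioned afterwards are assumptions made for illustration.

```python
import re
from datetime import datetime, timezone

# Nagios Core writes host state changes to its log in this shape:
#   [epoch] HOST ALERT: host;state;state_type;attempt;plugin output
HOST_ALERT = re.compile(
    r"^\[(?P<ts>\d+)\] HOST ALERT: "
    r"(?P<host>[^;]+);(?P<state>[^;]+);(?P<kind>[^;]+);"
    r"(?P<attempt>\d+);(?P<output>.*)$"
)


def host_alerts(lines, host):
    """Yield (utc_time, state, state_type, attempt, output) for one host.

    `lines` is any iterable of log lines; non-matching lines are skipped.
    """
    for line in lines:
        match = HOST_ALERT.match(line.strip())
        if match and match.group("host") == host:
            when = datetime.fromtimestamp(int(match.group("ts")), tz=timezone.utc)
            yield (
                when,
                match.group("state"),      # e.g. UP, DOWN, UNREACHABLE
                match.group("kind"),       # SOFT or HARD
                int(match.group("attempt")),
                match.group("output"),     # text returned by the check plugin
            )


def print_host_alerts(path, host):
    """Print every recorded state change for `host` found in the log at `path`."""
    with open(path, errors="replace") as log:
        for when, state, kind, attempt, output in host_alerts(log, host):
            print(f"{when:%Y-%m-%d %H:%M:%S} UTC  {state}  {kind}  "
                  f"attempt {attempt}  {output}")
```

Calling `print_host_alerts("/usr/local/nagios/var/nagios.log", "Svr-Data")` would list the transitions, assuming the log lives at that path (it is a common default for source installs, but yours may differ). A run of DOWN;SOFT entries ending in DOWN;HARD, followed later by UP, would suggest the check_ncpa host check really did fail several times in a row, and the plugin output on those lines should say why (for example a timeout or a refused connection on port 5693).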