PROBLEM Host Alert - Website

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
ecolgroveMOT
Posts: 64
Joined: Thu Aug 22, 2019 1:58 pm

PROBLEM Host Alert - Website

Post by ecolgroveMOT »

I am new to this monitoring part of Nagios. I am monitoring three or four websites, and I get this alert a lot for just one of them:

***** Nagios XI Alert *****

Nagios has detected a problem with this host.

Notification Type: PROBLEM
Host: Website - chasebrexton.myezyaccess.com
State: DOWN
Address: chasebrexton.myezyaccess.com
Info: CRITICAL - Socket timeout

But the website is not down, and there never seems to be an issue. Currently, the configuration is set to check once and alert right away in case of any downtime. Obviously, I could change that, but then if the website actually went down we would not know.
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: PROBLEM Host Alert - Website

Post by mbellerue »

How often do you check to make sure the site is up? The first thing that comes to my mind is something like an aggressive IDS/IPS that is seeing the Nagios check as a possible attack on the site, and then blocking Nagios from connecting to the site.

When you get an alert, how long does it stay critical before going OK again?

Re: PROBLEM Host Alert - Website

Post by ecolgroveMOT »

The check settings are:
Check interval: 5 min
Retry interval: 1 min
Max check attempts: 1 attempt

Re: PROBLEM Host Alert - Website

Post by ecolgroveMOT »

Nagios stays critical for about five to ten minutes; it is not long.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: PROBLEM Host Alert - Website

Post by scottwilkerson »

ecolgroveMOT wrote: Nagios stays critical for about five to ten minutes; it is not long.
It sounds like Nagios is doing what it is supposed to do: it is alerting you when this server cannot be reached. This could be because of a network outage or something else, but from the Nagios server's perspective it is getting a Socket timeout when trying to run the plugin you have set up for this check.

If this also happens where multiple hosts/services have similar errors, or SNMP checks do not receive a response (like your other post https://support.nagios.com/forum/viewto ... =6&t=56658 ), it could be that you have a recurring network issue, and Nagios is going to have the same problem until it is resolved.
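
One way to see whether the timeouts also show up outside of Nagios is to run a small probe from the Nagios server itself. The following is only a rough sketch, not the plugin Nagios runs: `probe` is a hypothetical helper, and port 443, the 10-second timeout, and the 60-second pause are all assumptions to adjust to your own check.

```python
import socket
import time


def probe(host, port=443, timeout=10.0):
    """Attempt one TCP connection and return (succeeded, seconds_elapsed)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        # Covers timeouts, refused connections, and DNS failures alike.
        return False, time.monotonic() - start


if __name__ == "__main__":
    # Hostname taken from the alert above; port and pacing are assumptions.
    for _ in range(5):
        ok, elapsed = probe("chasebrexton.myezyaccess.com")
        print("OK" if ok else "FAILED", f"{elapsed:.2f}s")
        time.sleep(60)
```

If the probe fails at the same times Nagios reports a Socket timeout, that points at the network path rather than at the Nagios configuration.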
Former Nagios employee
Creator:
ahumandesign.com
enneagrams.com

Re: PROBLEM Host Alert - Website

Post by ecolgroveMOT »

Is it better to use the SNMP option or the NSClient option? Is there a way to resolve this?

Re: PROBLEM Host Alert - Website

Post by scottwilkerson »

ecolgroveMOT wrote: Is it better to use the SNMP option or the NSClient option? Is there a way to resolve this?
It depends. SNMP uses UDP, so if a packet is dropped or there is an error connecting, you are going to get the same error, which makes things difficult.

If it is in fact caused by a network problem, as I suspect, tracking that down and fixing it would resolve the issue. There is no way for Nagios to compensate other than to use a higher "Max check attempts" value, which allows the service to recover before a notification is sent. This is the primary reason this setting exists.
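
To illustrate how "Max check attempts" lets a short blip recover before a notification goes out, here is a simplified sketch. It is not how Nagios is implemented: `first_notification` is a hypothetical function that only counts consecutive failed checks and ignores timing.

```python
def first_notification(results, max_check_attempts):
    """Return the index of the check that would trigger a notification, or None.

    `results` is a sequence of booleans, True meaning the check came back OK.
    """
    consecutive_failures = 0
    for index, ok in enumerate(results):
        if ok:
            consecutive_failures = 0  # a recovery resets the attempt counter
            continue
        consecutive_failures += 1
        if consecutive_failures >= max_check_attempts:
            return index
    return None


# One failed check followed by a recovery:
blip = [True, False, True, True]
print(first_notification(blip, max_check_attempts=1))  # 1
print(first_notification(blip, max_check_attempts=5))  # None
```

With a single attempt, the one failed check notifies immediately; with five attempts, the same blip recovers before anything is sent.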

Re: PROBLEM Host Alert - Website

Post by ecolgroveMOT »

Do you suggest setting the max check attempts higher?
I know that when setting up a new alert, Nagios sets it to 5 by default. Is that the recommended value?

Re: PROBLEM Host Alert - Website

Post by scottwilkerson »

ecolgroveMOT wrote: Do you suggest setting the max check attempts higher?
I know that when setting up a new alert, Nagios sets it to 5 by default. Is that the recommended value?
Personally I like the following defaults:
Check interval: 5 min
Retry interval: 1 min
Max check attempts: 5 attempts

With these, once something comes back as non-OK, you get 5 attempts at 1-minute intervals for the service/network/etc. to come back around before a notification is sent.
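
As a rough back-of-the-envelope check on what these settings mean for a real outage, the sketch below estimates how long it takes before a notification is sent. It assumes the first failed check counts as attempt 1 and that retries run exactly at the retry interval; `notification_delay` is a hypothetical helper, not part of Nagios.

```python
def notification_delay(check_interval, retry_interval, max_check_attempts):
    """Return (best_case, worst_case) minutes from outage start to notification."""
    # Time spent on retries after the first failed check.
    retry_time = (max_check_attempts - 1) * retry_interval
    # Best case: the outage starts just before a scheduled check.
    # Worst case: it starts just after one, so a full check interval passes first.
    return retry_time, check_interval + retry_time


print(notification_delay(5, 1, 1))  # (0, 5)
print(notification_delay(5, 1, 5))  # (4, 9)
```

Under these assumptions, a real outage would still be reported with 5 attempts, roughly 4 to 9 minutes after it starts, while shorter blips would recover without a notification.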