False warning NagiosXI
Posted: Tue Sep 05, 2017 10:06 am
by RockerMan
Hi
NagiosXI
Installed Version: 5.4.5
Sometimes I get a false alert about the unavailability of a host that is switched off. The host problem itself is Acknowledged, so that it does not spam e-mail.
Code:
2017-09-04 20:58:21 HOST ALERT: spb-wan-r2;DOWN;HARD;3;PING CRITICAL - Packet loss = 100%
2017-09-04 20:56:43 HOST ALERT: spb-wan-r2;DOWN;SOFT;2;PING CRITICAL - Packet loss = 100%
2017-09-04 20:55:05 HOST ALERT: spb-wan-r2;DOWN;SOFT;1;PING CRITICAL - Packet loss = 100%
2017-09-04 20:51:26 HOST ALERT: spb-wan-r2;UP;HARD;3;PING WARNING - Packet loss = 87%, RTA = 0.90 ms
Although the host is disconnected and its problem is Acknowledged, Nagios reports it as UP, together with a packet loss percentage. After that the acknowledgement is of course removed, and e-mail spam about the host being unreachable begins.
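To make the sequence easier to read, here is a minimal Python sketch that splits the four event log lines above into fields. It assumes the layout seen in this excerpt (timestamp, then `HOST ALERT:`, then host;state;state type;attempt;plugin output); the `HostAlert` and `parse_host_alert` names are hypothetical and not part of Nagios.

```python
from typing import NamedTuple


class HostAlert(NamedTuple):
    timestamp: str
    host: str
    state: str        # UP / DOWN
    state_type: str   # SOFT / HARD
    attempt: int
    output: str       # plugin output text


def parse_host_alert(line: str) -> HostAlert:
    """Split one 'HOST ALERT' line (layout assumed from the excerpt above)."""
    timestamp, _, rest = line.partition(" HOST ALERT: ")
    # Split at most 4 times so semicolons inside the plugin output survive.
    host, state, state_type, attempt, output = rest.split(";", 4)
    return HostAlert(timestamp, host, state, state_type, int(attempt), output)


log = [
    "2017-09-04 20:58:21 HOST ALERT: spb-wan-r2;DOWN;HARD;3;PING CRITICAL - Packet loss = 100%",
    "2017-09-04 20:56:43 HOST ALERT: spb-wan-r2;DOWN;SOFT;2;PING CRITICAL - Packet loss = 100%",
    "2017-09-04 20:55:05 HOST ALERT: spb-wan-r2;DOWN;SOFT;1;PING CRITICAL - Packet loss = 100%",
    "2017-09-04 20:51:26 HOST ALERT: spb-wan-r2;UP;HARD;3;PING WARNING - Packet loss = 87%, RTA = 0.90 ms",
]

alerts = [parse_host_alert(line) for line in log]

# Show every line where the host state was UP, with the plugin output.
for alert in alerts:
    if alert.state == "UP":
        print(alert.timestamp, alert.host, alert.output)
```

The only UP entry is the one whose plugin output is PING WARNING, which means some replies were counted during that check.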
I cannot understand the reason for this Nagios behavior toward a switched-off host. Please help me understand.
Thank you.
Re: False warning NagiosXI
Posted: Tue Sep 05, 2017 10:58 am
by scottwilkerson
Acknowledged is not the same as disabled.
When you Acknowledge a problem, it is only Acknowledged until it gets an OK status.
To get what you describe, go to the host detail page, click the Advanced tab, and under "Host Attributes" click the X next to "Active Checks".
This will stop the host from being checked at all until you click it again.
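The same toggle can also be scripted. The sketch below builds a Nagios Core external command line (`DISABLE_HOST_CHECK` / `ENABLE_HOST_CHECK`) for a host. It is an assumption that the XI button corresponds to these commands, and the command file path is the common default rather than something confirmed in this thread; the function names are hypothetical.

```python
import time
from typing import Optional

# Assumed default location of the Nagios external command file.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"

ALLOWED = {"DISABLE_HOST_CHECK", "ENABLE_HOST_CHECK"}


def build_host_check_command(action: str, host: str, now: Optional[int] = None) -> str:
    """Return one external command line: '[<epoch>] <ACTION>;<host_name>'."""
    if action not in ALLOWED:
        raise ValueError(f"unsupported action: {action}")
    timestamp = int(time.time()) if now is None else now
    return f"[{timestamp}] {action};{host}\n"


def submit_command(line: str, command_file: str = COMMAND_FILE) -> None:
    """Write the command line to the Nagios command file (a named pipe)."""
    with open(command_file, "w") as pipe:
        pipe.write(line)


# Example: build (but do not submit) the command for the host in this thread.
print(build_host_check_command("DISABLE_HOST_CHECK", "spb-wan-r2"), end="")
```

Calling `submit_command(...)` on the monitoring server with `ENABLE_HOST_CHECK` would turn active checks back on.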
Re: False warning NagiosXI
Posted: Wed Sep 06, 2017 1:23 am
by RockerMan
scottwilkerson wrote: Acknowledged is not the same as disabled.
When you Acknowledge a problem, it is only Acknowledged until it gets an OK status.
To get what you describe, go to the host detail page, click the Advanced tab, and under "Host Attributes" click the X next to "Active Checks".
This will stop the host from being checked at all until you click it again.
You are right that "Acknowledged is not the same as disabled"; sorry, that was my bad English.
I meant that the host is disconnected from its power supply.
The situation is that a host disconnected from its power supply sometimes changes state to "UP" and the acknowledgement is removed. Then spam goes out by e-mail and SMS to the support service, and they start to get nervous and call everyone ...
Disabling "Active Checks" is not exactly what we need. We set the acknowledgement on the temporarily switched-off host so that when it is turned back on it automatically returns to monitoring; if I disable "Active Checks", it will not return to monitoring.
Re: False warning NagiosXI
Posted: Wed Sep 06, 2017 9:25 am
by scottwilkerson
If you are doing ping checks and a host is disconnected from power supply the only way for it to go to OK is if the IP is getting assigned to another computer, and you should be concerned about what is causing this
Re: False warning NagiosXI
Posted: Thu Sep 07, 2017 1:44 am
by RockerMan
scottwilkerson wrote:If you are doing ping checks and a host is disconnected from power supply the only way for it to go to OK is if the IP is getting assigned to another computer, and you should be concerned about what is causing this
The possibility of another host using the address was the first thing we checked. No one used the address of the disconnected router. If someone had, we would have received a flip-flop report from arpwatch, and there were no messages from arpwatch.
A continuous ping was started against the disconnected host yesterday for verification; so far all packets have been lost and there has been no UP state.
Re: False warning NagiosXI
Posted: Thu Sep 07, 2017 8:50 am
by scottwilkerson
RockerMan wrote:scottwilkerson wrote:If you are doing ping checks and a host is disconnected from power supply the only way for it to go to OK is if the IP is getting assigned to another computer, and you should be concerned about what is causing this
The possibility of another host using the address was the first thing we checked. No one used the address of the disconnected router. If someone had, we would have received a flip-flop report from arpwatch, and there were no messages from arpwatch.
A continuous ping was started against the disconnected host yesterday for verification; so far all packets have been lost and there has been no UP state.
So you are saying the state is changing to UP even though the host check isn't able to ping the host?
The only other way I see this as possible would be if you had the following in your nagios.cfg
Re: False warning NagiosXI
Posted: Thu Sep 07, 2017 9:36 am
by RockerMan
scottwilkerson wrote:
So you are saying the state is changing to UP even though the host check isn't able to ping the host?
The host was down, but somehow Nagios saw it as UP.
Code:
2017-09-04 20:51:26 HOST ALERT: spb-wan-r2;UP;HARD;3;PING WARNING - Packet loss = 87%, RTA = 0.90 ms
I want to understand how Nagios could get an UP state if the host is down and no one has taken its IP address and assigned it to another host.
scottwilkerson wrote:
The only other way I see this as possible would be if you had the following in your nagios.cfg
No, it is set to 1:
Code:
# cat /usr/local/nagios/etc/nagios.cfg | grep retain_state_information
retain_state_information=1
Re: False warning NagiosXI
Posted: Thu Sep 07, 2017 9:44 am
by scottwilkerson
RockerMan wrote: The host was down, but somehow Nagios saw it as UP.
Code:
2017-09-04 20:51:26 HOST ALERT: spb-wan-r2;UP;HARD;3;PING WARNING - Packet loss = 87%, RTA = 0.90 ms
I want to understand how Nagios could get an UP state if the host is down and no one has taken its IP address and assigned it to another host.
Generally I would say this is not possible, but to know for sure I would need to see the host configuration and the corresponding command configuration.
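One thing the command configuration would show is the packet loss thresholds of the ping check. The Python sketch below illustrates how a "PING WARNING" result can end up as an UP host state, as in the log line above. It rests on two assumptions not confirmed in this thread: the thresholds (80% warning, 100% critical) are hypothetical placeholders for whatever the real command uses, and a WARNING result from a host check is treated as UP unless aggressive host checking is in effect, which is how Nagios Core's host check documentation describes it. It does not explain where the replies came from, only how the state would be derived if some replies were counted.

```python
def ping_plugin_status(loss_pct: float,
                       warn_loss: float = 80.0,
                       crit_loss: float = 100.0) -> str:
    """Classify packet loss against (hypothetical) warning/critical thresholds."""
    if loss_pct >= crit_loss:
        return "CRITICAL"
    if loss_pct >= warn_loss:
        return "WARNING"
    return "OK"


def host_state(plugin_status: str, aggressive: bool = False) -> str:
    """Map a host check plugin result to a host state (assumed Core behavior)."""
    if plugin_status == "OK":
        return "UP"
    if plugin_status == "WARNING":
        # Assumption: WARNING only counts as DOWN with aggressive checking.
        return "DOWN" if aggressive else "UP"
    return "DOWN"


for loss in (87.0, 100.0):
    status = ping_plugin_status(loss)
    print(f"loss={loss:.0f}% plugin={status} host={host_state(status)}")
```

Under these assumptions, a single check in which even one reply is counted would be enough to flip the host to UP and clear the acknowledgement.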
Re: False warning NagiosXI
Posted: Fri Sep 08, 2017 1:57 am
by RockerMan
Yes, I think so too.
Let's pause for now. There was only one case, on 2017-09-04, and there has been no such incident since. I have now put the host on a continuous ping, so that I can check whether the host actually responded at the time of an UP state, or whether it was a one-off false positive alert.
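A continuous ping with timestamps makes it possible to compare replies against the time of any future UP alert. This is a minimal Python sketch of such a watcher, assuming a Linux `ping` that accepts `-c` (count) and `-W` (timeout in seconds); the function names and the log file name are hypothetical.

```python
import subprocess
import time
from datetime import datetime


def ping_once(host: str, timeout_s: int = 1) -> bool:
    """Send one echo request; True if ping exits with status 0 (reply received)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def format_sample(when: datetime, host: str, reachable: bool) -> str:
    """One log line in the same timestamp style as the Nagios event log."""
    status = "reply" if reachable else "no reply"
    return f"{when:%Y-%m-%d %H:%M:%S} {host} {status}"


def watch(host: str, interval_s: float = 1.0, log_path: str = "ping-watch.log") -> None:
    """Ping forever, appending one timestamped line per probe."""
    with open(log_path, "a") as log:
        while True:
            log.write(format_sample(datetime.now(), host, ping_once(host)) + "\n")
            log.flush()
            time.sleep(interval_s)


# To start watching the router from this thread:
# watch("spb-wan-r2")
```

Searching the resulting file for "reply" lines around the timestamp of an UP alert would show whether anything actually answered.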
Re: False warning NagiosXI
Posted: Fri Sep 08, 2017 8:25 am
by scottwilkerson
Ok, let us know if this comes up again