Availability report
Posted: Fri Jan 25, 2013 4:05 am
by Bogdan_B
Hi
Can we somehow separate network problems from server problems? It seems we cannot use Nagios XI to calculate server downtime because of the network. When there is a network problem and the server is UP, Nagios reports an error.
Is there any way to avoid this?
Cheers,
bb
Re: Availability report
Posted: Fri Jan 25, 2013 10:38 am
by scottwilkerson
If you use parent/child relationships in Nagios, a host will not be marked down when its parent is down; it will be marked unreachable instead.
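A minimal sketch of the rule described above, written in Python rather than as Nagios configuration. It assumes the reachability logic documented for Nagios Core: a host whose check fails is DOWN if it has no parents or at least one parent is UP, and UNREACHABLE if every parent is itself DOWN or UNREACHABLE. The `classify` function and the host names (`cpe-sk`, `srv-sk-1`, `srv-sk-2`) are hypothetical and only illustrate the idea.

```python
from typing import Dict, List

UP, DOWN, UNREACHABLE = "UP", "DOWN", "UNREACHABLE"


def classify(host: str, parents: Dict[str, List[str]], check_ok: Dict[str, bool]) -> str:
    """Return the state that parent-based reachability logic would assign to `host`.

    Assumes the parent relationships form a tree (no circular parent chains).
    """
    if check_ok[host]:
        return UP
    host_parents = parents.get(host, [])
    if not host_parents:
        # Nothing sits between the monitoring server and this host.
        return DOWN
    parent_states = [classify(p, parents, check_ok) for p in host_parents]
    if any(state == UP for state in parent_states):
        # The path to the host works, so the host itself is at fault.
        return DOWN
    # Every parent is DOWN or UNREACHABLE: the problem is upstream.
    return UNREACHABLE


# Hypothetical topology: two servers behind one CPE device.
parents = {"srv-sk-1": ["cpe-sk"], "srv-sk-2": ["cpe-sk"]}

# Case 1: the CPE and both servers fail their checks (network outage).
outage = {"cpe-sk": False, "srv-sk-1": False, "srv-sk-2": False}

# Case 2: the CPE answers, but one server does not (real server problem).
server_fault = {"cpe-sk": True, "srv-sk-1": False, "srv-sk-2": True}

outage_states = {h: classify(h, parents, outage) for h in outage}
fault_states = {h: classify(h, parents, server_fault) for h in server_fault}
```

In the first case only the CPE ends up DOWN and both servers are UNREACHABLE; in the second case `srv-sk-1` is DOWN because its parent is still UP.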
Re: Availability report
Posted: Mon Jan 28, 2013 3:55 am
by Bogdan_B
Because I do not monitor the network yet, I have no parent/child relationships in Nagios. Is it possible to define a dependency so that when the CPEs from host group X are not responding to pings, all servers are considered to be up?
Thanks
bb
Re: Availability report
Posted: Mon Jan 28, 2013 10:22 am
by slansing
Bogdan_B wrote:When there is a network problem and the server is UP, Nagios reports an error.
This happens, as I'm sure you know, because the least intensive way to check whether a host/service is reachable is via ping. Since all checks that are triggered from Nagios XI, or sent via SNMP/passive checks, must cross the great divide of your network, they would also show the host/service as down. That being said, the errors you would see due to connection issues are plugin specific; for instance, you may see an NRPE error of "129 - Out of Bounds", which would indicate the target could not be reached. In most cases this would indicate a failure at some point in the network, or the actual plugin could be missing from the host/service.
Bogdan_B wrote:when CPEs from X host group are not responding to pings, all servers are considered to be up?
What do you mean by "CPEs"?
Re: Availability report
Posted: Tue Jan 29, 2013 3:45 am
by Bogdan_B
I have a drone server (VMware ESXi 4.1) and other servers in each country. Can I consider the drone server UP when all CPEs for country X are down?
slansing wrote:What do you mean by "CPEs"?
Customer Premises Equipment - Communications equipment that resides on the customer's premises.
Re: Availability report
Posted: Tue Jan 29, 2013 12:07 pm
by abrist
Just to be clear, you want unreachable machines to be marked as up rather than unreachable?
Re: Availability report
Posted: Wed Jan 30, 2013 3:56 am
by Bogdan_B
abrist wrote:Just to be clear, you want unreachable machines to be marked as up rather than unreachable?
Yes, but only if all machines from that group are down. It is unlikely that all machines in the same group would be down at the same time. Most probably, when Nagios reports that all machines in the same group are down, there is a network problem, and I don't want that downtime counted against the machines in my availability reports.
Re: Availability report
Posted: Wed Jan 30, 2013 9:38 am
by Bogdan_B
To be clearer: if I have, for example, 4 servers in Slovakia, and for each of them I define the other three as parents, then when the network is down, Nagios will report problems for all servers but will not record the downtime in the availability report, because the parents will always be down. Am I right?
Re: Availability report
Posted: Wed Jan 30, 2013 10:31 am
by abrist
Bogdan_B wrote:To be clearer: if I have, for example, 4 servers in Slovakia, and for each of them I define the other three as parents, then when the network is down, Nagios will report problems for all servers but will not record the downtime in the availability report, because the parents will always be down. Am I right?
They will not be marked as down, just unreachable. There is no way for unreachable machines to be marked as "up", as they are not being checked. If Nagios had a way to mark machines as "up" when they were unreachable, it would create a historical inaccuracy and a data fidelity problem in XI.
Remember, the availability report will not show those systems as "down", and an unreachable state does not imply that the unreachable hosts are "down". What it does imply is that a network outage has occurred, and XI will report it as such.
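To illustrate why this distinction matters for reporting, here is a rough Python sketch of how time could be bucketed per state over a reporting window. It is not the XI report implementation; the `availability` function, the interval format, and the sample numbers are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def availability(intervals: Iterable[Tuple[str, float]], window_seconds: float) -> Dict[str, float]:
    """Return the percentage of the reporting window spent in each state."""
    totals: Dict[str, float] = defaultdict(float)
    for state, seconds in intervals:
        totals[state] += seconds
    return {state: 100.0 * secs / window_seconds for state, secs in totals.items()}


# Hypothetical 24-hour window for one server: a 1.5 hour network outage
# (recorded as UNREACHABLE) and a 0.5 hour real server failure (DOWN).
day = 24 * 3600
log = [
    ("UP", 22 * 3600),
    ("UNREACHABLE", 1.5 * 3600),
    ("DOWN", 0.5 * 3600),
]
report = availability(log, day)
```

With parent relationships in place, the 1.5 hours of network outage land in the UNREACHABLE bucket of `report` and are kept separate from the time the server itself was DOWN.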
Re: Availability report
Posted: Wed Jan 30, 2013 11:33 am
by Bogdan_B
It is clear now! Thank you very much!