Hi
Can we somehow separate network problems from server problems? It seems we cannot use Nagios XI to calculate server downtime because of the network: when there is a network problem and the server is UP, Nagios still sends an error.
Is there any way to avoid this?
Cheers,
bb
Availability report
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: Availability report
If you use parent/child relationships in Nagios, a host will not be marked down if the host's parent is down; it will instead be marked unreachable.
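The rule described above can be sketched in a few lines of Python. This is only an illustration of the reachability logic, not Nagios code: the `State` enum and the `classify_host` function are hypothetical names made up for this example, and the sketch assumes that a host with no parents defined is simply treated as down when its check fails.

```python
from enum import Enum


class State(Enum):
    UP = "UP"
    DOWN = "DOWN"
    UNREACHABLE = "UNREACHABLE"


def classify_host(check_ok, parent_states):
    """Classify a host from its own check result and its parents' states.

    Hypothetical helper for illustration; not part of Nagios.
    """
    if check_ok:
        return State.UP
    # No parents defined: there is no upstream device to blame,
    # so the failed check counts against the host itself.
    if not parent_states:
        return State.DOWN
    # At least one parent is UP, so a path to the host exists
    # and the host itself is considered DOWN.
    if any(state is State.UP for state in parent_states):
        return State.DOWN
    # Every parent is DOWN or UNREACHABLE: the host cannot be checked.
    return State.UNREACHABLE


if __name__ == "__main__":
    # Server behind a router that is itself down.
    print(classify_host(False, [State.DOWN]).value)
    # Server with no parents configured.
    print(classify_host(False, []).value)
```

With a parent defined, a failed check on the server while its parent is down yields UNREACHABLE; without any parent, the same failed check yields DOWN, which is the situation described in the original question.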
Re: Availability report
Because I am not yet monitoring the network, I have no parent/child relationships in Nagios. Is it possible to define a dependency such that, when the CPEs from host group X are not responding to pings, all servers are considered to be up?
Thanks
bb
-
slansing
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: Availability report
Bogdan_B wrote:When it is a network problem and the server is UP - Nagios sends an error.
This happens, as I'm sure you know, because the least intensive way to check whether a host/service is reachable is via ping. All checks that are triggered from Nagios XI, or sent via SNMP/passive checks, must cross the great divide of your network, so they would also show the host/service as down. That being said, the errors you would see due to connection issues are plugin specific. For instance, you may see an NRPE error of "129 - Out of Bounds", which would indicate the target could not be reached. In most cases this would indicate a failure at some point in the network, or the actual plugin could be missing from the host/service.
Bogdan_B wrote:when CPEs from X host group are not responding to pings, all servers are considered to be up?
What do you mean by "CPEs"?
Re: Availability report
slansing wrote:This happens, as I'm sure you know, because the least intensive way to check whether a host/service is reachable is via ping. All checks that are triggered from Nagios XI, or sent via SNMP/passive checks, must cross the great divide of your network, so they would also show the host/service as down. That being said, the errors you would see due to connection issues are plugin specific. For instance, you may see an NRPE error of "129 - Out of Bounds", which would indicate the target could not be reached. In most cases this would indicate a failure at some point in the network, or the actual plugin could be missing from the host/service.
I have a drone server (VMware ESXi 4.1) and other servers in each country. Can I consider the drone server UP when all CPEs are down for country X?
slansing wrote:What do you mean by "CPEs"?
Customer Premises Equipment - Communications equipment that resides on the customer's premises.
Re: Availability report
Just to be clear, you want unreachable machines to be marked as up rather than unreachable?
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Re: Availability report
abrist wrote:Just to be clear, you want unreachable machines to be marked as up rather than unreachable?
Yes, but only if all machines from that group are down. It is unlikely that all machines in the same group would be down at the same time. Most probably, when Nagios notifies that all machines from the same group are down, there is a network problem, and I don't want that downtime in my availability reports for the machines.
Re: Availability report
To be clearer: if I have, for example, 4 servers in Slovakia and for each of them I define the other three as parents, then when the network is down, Nagios will report problems for all servers but will not record the downtime in the availability report, because the parents will always be down. Am I right?
Re: Availability report
Bogdan_B wrote:To be clearer: if I have, for example, 4 servers in Slovakia and for each of them I define the other three as parents, then when the network is down, Nagios will report problems for all servers but will not record the downtime in the availability report, because the parents will always be down. Am I right?
They will not be marked as down, just unreachable. There is no way for unreachable machines to be marked as "up", as they are not being checked. If Nagios had a way to mark machines as "up" when they were unreachable, it would create a historical inaccuracy and a data fidelity problem in XI.
Remember, the availability report will not show those systems as "down", and an unreachable state does not imply that the unreachable hosts are "down" - what it does imply is that a network outage has occurred and XI will report it thusly.
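To make the distinction concrete, here is a minimal Python sketch of how time spent in each state could be tallied separately. It is not how XI computes its report; the `availability` function and the sample intervals are hypothetical and only show that unreachable time is counted in its own bucket rather than as downtime.

```python
def availability(intervals):
    """Return the percentage of time spent in each state.

    `intervals` is a list of (state, seconds) tuples. Hypothetical
    helper for illustration; not part of Nagios XI.
    """
    totals = {"UP": 0.0, "DOWN": 0.0, "UNREACHABLE": 0.0}
    for state, seconds in intervals:
        totals[state] += seconds
    total = sum(totals.values())
    if total == 0:
        # No data recorded: report zero for every state.
        return {state: 0.0 for state in totals}
    return {state: round(100.0 * t / total, 2) for state, t in totals.items()}


if __name__ == "__main__":
    # One day: 23 hours up, 1 hour unreachable during a network outage.
    day = [("UP", 23 * 3600), ("UNREACHABLE", 1 * 3600)]
    print(availability(day))
    # {'UP': 95.83, 'DOWN': 0.0, 'UNREACHABLE': 4.17}
```

In this example the outage shows up as 4.17% unreachable and 0% down, so the server's own downtime figure is not affected by the network problem.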
Former Nagios employee
Re: Availability report
It is clear now! Thank you very much!