Availability report

Bogdan_B
Posts: 34
Joined: Fri Aug 03, 2012 1:56 am

Availability report

Post by Bogdan_B »

Hi

Can we somehow separate network problems from server problems? It seems we cannot use Nagios XI to calculate server downtime because of the network: when there is a network problem and the server is UP, Nagios still reports an error.

Is there any way to avoid this?

Cheers,
bb
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Availability report

Post by scottwilkerson »

If you use parent/child relationships in Nagios, a host will not be marked down if the host's parent is down; it will be marked unreachable instead.
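The rule above can be sketched in a few lines of Python. This is a simplified model for illustration only, not Nagios source code; the names `HostState` and `classify_host` are hypothetical. It assumes a failed host is treated as unreachable only when it has parents and none of them is up.

```python
from enum import Enum


class HostState(Enum):
    UP = "UP"
    DOWN = "DOWN"
    UNREACHABLE = "UNREACHABLE"


def classify_host(check_ok, parent_states):
    """Classify a host from its own check result and its parents' states.

    check_ok      -- True if the host's own check (e.g. ping) succeeded
    parent_states -- list of HostState values for the host's parents
    """
    if check_ok:
        return HostState.UP
    # A failed host with parents, none of which is UP, is cut off from the
    # monitoring server: the failure is attributed to the network path.
    if parent_states and all(s is not HostState.UP for s in parent_states):
        return HostState.UNREACHABLE
    # No parents, or at least one parent is UP: the host itself is DOWN.
    return HostState.DOWN
```

With no parents defined, every failed check falls through to DOWN, which is why a network outage is counted against the servers themselves.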
Former Nagios employee
Bogdan_B
Posts: 34
Joined: Fri Aug 03, 2012 1:56 am

Re: Availability report

Post by Bogdan_B »

Because I do not monitor the network yet, I have no parent/child relationships in Nagios. Is it possible to define a dependency so that when the CPEs from host group X are not responding to pings, all servers are considered to be up?

Thanks
bb
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Availability report

Post by slansing »

Bogdan_B wrote:When it is a network problem and the server is UP - Nagios send error.
This happens, as I'm sure you know, because the least intensive way to check whether a host/service is reachable is via ping. All checks that are triggered from Nagios XI, or sent via SNMP or passive checks, must cross the great divide of your network, so they would also show the host/service as down. That said, the errors you would see due to connection issues are plugin specific. For instance, you may see an NRPE error of "129 - Out of Bounds", which would indicate the target could not be reached. In most cases this indicates a failure at some point in the network, or the actual plugin could be missing from the host/service.

Bogdan_B wrote:when CPEs from X host group are not responding to pings, all servers are considered to be up ?
What do you mean by "CPEs?"
Bogdan_B
Posts: 34
Joined: Fri Aug 03, 2012 1:56 am

Re: Availability report

Post by Bogdan_B »

slansing wrote:
Bogdan_B wrote:When it is a network problem and the server is UP - Nagios send error.
This happens, as I'm sure you know, because the least intensive way to check whether a host/service is reachable is via ping. All checks that are triggered from Nagios XI, or sent via SNMP or passive checks, must cross the great divide of your network, so they would also show the host/service as down. That said, the errors you would see due to connection issues are plugin specific. For instance, you may see an NRPE error of "129 - Out of Bounds", which would indicate the target could not be reached. In most cases this indicates a failure at some point in the network, or the actual plugin could be missing from the host/service.

I have a drone server (VMware ESXi 4.1) and other servers in each country. Can I consider the drone server UP when all CPEs are down for country X?

slansing wrote:
Bogdan_B wrote:when CPEs from X host group are not responding to pings, all servers are considered to be up ?
What do you mean by "CPEs?"
Customer Premises Equipment: communications equipment that resides on the customer's premises.
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Availability report

Post by abrist »

Just to be clear, you want unreachable machines to be marked as up rather than unreachable?
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Bogdan_B
Posts: 34
Joined: Fri Aug 03, 2012 1:56 am

Re: Availability report

Post by Bogdan_B »

abrist wrote:Just to be clear, you want unreachable machines to be marked as up rather than unreachable?
Yes, but only if all machines from that group are down. It is unlikely that all machines in the same group are down at the same time. When Nagios reports that all machines in the same group are down, there is most probably a network problem, and I don't want that downtime counted in my availability reports for the machines.
Bogdan_B
Posts: 34
Joined: Fri Aug 03, 2012 1:56 am

Re: Availability report

Post by Bogdan_B »

To be clearer: if I have, for example, 4 servers in Slovakia and for each of them I define the other three as parents, then when the network is down, Nagios will report problems for all servers but will not record the downtime in the availability report, because the parents will always be down. Am I right?
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Availability report

Post by abrist »

Bogdan_B wrote:To be more clear : if I have for example 4 servers in Slovakia and for each of them I'll define the other three as a parent, when the network will be down, Nagios will report problems for all servers but it'll not record the downtime in the availability report because always the parents will be down. I'm right?
They will not be marked as down, just unreachable. There is no way for unreachable machines to be marked as "up", as they are not being checked. If Nagios had a way to mark machines as "up" when they were unreachable, it would create a historical inaccuracy and a data fidelity problem in XI.

Remember, the availability report will not show those systems as "down", and an unreachable state does not imply that the unreachable hosts are "down" - what it does imply is that a network outage has occurred and XI will report it thusly.
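To illustrate the point, here is a minimal Python sketch of an availability breakdown that keeps unreachable time in its own bucket instead of counting it as down. It is not how XI computes its report; the function name `availability_breakdown` and the sample timeline are made up for the example.

```python
from collections import defaultdict


def availability_breakdown(intervals):
    """Return the percentage of total time spent in each state.

    intervals -- iterable of (state, seconds) pairs, e.g. ("UP", 3600)
    """
    totals = defaultdict(float)
    for state, seconds in intervals:
        totals[state] += seconds
    total = sum(totals.values())
    if total == 0:
        return {}
    return {state: 100.0 * secs / total for state, secs in totals.items()}


# Hypothetical one-day timeline for a single server (86400 seconds in total).
timeline = [("UP", 82800), ("UNREACHABLE", 3000), ("DOWN", 600)]
breakdown = availability_breakdown(timeline)

for state, percent in sorted(breakdown.items()):
    print(f"{state}: {percent:.2f}%")
```

In this made-up timeline, the 3000 seconds lost to the network outage appear under UNREACHABLE, so only the 600 seconds of real server failure count as DOWN.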
Bogdan_B
Posts: 34
Joined: Fri Aug 03, 2012 1:56 am

Re: Availability report

Post by Bogdan_B »

It is clear now! Thank you very much!
Locked