Re: [Nagios-devel] Improving the host logic

Posted: Wed Dec 14, 2005 2:47 pm
by Guest
Sounds great. Having that level of granularity would directly solve a
topology problem I need to find a workaround for right now.

Cheers,

/eli

Shane Stixrud wrote:
> Nagios's host parent logic is good, but it could be a whole lot better
> for today's switched networks. There have been a couple of
> recommendations in the past on how to improve this.
>
> 1) Allow nagios admins to change parent logic failure detection in cases
> where one parent is up but others are down. By default nagios treats
> multiple parents as redundant paths and thus does not suppress
> notification in situations where at least one parent is OK.
>
> The main disadvantage of this proposal is that nagios rightly treats
> parents as directly connected hops on the path back to nagios. This
> workaround would treat switches and routers as peers when they are not,
> removing the possibility of redundancy detection and of easily
> determining which device is at fault.
>
> 2) Allow the nagios admins to assign a weighted priority to each host
> and have a system that allows the admin to tune these values to suppress
> notification where appropriate.
>
> This type of solution is, IMO, way more complex than is required. The
> best part of the current solution is that it is simple to manage and
> obvious to deploy.
>
> The main problem with the existing solution is that modern switched
> networks often have A LOT of managed nodes connected to one or more
> layer2 switches in the same layer3 network. Ideally nagios would allow
> admins to suppress notification for devices behind both layer2 devices
> and layer3 interfaces. With that in mind, I believe there is a
> relatively easy solution that stays true to nagios's current parent
> model while still meeting this challenge.
>
> The existing parent logic should be able to remain pretty much as is,
> merely renaming the directive to "l3parents" to make clear that it
> should only be used for layer 3 parents.
>
> The existing parents logic would then be duplicated under a new
> directive called l2parents. Nagios would need to be modified to check
> l2parents first, before proceeding to l3parents, when a device goes
> into a NON-OK state. If all l2 parents or all l3 parents are down,
> nagios would follow the l2 or l3 inherited parents just as it does
> today.
>
> IMO this change would be the least intrusive: it adds layer2 parent
> support and allows for redundancy detection for both layer2 and layer3
> devices with little added complexity.
>
> Side note: The 3d map should show the layer2 parents as being directly
> connected to the child device. The l3parents should only be connected
> to devices whose layer2 and layer3 parents are the same NAME/IP. In
> this way you would see a server connected to a switch that is in turn
> connected to another switch which then connects to the layer3 device,
> which happens to be how the physical connectivity IS set up in reality.
>
> Cheers,
> Shane

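For anyone trying to picture the check order Shane describes (l2parents
first, then l3parents, with multiple parents at one layer treated as
redundant paths), here is a rough Python sketch. It is not Nagios code:
the Host class, the classify() function and the sample topology are all
made up for illustration, and l2parents / l3parents are the proposed
directive names from the post, not options that exist in Nagios today.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    is_up: bool = True
    # Proposed directive names from the post; hypothetical, not real Nagios options.
    l2parents: list = field(default_factory=list)
    l3parents: list = field(default_factory=list)


def classify(host, hosts):
    """Return 'UP', 'DOWN' (notify) or 'UNREACHABLE' (suppress notification)."""
    if host.is_up:
        return "UP"
    # Check layer 2 parents first, then layer 3 parents.
    for parent_names in (host.l2parents, host.l3parents):
        # Parents within one layer are redundant paths: the host is only
        # unreachable when every parent at that layer is down.
        if parent_names and all(not hosts[p].is_up for p in parent_names):
            return "UNREACHABLE"
    return "DOWN"


# Made-up topology: a server behind two redundant switches, all behind one router.
hosts = {
    h.name: h
    for h in [
        Host("r1"),
        Host("sw1", l3parents=["r1"]),
        Host("sw2", l3parents=["r1"]),
        Host("server", l2parents=["sw1", "sw2"], l3parents=["r1"]),
    ]
}

hosts["server"].is_up = False
hosts["sw1"].is_up = False
print(classify(hosts["server"], hosts))  # DOWN: sw2 is still a working path

hosts["sw2"].is_up = False
print(classify(hosts["server"], hosts))  # UNREACHABLE: both l2 parents are down
print(classify(hosts["sw1"], hosts))     # DOWN: its l3 parent r1 is still up
```

The sketch only looks at direct parents. Following the inherited l2 or
l3 parents further up the tree, as the post mentions, would mean
repeating the same check on each parent in turn.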

This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]