check_icmp flawed

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
odino2016
Posts: 9
Joined: Mon Jun 20, 2016 3:53 am

check_icmp flawed

Post by odino2016 »

We have a rather large installation with hunderds of hosts and thousands of checks. Sometimes check_icmp is reporting OK checking hosts actually down. After a lot of investigation we downloaded the source code of check_icmp. It has been designed to accomplish multiple tasks and it is not stricly checking icmp echo replies. It uses raw sockets and captures all icmp incoming packets. It uses the process PID (which is a field of the icmp packet) to identify its own echo replies. Unfortunately the pid_t structures it uses its a signed short int. So, with hundreds of ongoing ping checks, it seems that sometimes there can be 2 check_icmp processes with two PIDs that converted to pid_t are the same signed int. Unfortunately check_icmp is NOT even checking the sequence field in the icmp packet (which could reduce the possibility to match wrongly an echo reply).
We changed the source code a lot for strict checking synchronous icmp echo request and replies.
It seems (as reported by other users) that check_ping is not showing the flaws of check_icmp. We would suggest not to use check_icmp in production critical deployments.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: check_icmp flawed

Post by scottwilkerson »

Would you be willing to share your findings in an issue on the nagios-plugins project for review?
https://github.com/nagios-plugins/nagios-plugins
Former Nagios employee
Creator:
Human Design Website
Get Your Human Design Chart
Locked