Page 1 of 1

check_icmp flawed

Posted: Fri Aug 10, 2018 3:31 am
by odino2016
We have a rather large installation with hunderds of hosts and thousands of checks. Sometimes check_icmp is reporting OK checking hosts actually down. After a lot of investigation we downloaded the source code of check_icmp. It has been designed to accomplish multiple tasks and it is not stricly checking icmp echo replies. It uses raw sockets and captures all icmp incoming packets. It uses the process PID (which is a field of the icmp packet) to identify its own echo replies. Unfortunately the pid_t structures it uses its a signed short int. So, with hundreds of ongoing ping checks, it seems that sometimes there can be 2 check_icmp processes with two PIDs that converted to pid_t are the same signed int. Unfortunately check_icmp is NOT even checking the sequence field in the icmp packet (which could reduce the possibility to match wrongly an echo reply).
We changed the source code a lot for strict checking synchronous icmp echo request and replies.
It seems (as reported by other users) that check_ping is not showing the flaws of check_icmp. We would suggest not to use check_icmp in production critical deployments.

Re: check_icmp flawed

Posted: Fri Aug 10, 2018 7:14 am
by scottwilkerson
Would you be willing to share your findings in an issue on the nagios-plugins project for review?
https://github.com/nagios-plugins/nagios-plugins