NSCA close/POLLNVAL/accept bug - patch attached
NSCA close/POLLNVAL/accept bug - patch attached
In summary: nsca closes client connection file descriptors on error or eof and relies on poll returning POLLNVAL to clean up after them (in particular to remove their handlers and take them out of the poll set). If in the same loop accept() returns the same FD as a recently closed FD the wrong handler will be run on it.
This can result in calling recv() on an FD with no data to read and O_NONBLOCK not set.
Awiddershiem's commit to nsca-aw here here works around this bug by setting O_NONBLOCK earlier, but that code is quite divergent from the nsca core now.
Attached (I hope) is a minimal-change patch that fixes the behavior. It's not the best patch, which would be to redesign nsca to manage the handlers and remove the one-shot behavior where the loop clears them each time (which results in amusing comments such as "DO NOT REMOVE! 01/29/2007 single process daemon will fail if this is removed").
This patch is against 2.7.2 but it should apply cleanly to the current r2763. It includes a fix to another problem I don't think can actually be triggered, where the poll loop could theoretically run a handler for a just-added FD because npfds++ is run inside the loop which tests against <npfds.
I'm happy to go through the bug code path in more detail if anyone wants, it was quite tricky to track down.
Last edited by mib on Tue Sep 30, 2014 7:09 pm, edited 1 time in total.
Re: NSCA close/POLLNVAL/accept bug - patch attached
Thanks for the patch! While the support forums get seen a lot by the support staff, our developers are busy high up on a snow-covered mountain peak in a monastery working on their code and they don't often visit the forums. A great place to submit bug reports (or better yet, bug reports with patches included) would be http://tracker.nagios.com for our Nagios XI product, and http://tracker.nagios.org for our Nagios Core product.
Former Nagios employee
Re: NSCA close/POLLNVAL/accept bug - patch attached
Is NSCA part of Nagios Core? When I downloaded the Nagios Core tarball it wasn't included.
-
- -fno-stack-protector
- Posts: 4366
- Joined: Mon Nov 19, 2012 12:10 pm
Re: NSCA close/POLLNVAL/accept bug - patch attached
Thanks for the tracker bug! No core and nsca are separate products and are not inclusive of each other. They do work together, but one is not needed for the other to operate in any way.
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
Re: NSCA close/POLLNVAL/accept bug - patch attached
My question is really:
Who, if anyone, is maintaining NSCA and would fix this bug?
Given that it appears the commercial Nagios XI doesn't use NSCA, and NSCA is not part of nagios-core. Two-year-old bug reports on the NSCA part of the tracker are still in "NEW" state.
Who, if anyone, is maintaining NSCA and would fix this bug?
Given that it appears the commercial Nagios XI doesn't use NSCA, and NSCA is not part of nagios-core. Two-year-old bug reports on the NSCA part of the tracker are still in "NEW" state.
-
- -fno-stack-protector
- Posts: 4366
- Joined: Mon Nov 19, 2012 12:10 pm
Re: NSCA close/POLLNVAL/accept bug - patch attached
Actually, we do use nsca in several of our commercial products, and do maintain it as issues become apparent. Most likely either myself or one of the other C devs will take a look and modify or apply the patch as needed. I have some plugins work that is going to take precedence, but I will absolutely take a look at this patch and implementation of it if no one else beats me to it.
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
Re: NSCA close/POLLNVAL/accept bug - patch attached
Fantastic, thanks for your prompt responses and assistance.
Re: NSCA close/POLLNVAL/accept bug - patch attached
No problem. We actually maintain the github repo for nsca: https://github.com/NagiosEnterprises/nsca
It would be a great help if you could submit your patch with your explanation there.
Thanks!
It would be a great help if you could submit your patch with your explanation there.
Thanks!
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Re: NSCA close/POLLNVAL/accept bug - patch attached
The minimal work-around patch is in this commit:
https://github.com/mbattersby/nsca/comm ... 468d03f514
for which I also sent a pull request.
https://github.com/mbattersby/nsca/comm ... 468d03f514
for which I also sent a pull request.