abrist wrote: Then stat() is definitely failing. I thought it was odd that the alarm was not triggering on timeout... well, there was no alarm.
So I added one. Can you test this new bin to see if it times out?
AHH, even worse! It isn't timing out and I can't abort the application anymore (I started a new session and killed it).
2 of XI5.6.14 Prod/DR/DEV - Nagios LogServer 2 Nodes
See my projects on the Exchange at BanditBBS - Also check out my Nagios stuff on my personal page at Bandit's Home and at github
Weird... I can abort before the timeout value, but if I try after the timeout value I cannot Ctrl-C it.
That is seriously odd.
Can we *pretty please* have a remote session?
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Yes Andy, we can do remote, but no compiling on that box... I guess we could compile on the Nagios server first and transfer the file.
Or just try running the bin attached. I want to see if we can detect the stale mount point.
Well, dang. I guess we could just fork a child process (to stat the mount) and kill the pid after a timeout (if it is stale). At first I thought none of the other nagios-plugins use forks, which would make this a questionable idea in general, but a number of them do. I will look into calling stat() in a forked child and then running a timeout on the child process, allowing us to kill it after the timeout and exit critical. Sound fair?
Anyone out there have any other good ideas for detecting stale mounts using C?
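For reference, here is a minimal sketch of the fork-and-timeout idea described above. It is not the check_disk implementation; `stat_with_timeout` and its return codes are hypothetical names made up for illustration. The assumption is that a stat() on a stale mount can block in a way that signals other than SIGKILL will not interrupt (which would be consistent with Ctrl-C not working), so the parent never blocks on the child and only polls it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper, not part of check_disk.
 * Returns  0 if stat(path) succeeded within roughly timeout_sec seconds,
 *          1 if stat() returned an error,
 *          2 if the child had to be killed (possibly a stale mount),
 *         -1 if fork() or waitpid() itself failed. */
int stat_with_timeout(const char *path, int timeout_sec)
{
    assert(path != NULL && timeout_sec >= 0);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the only process that is allowed to block on the mount. */
        struct stat sb;
        _exit(stat(path, &sb) == 0 ? 0 : 1);
    }

    /* Parent: poll with WNOHANG so we never block on the child.
     * The timeout is approximate (50 ms ticks). */
    const struct timespec tick = { 0, 50 * 1000 * 1000 };
    int status;

    for (long waited_ms = 0; ; waited_ms += 50) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
        if (r < 0)
            return -1;
        if (waited_ms >= timeout_sec * 1000L)
            break;
        nanosleep(&tick, NULL);
    }

    /* Timed out: send SIGKILL and try to reap, still without blocking.
     * If the child cannot die yet, it is left behind when we exit. */
    kill(pid, SIGKILL);
    waitpid(pid, &status, WNOHANG);
    return 2;
}
```

A plugin could map the return value 2 (and probably 1) to a CRITICAL exit. Whether the killed child actually goes away depends on the kernel and the mount options, so treat this as a starting point only.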
Definitely sounds fair to me, Andy. Really, that seems about the best method, and the other benefit would be that I can remove my NFS check and just use check_disk to alert on stale mounts! That'll remove 1000+ checks for me.
I am worried, though, that we would have to fork() for each mount checked, track the pids, and then report on those that time out. This could actually get a bit complicated and add to the overhead of check_disk. I may need to think about this a bit.
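To get a feel for how much bookkeeping that would be, here is a rough sketch of the per-mount version: one child per path, an array of pids, and one shared deadline. `probe_mounts`, `probe_state`, and `MAX_PROBES` are hypothetical names for illustration and do not come from check_disk. It assumes the same thing as before, namely that only the children may block on a mount and the parent only polls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define MAX_PROBES 64   /* arbitrary cap for this sketch */

enum probe_state { PROBE_PENDING, PROBE_OK, PROBE_FAILED, PROBE_TIMEOUT };

/* Hypothetical helper: stat() n paths in parallel, one child each.
 * On return state[i] is PROBE_OK, PROBE_FAILED, or PROBE_TIMEOUT.
 * Returns the number of paths that timed out, or -1 if n is out of range. */
int probe_mounts(const char *const paths[], int n, int timeout_sec,
                 enum probe_state state[])
{
    pid_t pids[MAX_PROBES];
    int pending = 0, timed_out = 0;

    assert(paths != NULL && state != NULL && timeout_sec >= 0);
    if (n < 0 || n > MAX_PROBES)
        return -1;

    /* Start one child per path and remember its pid. */
    for (int i = 0; i < n; i++) {
        pids[i] = fork();
        if (pids[i] == 0) {
            struct stat sb;
            _exit(stat(paths[i], &sb) == 0 ? 0 : 1);
        }
        if (pids[i] < 0) {
            state[i] = PROBE_FAILED;      /* fork() itself failed */
        } else {
            state[i] = PROBE_PENDING;
            pending++;
        }
    }

    /* Poll all pending children until they finish or the deadline passes. */
    const struct timespec tick = { 0, 50 * 1000 * 1000 };   /* 50 ms */
    for (long waited_ms = 0; pending > 0; waited_ms += 50) {
        for (int i = 0; i < n; i++) {
            int status;
            if (state[i] != PROBE_PENDING)
                continue;
            pid_t r = waitpid(pids[i], &status, WNOHANG);
            if (r == 0)
                continue;                 /* still running */
            state[i] = (r == pids[i] && WIFEXITED(status) &&
                        WEXITSTATUS(status) == 0) ? PROBE_OK : PROBE_FAILED;
            pending--;
        }
        if (pending == 0 || waited_ms >= timeout_sec * 1000L)
            break;
        nanosleep(&tick, NULL);
    }

    /* Whatever is still pending gets SIGKILL and is reported as timed out. */
    for (int i = 0; i < n; i++) {
        if (state[i] != PROBE_PENDING)
            continue;
        kill(pids[i], SIGKILL);
        waitpid(pids[i], NULL, WNOHANG);  /* reap if possible, never block */
        state[i] = PROBE_TIMEOUT;
        timed_out++;
    }
    return timed_out;
}
```

Because all children share one deadline, the total wait stays near the timeout no matter how many mounts are checked. The cost is one fork() per mount, which is the overhead concern raised above.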