Nagios stops checking!!!
Re: Nagios stops checking!!!
I see you are considering caching objects with Squid. What objects specifically are you thinking of caching?
I may be able to help as I spent 10 years cutting my teeth on Squid.
Mike Weber
Nagios Training/Consulting
Re: Nagios stops checking!!!
I am going to defer to Mike on the Squid questions. I have played with it in a previous life but I would not call myself an expert by any means.
Former Nagios employee
Re: Nagios stops checking!!!
mrochelle wrote: When you indicate you turned off the check leveling, are you indicating you disabled the auto rescheduling option in nagios.cfg?

Yep
2 of XI5.6.14 Prod/DR/DEV - Nagios LogServer 2 Nodes
See my projects on the Exchange at BanditBBS - Also check out my Nagios stuff on my personal page at Bandit's Home and at github
Re: Nagios stops checking!!!
Well, it looks like auto-rescheduling needs to be reworked (again). Do you only see this behavior on large installs with auto-rescheduling enabled?
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
-
krobertson71
- Posts: 444
- Joined: Tue Feb 11, 2014 10:16 pm
Re: Nagios stops checking!!!
mrochelle wrote: I have not experienced the spiking issue described, but I'm joining the conversation because I logged in to post about Nagios stopping its checks: I've experienced 3 such incidents from the past weekend up to this morning. As BanditBBS indicated, the load drops to minimal and checks go down to zero. No errors of any kind that I can find; the logs appear normal. I'm attaching a screenshot from this morning at 05:31 AM, the last occurrence. A restart of Nagios gets everything back to normal.
Also, for the record, the ndo2db process is OK (under 30%) during these incidents.
Nagios 2014R2.0
CentOS release 6.3

I have this same exact dashboard setup, just in a different order... a little neater. But you know I roll like that.
Good dashboard.
Re: Nagios stops checking!!!
Thanks for the dashboard review. It helps my monitoring team to just send me a copy if Nagios has a problem.
Also, I've turned off the auto rescheduling and will follow up with the results after a few days of observation.
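For anyone following along, here is a minimal sketch for confirming what the auto-rescheduling directives are actually set to. It assumes the standard Nagios Core directive names (`auto_reschedule_checks`, `auto_rescheduling_interval`, `auto_rescheduling_window`); the sample text and its values are made up for illustration, and on a real box you would read your own nagios.cfg instead (often /usr/local/nagios/etc/nagios.cfg, but that path is an assumption).

```python
def read_auto_reschedule(cfg_text):
    """Collect the auto-rescheduling directives from nagios.cfg text."""
    settings = {}
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip().startswith("auto_resched"):
            settings[key.strip()] = value.strip()
    return settings


# Hypothetical sample; replace with open(path).read() for a real config.
sample_cfg = """
# AUTO-RESCHEDULING OPTION
auto_reschedule_checks=0
auto_rescheduling_interval=30
auto_rescheduling_window=180
"""

settings = read_auto_reschedule(sample_cfg)
# Treat a missing directive as disabled.
enabled = settings.get("auto_reschedule_checks", "0") == "1"
print(settings)
print("auto-rescheduling enabled:", enabled)
```

Keep in mind that Nagios has to be restarted before a change to these directives takes effect.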
-
krobertson71
- Posts: 444
- Joined: Tue Feb 11, 2014 10:16 pm
Re: Nagios stops checking!!!
mrochelle wrote: Thanks for the dashboard review. It helps my monitoring team to just send me a copy if nagios has a problem.
Also, I've turned off the auto rescheduling and will follow up with the results after a few days of observation.

I was just looking at your dashboard again and noticed you have a Max Service Check Execution Time of 2199 seconds! That means you had a check take over 36 minutes to complete.
I would try to find which check that is and when it started and then hung. It could be related to why all your other checks stopped.
Just a possibility, as that is a very excessive service execution time.
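One way to track down the slow check is to scan the status file for services with a large `check_execution_time`. The following is a rough sketch, assuming the usual `servicestatus { ... }` block layout of status.dat; the host and service names in the sample are hypothetical, and the real file location (commonly /usr/local/nagios/var/status.dat) is an assumption about your install.

```python
def slow_services(status_text, threshold_seconds):
    """Return (seconds, host, service) for checks at or above the threshold."""
    results = []
    block = None  # holds key/value pairs while inside a servicestatus block
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.startswith("servicestatus {"):
            block = {}
        elif line == "}" and block is not None:
            elapsed = float(block.get("check_execution_time", 0))
            if elapsed >= threshold_seconds:
                results.append((elapsed,
                                block.get("host_name", ""),
                                block.get("service_description", "")))
            block = None
        elif block is not None and "=" in line:
            key, _, value = line.partition("=")
            block[key] = value
    # Slowest checks first.
    return sorted(results, reverse=True)


# Hypothetical sample; replace with open(path).read() for a real status file.
sample_status = """
servicestatus {
    host_name=nagios01
    service_description=Host Sync
    check_execution_time=2199.000
    }
servicestatus {
    host_name=web01
    service_description=PING
    check_execution_time=0.512
    }
"""

slow = slow_services(sample_status, threshold_seconds=60)
for seconds, host, service in slow:
    print(f"{host} / {service}: {seconds / 60:.1f} minutes")
```

With the sample data this flags only the 2199-second check, which works out to roughly 36.6 minutes.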
Re: Nagios stops checking!!!
Yes, that is an actual service monitor that can take up to 45 minutes. It is an auto-update procedure in which a particular Nagios server's host configurations are synchronized with the reference source database of active hosts. It's only 13 monitors of the 11037.
-
slansing
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: Nagios stops checking!!!
Yeah, let us know how things look. I've got a couple checks that take a while to come through as well, one being Windows Updates... takes ages...
Re: Nagios stops checking!!!
OK, it must not be the auto-rescheduling. My schedule keeps emptying and no checks are being performed even with it off. I need help, this is very, very bad!
The worst part is, sometimes when it says no checks are happening, I can see them happening when watching top. But other times there is nothing running, so I can't even rely on this.
EDIT: Had to restart ndo2db to get that working again:
Code:
[root@iss-chi-nag05 ~]# service ndo2db restart
Stopping ndo2db: head: cannot open `/usr/local/nagios/var/ndo2db.lock' for reading: No such file or directory
done.
Starting ndo2db: done.
Edit #2 - This is the kind of weirdness that just freaks me out. After restarting NDO2DB, my server hasn't run this well in ages, even though it's been rebooted a couple of times very recently. I have even applied changes a few times.
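The `head: cannot open ... ndo2db.lock` message in the restart output suggests the lock file was already gone when the init script tried to stop the daemon. Below is a small sketch for checking the daemon's state before restarting. It assumes the lock file stores the daemon's PID on its first line (the init script reading it with `head` hints at that, but treat it as an assumption), and the helper name `ndo2db_state` is made up for this example.

```python
import os


def ndo2db_state(lock_path):
    """Classify the daemon state from its lock file (assumed to hold a PID)."""
    try:
        with open(lock_path) as fh:
            pid = int(fh.readline().strip())
    except FileNotFoundError:
        return "no lock file"
    except ValueError:
        return "lock file has no pid"
    try:
        # Signal 0 sends nothing; it only checks that the process exists.
        os.kill(pid, 0)
    except ProcessLookupError:
        return "stale lock file"
    except PermissionError:
        # The process exists but belongs to another user.
        return "running"
    return "running"


# Path taken from the restart output in this thread.
print(ndo2db_state("/usr/local/nagios/var/ndo2db.lock"))
```

A result of "no lock file" or "stale lock file" while Nagios is up would point to ndo2db having died or lost its lock file, which fits the restart output in this thread.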
2 of XI5.6.14 Prod/DR/DEV - Nagios LogServer 2 Nodes
See my projects on the Exchange at BanditBBS - Also check out my Nagios stuff on my personal page at Bandit's Home and at github