Hi guys,
Ever since the upgrade to core 3.5, the load average on the boxes has gone bonkers. We are seeing huge spikes where the number of events suddenly grows and then drops off (see attached). Is this normal? We are hitting load averages of around 65 to 70 at times, which I have never seen in my life. This is without adding anything to the box.
Do you guys have any idea what I can do to fix this issue?
Monitoring Engine Queue and Weird Performance issue since
Re: Monitoring Engine Queue and Weird Performance issue sinc
It looks like snmpwalk is the culprit. Have you created a check that uses snmpwalk?
Do you have many snmpwalk wizards open?
Do you know why there are so many snmpwalk processes running?
If your answer is "no" to all of the above questions, try:
Code:
killall snmpwalk
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
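Before killing anything, it may help to see how many snmpwalk processes exist and how long the oldest has been running. The following is only a sketch: `summarise_walks` is a hypothetical helper name, and it assumes a procps-style `ps` that supports the `etimes` (elapsed seconds) column.

```shell
# Summarise snmpwalk processes from "pid elapsed-seconds command" lines on stdin.
# summarise_walks is a hypothetical helper, not part of Nagios or net-snmp.
summarise_walks() {
    awk '$3 == "snmpwalk" { n++; if ($2 > max) max = $2 }
         END { printf "%d processes, oldest %d s\n", n, max }'
}

# On a live box (assumes procps ps with the etimes column):
#   ps -eo pid=,etimes=,comm= | summarise_walks
```

A large count with a high "oldest" value would suggest walks that are hanging rather than simply being scheduled close together.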
Re: Monitoring Engine Queue and Weird Performance issue sinc
Hi Andy,
That's not the problem, as we have some checks which use snmpwalk and read data from multiple OIDs. What I am seeing is a huge number of checks running at the same time and not getting distributed properly.
Please see the event queue screenshot. It doesn't look healthy to me.
Re: Monitoring Engine Queue and Weird Performance issue sinc
arnab.roy wrote: What I am seeing is a huge number of checks running at the same time and not getting distributed properly.
The bars are not the best measure of check distribution, but with that number of walks, Nagios will have a hard time figuring out how to schedule those walks.
How long does each walk take to perform?
Are these full tree walks?
Could these walks be replaced by something less intensive, like snmpget?
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
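One rough way to answer the "how long does each walk take" question is to time a single walk by hand and compare it with an snmpget of one OID. This is a sketch only: `time_cmd` is a hypothetical helper, and the host address, community string and OIDs in the comments are placeholders, not values from this thread.

```shell
# Print the wall-clock time a command takes, in whole seconds.
# time_cmd is a hypothetical helper; the command's own output is discarded.
time_cmd() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Placeholder host, community and OIDs; substitute the ones your checks use:
#   time_cmd snmpwalk -v2c -c public 192.0.2.10 IF-MIB::ifTable
#   time_cmd snmpget  -v2c -c public 192.0.2.10 IF-MIB::ifInOctets.1
```

If the walk takes many seconds and the get returns almost immediately, that difference multiplied by the number of checks is a plausible source of the load.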
scottwilkerson (DevOps Engineer, Nagios Enterprises; 19396 posts; joined Tue Nov 15, 2011)
Re: Monitoring Engine Queue and Weird Performance issue sinc
Unfortunately, that screenshot cuts off some important information. What are the service execution time numbers?
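If the screenshot is hard to capture in full, the same numbers can probably be pulled from the command line with `nagiostats`. This sketch assumes the binary lives at /usr/local/nagios/bin/nagiostats (a common source-install location, but verify it on your system); `svc_exec_times` is a hypothetical wrapper name.

```shell
# Assumed install path; override by setting NAGIOSTATS before calling.
NAGIOSTATS="${NAGIOSTATS:-/usr/local/nagios/bin/nagiostats}"

# Print min, max and average active service check execution times
# (nagiostats reports these MRTG variables in milliseconds).
svc_exec_times() {
    if [ ! -x "$NAGIOSTATS" ]; then
        echo "nagiostats not found at $NAGIOSTATS"
        return 1
    fi
    "$NAGIOSTATS" --mrtg --data=MINACTSVCEXT,MAXACTSVCEXT,AVGACTSVCEXT --delimiter=' '
}

# Usage:
#   svc_exec_times
```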