Monitoring Engine Queue and Weird Performance issue since

arnab.roy
Posts: 354
Joined: Sat Apr 30, 2011 10:24 am

Post by arnab.roy »

Hi guys,

Ever since the upgrade to Core 3.5, the load average on the boxes has gone bonkers. We are seeing huge spikes of events that suddenly grow and then drop off (see attached). Is this normal? We are hitting a load average of around 65 to 70 at times, which I have never seen in my life. This is without adding anything to the box.

Do you guys have any idea what I can do to fix this issue?
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Monitoring Engine Queue and Weird Performance issue sinc

Post by abrist »

It looks like snmpwalk is the culprit. Have you created a check that uses snmpwalk?
Do you have many snmpwalk wizards open?
Do you know why there are so many snmpwalk processes running?

If your answer is "no" to all of the above questions, try:

Code: Select all

killall snmpwalk
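
Before killing anything, it may help to see how many snmpwalk processes exist and how long each has been running. A minimal sketch, assuming a Linux box with a standard `ps` and `awk`; the function names `list_procs` and `count_procs` are made up for illustration:

```shell
#!/bin/sh
# Print "PID ELAPSED COMMAND ARGS" for every running process whose
# command name equals $1 (snmpwalk in this thread).
list_procs() {
    ps -eo pid=,etime=,comm=,args= | awk -v name="$1" '$3 == name'
}

# Print how many such processes exist (0 when there are none).
count_procs() {
    ps -eo pid=,comm= | awk -v name="$1" '$2 == name { n++ } END { print n + 0 }'
}

echo "snmpwalk processes running: $(count_procs snmpwalk)"
list_procs snmpwalk
```

The ELAPSED column is in `[[dd-]hh:]mm:ss` format, so walks that have been hanging for minutes or hours stand out from ones that are simply part of a normal check.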
Former Nagios employee
arnab.roy
Posts: 354
Joined: Sat Apr 30, 2011 10:24 am

Re: Monitoring Engine Queue and Weird Performance issue sinc

Post by arnab.roy »

Hi Andy,

That's not the problem, as we have some checks which use snmpwalk and read data from multiple OIDs. What I am seeing is a huge number of checks running at the same time and not getting distributed properly.

Please see the event queue screenshot. It doesn't look healthy to me.
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Monitoring Engine Queue and Weird Performance issue sinc

Post by abrist »

arnab.roy wrote: What I am seeing is a huge number of checks running at the same time and not getting distributed properly.

The bars are not the best measure of check distribution, but with that number of walks, Nagios will have a hard time figuring out how to schedule those walks.

How long does each walk take to perform?
Are these full tree walks?
Could these walks be performed by something less intensive, like snmpget?
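
One rough way to answer the timing question, and to compare a walk against a targeted get, is to time both by hand. This is only a sketch: `elapsed_seconds` is a hypothetical helper, and the host `192.0.2.10`, community `public`, and the IF-MIB OIDs in the commented examples are placeholders, not values from this thread.

```shell
#!/bin/sh
# Print the wall-clock time a command takes, in whole seconds.
# The command's own output is discarded; only the duration is printed.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Example invocations (placeholders, uncomment and adjust for a real device):
# elapsed_seconds snmpwalk -v 2c -c public 192.0.2.10 IF-MIB::ifTable
# elapsed_seconds snmpget -v 2c -c public 192.0.2.10 IF-MIB::ifInOctets.1 IF-MIB::ifOutOctets.1
```

If the check only needs a handful of values, a single snmpget that names those OIDs would normally return far less data than a walk of the whole table.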
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: Monitoring Engine Queue and Weird Performance issue sinc

Post by scottwilkerson »

Unfortunately, that screenshot cuts off some important information. What are the service execution time numbers?
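
If the screenshot is hard to recapture, the same numbers can usually be pulled from the command line with the `nagiostats` utility that ships with Nagios Core. A minimal sketch, assuming the binary lives at `/usr/local/nagios/bin/nagiostats` (an assumed default path; adjust if yours differs). The wrapper name `print_exec_stats` is made up for illustration, and the `--mrtg`/`--data` options and variable names should be confirmed against `nagiostats --help` on your version.

```shell
#!/bin/sh
# Print min, max and average active service check execution time,
# one value per line, in milliseconds.
# $1 optionally overrides the assumed nagiostats path.
print_exec_stats() {
    bin=${1:-/usr/local/nagios/bin/nagiostats}
    if [ ! -x "$bin" ]; then
        echo "nagiostats not found at $bin" >&2
        return 1
    fi
    "$bin" --mrtg --data=MINACTSVCEXT,MAXACTSVCEXT,AVGACTSVCEXT
}

# Usage (on the Nagios server):
# print_exec_stats
```

Running `nagiostats` with no arguments should also print a human-readable summary that includes the active service execution time.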
Former Nagios employee