Monitoring Engine Event Queue anomaly?

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Monitoring Engine Event Queue anomaly?

Post by Box293 »

Was there an answer to this?

Our scheduled events climb to about 4000 every hour and then drop to about 300, around the time the hourly DB scripts run, from what I understand.

CPU usage on the XI host goes up when this happens. This screenshot shows the past day's CPU usage for this XI host; you can see the hourly job that occurs.
CPU 1 day summary.png
We only have active service checks, no passive.

Nagios XI 2012R1.7 VM running on ESXi 5.1.
Last edited by abrist on Wed Jun 19, 2013 10:22 am, edited 2 times in total.
Reason: Split topic as these issues are only potentially related.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Monitoring Engine Event Queue anomaly?

Post by lmiltchev »

Was there an answer to this?
...
We only have active service checks, no passive.
We haven't received a response from the customer since 04/21/2013, so I am not sure if this is resolved. His case was different though. The majority of the checks were passive checks, sent at similar times.
Be sure to check out our Knowledgebase for helpful articles and solutions!

Re: Monitoring Engine Event Queue anomaly?

Post by Box293 »

I'll put some notes together about this and come back to you with some helpful information.

I have a suggestion (that requires some explaining first).

We use the Nagios XI Server Monitoring Wizard on a test server to monitor our production server. This lets us know if the production server goes down or there is something wrong (it's really handy). In particular I am talking about the "Nagios XI Jobs" service.

Normally the status is "All jobs are running ok". However, when the scheduled events over time build up WHILE a backup of the VM is running, the Nagios XI Jobs service can report "Nagios XI Jobs;UNKNOWN;SOFT;3;Database Maintenance (dbmaint) stale (1203 seconds old), Database Maintenance (dbmaint) stale (1203 seconds old)".

My suggestion is that you incorporate performance data into this service so we can observe over time how long the database maintenance jobs take to run. Alternatively, define a new service in the wizard that tracks the database maintenance durations so we can observe them in pretty graphs.

I hope this makes sense.
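
To illustrate the idea, here is a minimal sketch (not how the wizard actually works) that pulls the stale age out of the status text quoted above and appends it as Nagios-style performance data. The function names and the `dbmaint_age` label are made up for this example, and the regular expression only assumes the "stale (N seconds old)" wording seen in that one message.

```python
import re

# Matches one stale-job fragment as it appears in the status text quoted
# above, e.g. "Database Maintenance (dbmaint) stale (1203 seconds old)".
# The wording is assumed from that single example.
STALE_RE = re.compile(r"\((?P<job>\w+)\) stale \((?P<age>\d+) seconds old\)")


def stale_ages(status_text):
    """Return {job_name: age_in_seconds} for every stale job in the text."""
    ages = {}
    for match in STALE_RE.finditer(status_text):
        job = match.group("job")
        age = int(match.group("age"))
        # The same job can be listed more than once; keep the largest age.
        ages[job] = max(age, ages.get(job, 0))
    return ages


def with_perfdata(status_text, jobs=("dbmaint",)):
    """Append perfdata in the usual label=value[UOM] form to the status text.

    Jobs that are not reported as stale get the value "U" (unknown),
    because the status text does not say how old they are.
    """
    ages = stale_ages(status_text)
    parts = []
    for job in jobs:
        if job in ages:
            parts.append(f"{job}_age={ages[job]}s")
        else:
            parts.append(f"{job}_age=U")
    return f"{status_text} | {' '.join(parts)}"


if __name__ == "__main__":
    example = ("Database Maintenance (dbmaint) stale (1203 seconds old), "
               "Database Maintenance (dbmaint) stale (1203 seconds old)")
    print(with_perfdata(example))
    print(with_perfdata("All jobs are running ok"))
```

This only gives a number when the job is already stale. Graphing the real duration of each maintenance run would need the check itself to report it, which is what the suggestion above is asking for.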

Re: Monitoring Engine Event Queue anomaly?

Post by lmiltchev »

It makes sense and I think it's a good idea. Please post a feature request on our bug tracker, so that it won't "fall through the cracks". :)

Re: Monitoring Engine Event Queue anomaly?

Post by Box293 »

OK so here's some more information.

As quoted from another post:
It would be worth running the mysql repair procedure:
http://assets.nagios.com/downloads/nagi ... tabase.pdf

As well as the vacuum commands on postgresql:
http://support.nagios.com/wiki/index.ph ... .22_in_log
Here is the log file with the output from these commands.
putty-2013-06-12.log
Also here is a better screenshot of the problem when it occurs.
Monitoring Engine Status.png
One other observation is that when the scheduled events over time build up like this, things like dashlets do not display properly. It's almost as if the dashlet cannot access the database to get the service object (or something like that).

I'll post back here in a couple of hours with an update, to report back if the problem has been resolved since running these two repair / cleanup procedures.

Re: Monitoring Engine Event Queue anomaly?

Post by Box293 »

FYI here is the bug tracker link http://tracker.nagios.com/view.php?id=412

Re: Monitoring Engine Event Queue anomaly?

Post by lmiltchev »

FYI here is the bug tracker link http://tracker.nagios.com/view.php?id=412
Thanks for the post, Troy!

Re: Monitoring Engine Event Queue anomaly?

Post by Box293 »

OK so the two repair / cleanup procedures did not solve the problem.

The database maintenance job just ran and the same thing happened: scheduled events over time went up to 4000+ while the job ran over a 20 minute window.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Monitoring Engine Event Queue anomaly?

Post by scottwilkerson »

Troy,

There were some issues introduced in 2012R1.7 that could be causing the problem you are seeing. They affected how we query objects in the DB, and led us to release 2012R1.8 shortly after.

I also like the feature request; when I get some time I'll try to get that in there...

Re: Monitoring Engine Event Queue anomaly?

Post by Box293 »

Thanks for that Scott.

A couple of hours ago I upgraded to 2012R2.2; however, the problem has not been resolved.

The database maintenance job has run twice since and nothing seems to have changed: scheduled events over time went up to 4000+ while the job ran over a 20 minute window.