Was there an answer to this?
Our events climb to about 4000 every hour and then drop to about 300, around the time the hourly DB scripts run, from what I understand.
CPU usage on the XI host goes up when this happens. This screenshot shows the past day's CPU usage for this XI host; you can see the hourly job that occurs.
We only have active service checks, no passive.
Nagios XI 2012R1.7 VM running on ESXi 5.1.
Monitoring Engine Event Queue anomaly?
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
- Contact:
Last edited by abrist on Wed Jun 19, 2013 10:22 am, edited 2 times in total.
Reason: Split topic as these issues are only potentially related.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Re: Monitoring Engine Event Queue anomaly?
We haven't received a response from the customer since 04/21/2013, so I am not sure if this is resolved. His case was different, though: the majority of his checks were passive checks, sent at similar times.
Be sure to check out our Knowledgebase for helpful articles and solutions!
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
- Contact:
Re: Monitoring Engine Event Queue anomaly?
I'll put some notes together about this and come back to you with some helpful information.
I have a suggestion (that requires some explaining first).
We use the Nagios XI Server Monitoring Wizard on a test server to monitor our production server. This lets us know if the production server goes down or there is something wrong (it's really handy). In particular I am talking about the "Nagios XI Jobs" service.
Normally the status is "All jobs are running ok"; however, a condition can occur when the scheduled events over time build up WHILE a backup of the VM is occurring, which in turn causes the Nagios XI Jobs service to report "Nagios XI Jobs;UNKNOWN;SOFT;3;Database Maintenance (dbmaint) stale (1203 seconds old), Database Maintenance (dbmaint) stale (1203 seconds old)".
My suggestion is that you incorporate performance data into this service so we can observe over time how long the database maintenance jobs take to run. OR define a new service in the wizard that tracks the database maintenance durations so we can observe this in pretty graphs.
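To make the idea concrete, here is a rough sketch in Python of what such performance data could look like. It is only an illustration, not how the wizard's check is actually implemented: the function names and the warn/crit thresholds are made up, and it uses the "stale (N seconds old)" age from the message quoted above as a stand-in for the real job duration. The only thing taken from Nagios itself is the standard plugin convention of appending `label=value[UOM];warn;crit;min` after a `|`.

```python
import re

# Matches fragments such as "(dbmaint) stale (1203 seconds old)".
# The message format is copied from the status line quoted in this post.
STALE_RE = re.compile(r"\((\w+)\) stale \((\d+) seconds old\)")


def stale_ages(plugin_output):
    """Return {job_id: age_in_seconds} for every stale job named in the output."""
    return {job: int(age) for job, age in STALE_RE.findall(plugin_output)}


def with_perfdata(plugin_output, ages, warn=600, crit=1200):
    """Append Nagios-style performance data after a '|' separator.

    warn and crit are hypothetical thresholds, chosen only for illustration.
    """
    perf = " ".join(
        f"{job}_age={age}s;{warn};{crit};0" for job, age in sorted(ages.items())
    )
    return f"{plugin_output} | {perf}" if perf else plugin_output


if __name__ == "__main__":
    message = (
        "Database Maintenance (dbmaint) stale (1203 seconds old), "
        "Database Maintenance (dbmaint) stale (1203 seconds old)"
    )
    print(with_perfdata(message, stale_ages(message)))
```

With something like this in place, the age value would be graphed like any other service metric, so the hourly build-up would show up as a trend rather than only as an occasional UNKNOWN state.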
I hope this makes sense.
Re: Monitoring Engine Event Queue anomaly?
It makes sense and I think it's a good idea. Please post a feature request on our bug tracker so that it doesn't fall through the cracks.
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
- Contact:
Re: Monitoring Engine Event Queue anomaly?
OK so here's some more information.
As quoted from another post:
"It would be worth running the mysql repair procedure:
http://assets.nagios.com/downloads/nagi ... tabase.pdf
As well as the vacuum commands on postgresql:
http://support.nagios.com/wiki/index.ph ... .22_in_log"
Here is the log file with the output from these commands. Also, here is a better screenshot of the problem when it occurs.
One other observation is that when the scheduled events over time build up like this, things like dashlets do not display properly. It's almost like the dashlet cannot access the database to get the service object (or something like that).
I'll post back here in a couple of hours with an update, to report back if the problem has been resolved since running these two repair / cleanup procedures.
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
- Contact:
Re: Monitoring Engine Event Queue anomaly?
FYI here is the bug tracker link http://tracker.nagios.com/view.php?id=412
Re: Monitoring Engine Event Queue anomaly?
Thanks for the post, Troy!
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
- Contact:
Re: Monitoring Engine Event Queue anomaly?
OK so the two repair / cleanup procedures did not solve the problem.
The database maintenance job just ran and it did the same thing: scheduled events over time went up to 4000+ while the job ran over a 20-minute window.
- scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Monitoring Engine Event Queue anomaly?
Troy,
There were some issues introduced in 2012r1.7 that could be causing the issue you are seeing. It affected how we query objects in the DB, and caused us to release 2012r1.8 shortly after.
I also like the feature request; when I get some time I'll try to get that in there...
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
- Contact:
Re: Monitoring Engine Event Queue anomaly?
Thanks for that Scott.
A couple of hours ago I upgraded to 2012R2.2; however, the problem has not been resolved.
The database maintenance job has run twice since and nothing seems to have changed: scheduled events over time went up to 4000+ while the job ran over a 20-minute window.