johndoe,
I revisited the code that makes up the "Monitoring Engine Event Queue", and what you're seeing is likely just a product of how the data is queried. You have lots of passive checks, almost all of your hosts/services are reporting in very frequently, and the "next_check" time is likely further off in the future than when a real check result will actually come in...
This is gonna be ugly, but here is the SQL that pulls the data, which then gets massaged into the XML that populates the graph:
Code:
SELECT COUNT(*) AS total_events,next_check, NOW() as time_now,
TIMESTAMPDIFF(SECOND,NOW(),next_check) AS seconds_from_now,
(TIMESTAMPDIFF(SECOND,NOW(),next_check) DIV 10) AS bucket
FROM nagios_hoststatus
WHERE TRUE
AND (TIMESTAMPDIFF(SECOND,NOW(),next_check) < 300)
AND instance_id = '1'
AND UNIX_TIMESTAMP(next_check) != 0
GROUP BY instance_id, bucket
UNION
SELECT COUNT(*) AS total_events,next_check, NOW() as time_now,
TIMESTAMPDIFF(SECOND,NOW(),next_check) AS seconds_from_now,
(TIMESTAMPDIFF(SECOND,NOW(),next_check) DIV 10) AS bucket
FROM nagios_servicestatus
WHERE TRUE
AND (TIMESTAMPDIFF(SECOND,NOW(),next_check) < 300)
AND instance_id = '1'
AND UNIX_TIMESTAMP(next_check) != 0
GROUP BY instance_id, bucket
ORDER BY bucket ASC LIMIT 10000
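To make the bucketing concrete: the `DIV 10` groups each host/service by how many seconds in the future its next_check is, in 10-second windows, and only rows inside the 300-second horizon are counted. Here's a rough Python sketch of the same logic (the timestamps are made up for illustration):

```python
from datetime import datetime, timedelta
from collections import Counter

now = datetime(2024, 1, 1, 12, 0, 0)  # stand-in for NOW()
# hypothetical next_check values pulled from nagios_hoststatus
next_checks = [now + timedelta(seconds=s) for s in (3, 7, 14, 95, 250)]

buckets = Counter()
for nc in next_checks:
    seconds_from_now = int((nc - now).total_seconds())
    if seconds_from_now < 300:                 # same 300-second window as the SQL
        buckets[seconds_from_now // 10] += 1   # MySQL's "DIV 10" bucketing

# buckets now maps 10-second window -> count, e.g. bucket 0 = checks due in 0-9s
```

So if most of your next_check times are scheduled well past that 300-second window (because the checks are really passive), they simply never show up in the graph's buckets.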
Long story short, I think this is just a product of your environment. You might be able to massage the graph by reducing the check_interval on your checks (even though you don't do active checks), since that pulls the scheduled next_check times closer to "now".
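If you want to try that, it's the standard check_interval directive in the object definition (hypothetical service shown here; with the default interval_length of 60, the value is in minutes):

```
define service {
    host_name           somehost          ; hypothetical host
    service_description Passive Example   ; hypothetical passive service
    check_interval      1                 ; schedule the next check 1 minute out
    active_checks_enabled 0               ; stays passive, only the schedule moves
}
```

That doesn't change what actually runs, it just changes where next_check lands, which is all the graph query looks at.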