Scheduled events over time piling up on "NOW"

Post by **johndoe** » Wed Mar 19, 2014 7:45 am

[root@XXX ~]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | head -2
Nagios Core 3.5.0
[root@XXX ~]# tail -50 /var/log/messages | grep ndo
[root@XXX ~]# service nagios status
nagios (pid 2584) is running...
[root@XXX ~]# service ndo2db status
ndo2db (pid 2618) is running...

I sent the file via PM to avoid disclosing any sensitive info. I checked and there seemed to be none, but you can never be too careful. Feel free to suggest any other improvements to the actual configs.

Post by **lmiltchev** » Wed Mar 19, 2014 4:24 pm

I didn't see anything weird in the nagios.cfg file, besides the fact that the sections were out of order. Anyway, go to:

Admin->System Profile->Download Profile

save and PM me the "profile.zip" file.

Post by **lmiltchev** » Thu Mar 20, 2014 1:15 pm

It seems like you have a crashed table in the nagiosql database.

140320 13:20:01 [ERROR] /usr/libexec/mysqld: Table './nagiosql/tbl_logbook' is marked as crashed and last (automatic?) repair failed

Run the following commands:

Code:

cd /usr/local/nagiosxi/scripts
./repairmysql.sh nagios
./repairmysql.sh nagiosql
service nagios stop
killall nagios
service ndo2db stop
service ndo2db start
service nagios start

Check if nagios service is running:

Code:

service nagios status
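To confirm which tables MySQL has flagged, before and after running the repair script, the error log can be scanned for the message quoted above. This is only a minimal sketch: the helper name `crashed_tables` is made up for illustration, and the log path in the last comment is an assumption, since the thread does not say where the MySQL log lives on this system.

```shell
#!/bin/sh
# Print the unique "database/table" names that mysqld reported as crashed.
# Reads log lines on stdin; crashed_tables is a hypothetical helper name.
crashed_tables() {
  sed -n "s|.*Table '\./\([^']*\)' is marked as crashed.*|\1|p" | sort -u
}

# Sample input: the log line quoted in this thread.
crashed_tables <<'EOF'
140320 13:20:01 [ERROR] /usr/libexec/mysqld: Table './nagiosql/tbl_logbook' is marked as crashed and last (automatic?) repair failed
EOF

# Against a real log (the path is an assumption, adjust for your system):
#   crashed_tables < /var/log/mysqld.log
```

If the same table still shows up in new log entries after the repair, the repair most likely did not take.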

Post by **johndoe** » Tue Mar 25, 2014 7:25 am

Did that, same problem...

Post by **scottwilkerson** » Tue Mar 25, 2014 4:59 pm

johndoe,

I revisited the code that makes up the "Monitoring Engine Event Queue", and what you are seeing is likely just a result of how the data is queried combined with the fact that you have lots of passive checks. Almost all of your hosts/services are reporting in very frequently, and the "next_check" time is likely further off in the future than the time a real check result will actually come in...

This is gonna be ugly, but this is the SQL that is used to pull the data, which is then massaged into the XML that populates the graph:

Code:

SELECT COUNT(*) AS total_events,next_check, NOW() as time_now,
	TIMESTAMPDIFF(SECOND,NOW(),next_check) AS seconds_from_now,
	(TIMESTAMPDIFF(SECOND,NOW(),next_check) DIV 10) AS bucket
	FROM nagios_hoststatus
	WHERE TRUE 
	AND (TIMESTAMPDIFF(SECOND,NOW(),next_check) < 300)
	AND instance_id = '1'
	AND UNIX_TIMESTAMP(next_check) != 0
	GROUP BY instance_id, bucket
	UNION
	SELECT COUNT(*) AS total_events,next_check, NOW() as time_now,
	TIMESTAMPDIFF(SECOND,NOW(),next_check) AS seconds_from_now,
	(TIMESTAMPDIFF(SECOND,NOW(),next_check) DIV 10) AS bucket
	FROM nagios_servicestatus
	WHERE TRUE 
	AND (TIMESTAMPDIFF(SECOND,NOW(),next_check) < 300)
	AND instance_id = '1'
	AND UNIX_TIMESTAMP(next_check) != 0
	GROUP BY instance_id, bucket	
	ORDER BY bucket ASC LIMIT 10000

Long story short, I think this is just a product of your environment. You might be able to massage the graph by reducing the check_interval on your checks (even though you don't do active checks).
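To make the bucketing in that query easier to follow, here is a rough shell sketch of the same arithmetic. It assumes MySQL's DIV truncates toward zero, which is also what shell integer division does. The function names `bucket` and `histogram` and the sample offsets are made up for illustration; they are not part of Nagios XI.

```shell
#!/bin/sh
# Mirrors "TIMESTAMPDIFF(SECOND, NOW(), next_check) DIV 10" from the query.
# $1 is seconds_from_now; shell division truncates toward zero.
bucket() {
  echo $(( $1 / 10 ))
}

# Reads one seconds_from_now value per line on stdin, applies the same
# "< 300" filter as the query, and prints "bucket count" pairs.
histogram() {
  while read -r secs; do
    if [ "$secs" -lt 300 ]; then
      bucket "$secs"
    fi
  done | sort -n | uniq -c | awk '{ print $2, $1 }'
}

# Hypothetical offsets: three overdue/imminent checks, one at +12s,
# one at +305s (dropped by the filter), one 25s overdue.
printf '%s\n' -25 -5 3 9 12 305 | histogram
```

With these sample values the output is `-2 1`, `0 3`, `1 1`. Note that the query has no lower bound on seconds_from_now, so any next_check that is already in the past is still counted, either in bucket 0 (anything from -9 to 9 seconds) or in a negative bucket. How the graph draws negative buckets is not shown in this thread, so treat any link to the "NOW" column as a guess.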

Nagios Support Forum