Scheduled events over time piling up on "NOW"

johndoe
Posts: 114
Joined: Fri Oct 28, 2011 10:14 am

Scheduled events over time piling up on "NOW"

Post by johndoe »

It seems my scheduled events over time are somehow piling up on "NOW". We use 99% passive checks, hence the high number of them. See the attachment.

How can I improve this? We get results from machines mostly at 1-minute intervals.
Nagios XI 2012R2.8c Running on Ubuntu 12.04 Using 99% passive checks for monitoring
Monitoring nearly 800 Passive services spread through roughly 40 machines
Running on an 8 core, KVM virtualized VM, with 15 GB of RAM and using RAMDisk
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Scheduled events over time piling up on "NOW"

Post by tmcdonald »

We'll need a bit more context than that.

How many CPU threads do you have total?
How much memory is installed?
Is ndoutils running?

Also, are you sure about that 99%? Because if that is right, then only 1% of the checks are active and it looks like you have 600 or so active checks. And if 600 is 1%, then 59,400 would be the other 99%. That's a lot of passive checks.
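To make that arithmetic explicit, here is a minimal sketch. The function name `implied_passive` is hypothetical and the 600 figure is only the rough estimate read off the graph, not a measured value.

```shell
# implied_passive is a hypothetical helper: given a count of active checks and
# the percentage of all checks they represent, print the implied passive count.
implied_passive() {
    active="$1"
    active_pct="$2"
    # Total checks implied by the active share (integer arithmetic).
    total=$(( active * 100 / active_pct ))
    echo $(( total - active ))
}

implied_passive 600 1   # prints 59400
```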
Former Nagios employee
johndoe
Posts: 114
Joined: Fri Oct 28, 2011 10:14 am

Re: Scheduled events over time piling up on "NOW"

Post by johndoe »

Hi Tmcdonald,

I have 8 cores on this machine, which is a KVM virtualized VM. It has 15 GB of RAM, uses a ramdisk, and, I believe, has all the optimizations mentioned in the Nagios documents I could find.
Actually, the only active checks we do are for two websites (via ping), about once a minute.
We are currently monitoring 731 services on 37 hosts, mostly checked at 1-minute intervals.
NDOUtils (ndo2db) is running, and on the Nagios status page everything is green.

What other info can I provide?

Note: Some time ago, perhaps half a year ago, I remember entries in the logs saying that active checks would be scheduled, something along the lines of "service hasn't been checked for a while, scheduling active check now". I can't find these in the logs anymore when I search for them now. This may be unrelated to the issue, but I thought I would mention it.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Scheduled events over time piling up on "NOW"

Post by scottwilkerson »

Is your DB offloaded to a different server? If so, are the times on the servers synced?
johndoe
Posts: 114
Joined: Fri Oct 28, 2011 10:14 am

Re: Scheduled events over time piling up on "NOW"

Post by johndoe »

No, everything runs on this server.

As for time:

Date/Time

PHP Timezone: UTC 
PHP Time: Fri, 14 Mar 2014 13:06:07 +0000
System Time: Fri, 14 Mar 2014 13:06:07 +0000
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Scheduled events over time piling up on "NOW"

Post by slansing »

System and PHP times look good. Also, what is the output of the following?

ls /usr/local/nagios/var/spool/checkresults/ | wc -l
And could you run this quick test and let us know how the event queue looks afterwards?

service nagios stop

service ndo2db stop

service nagios start
Then wait about 10 seconds, and run:

service ndo2db start
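
The sequence above could be wrapped in one function, roughly as follows. This is only a sketch: `restart_nagios_then_ndo2db`, `RUN`, and `WAIT_SECS` are hypothetical names, and it assumes the same `service` commands shown above. Setting `RUN=echo` previews the commands without executing them.

```shell
# restart_nagios_then_ndo2db is a hypothetical wrapper around the manual steps.
# RUN=echo prints each command instead of running it (dry run).
# WAIT_SECS overrides the pause before ndo2db is started (default: 10).
restart_nagios_then_ndo2db() {
    run="${RUN:-}"
    wait_secs="${WAIT_SECS:-10}"
    $run service nagios stop
    $run service ndo2db stop
    $run service nagios start
    # Give Nagios time to come up before ndo2db is started.
    sleep "$wait_secs"
    $run service ndo2db start
}
```

For a preview only: `RUN=echo WAIT_SECS=0 restart_nagios_then_ndo2db`.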
johndoe
Posts: 114
Joined: Fri Oct 28, 2011 10:14 am

Re: Scheduled events over time piling up on "NOW"

Post by johndoe »

Actually, I have placed that folder (checkresults) on the ramdisk, as directed by one of the performance tuning tutorials, so the count is as follows:

[root@XX checkresults]# ls -lha | wc -l
203
[root@XX checkresults]# pwd
/var/nagiosramdisk/spool/checkresults
[root@XX checkresults]# ls -lha | wc -l
93
Note: the count in the folder you actually asked about is 0, since the check results now go to the ramdisk as mentioned above.
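
A side note on the numbers in the listing above: `ls -lha | wc -l` also counts the `total` line and the `.` and `..` entries, so each figure is three higher than the number of entries in the directory. A minimal sketch that counts only regular files, assuming a `find` that supports `-maxdepth` (GNU and BSD do); `count_spool_files` is a hypothetical name:

```shell
# count_spool_files is a hypothetical helper, not part of Nagios.
# It prints the number of regular files directly inside the given directory,
# ignoring subdirectories and the "." / ".." entries.
count_spool_files() {
    find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}
```

Usage against the ramdisk path shown above: `count_spool_files /var/nagiosramdisk/spool/checkresults`.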

Strangely enough, the value on the graph for NOW still shows the same (slightly above the 576 line). It is odd that it doesn't oscillate and just stays still.

As for the starting and stopping sequence you mentioned, the results are as follows:

[root@XX checkresults]# service nagios stop && service ndo2db stop && service nagios start
Stopping nagios: .done.
Stopping ndo2db: head: cannot open `/usr/local/nagios/var/ndo2db.lock' for reading: No such file or directory
done.
Starting nagios: done.
You have mail in /var/spool/mail/root
[root@XX checkresults]# service ndo2db stop
ndo2db was not running... could not stop
[root@XX checkresults]# service ndo2db start^C
[root@XX checkresults]# service nagios start
Starting nagios: done.
[root@XX checkresults]# service ndo2db start
Starting ndo2db: done.
[root@XX checkresults]# ls -lha | wc -l
173
You have mail in /var/spool/mail/root
[root@XX checkresults]# ls -lha | wc -l
15
[root@XX checkresults]# ls -lha | wc -l
19
After this, the value on the graph was still the same.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Scheduled events over time piling up on "NOW"

Post by slansing »

So it looks like you have mostly passive checks. For your active checks (I'd assume host ping checks, etc.), are they actually being scheduled correctly? Are the times displayed as normal, with checks occurring constantly and being scheduled at their specified intervals? Can you run through the aforementioned restart procedure for NDO and Nagios, and then post the output of the following?

tail -100 /usr/local/nagios/var/nagios.log
johndoe
Posts: 114
Joined: Fri Oct 28, 2011 10:14 am

Re: Scheduled events over time piling up on "NOW"

Post by johndoe »

Slansing,

I previously mentioned a problem which I think might be affecting this: http://support.nagios.com/forum/viewtop ... =6&t=25790

I see a lot of those entries for hosts that are actually transmitting passive checks but that I do not want to configure at this stage. Could this be what is causing the high number on NOW?

The log becomes more or less useless, since it is hard to find anything in it other than those lines.
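
One way to still read the tail of a log flooded like this is to drop the repeated lines before tailing. A rough sketch, where `recent_log_without` is a hypothetical helper and the text to filter out is a placeholder you would replace with a fragment of the actual repeated entry:

```shell
# recent_log_without is a hypothetical helper: print the last N lines of a log
# after dropping every line that contains the given fixed string.
#   $1 = log file, $2 = fixed string to filter out, $3 = line count (default 100)
recent_log_without() {
    logfile="$1"
    noise="$2"
    lines="${3:-100}"
    # -F treats the noise text literally; -v keeps only non-matching lines.
    grep -vF -- "$noise" "$logfile" | tail -n "$lines"
}
```

For example: `recent_log_without /usr/local/nagios/var/nagios.log 'fragment of the repeated entry'`.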
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Scheduled events over time piling up on "NOW"

Post by lmiltchev »

Run the following commands and show the output:

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | head -2
tail -50 /var/log/messages | grep ndo
service nagios status
service ndo2db status
Also, upload the "nagios.cfg" file, so that we can review it.