Lots of NRDS checks and RAM disk filling up
Since implementing just some of the NRDS checks on a number of remote clients, I'm starting to run into issues with the RAM disk filling up very rapidly.
I'm a little lost as to what all to check as I've followed the suggestions from a previous thread without results: http://support.nagios.com/forum/viewtop ... iosramdisk
Any insight would be greatly appreciated.
Currently, I have the following:
# Active Host / Service Checks: 1606 / 2920
# Passive Host / Service Checks: 149 / 9584
Would providing system specs possibly help in resolving?
-
sreinhardt
- -fno-stack-protector
- Posts: 4366
- Joined: Mon Nov 19, 2012 12:10 pm
Re: Lots of NRDS checks and RAM disk filling up
System specs would be very helpful, also current load, what npcd threshold is set to, and what size your ramdisk is presently. I will say, that's a good number of passive checks(~10k), but there is no reason we can't help it along a bit.
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
Re: Lots of NRDS checks and RAM disk filling up
sreinhardt wrote: System specs would be very helpful, also current load, what npcd threshold is set to, and what size your ramdisk is presently. I will say, that's a good number of passive checks(~10k), but there is no reason we can't help it along a bit.
System specs and current load:
Code: Select all
proc]# cat cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 46
model name : Intel(R) Xeon(R) CPU L7555 @ 1.87GHz
stepping : 6
cpu MHz : 1861.533
cache size : 24576 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc pni ssse3 cx16 sse4_1 sse4_2 popcnt lahf_lm
bogomips : 3723.06
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 46
model name : Intel(R) Xeon(R) CPU L7555 @ 1.87GHz
stepping : 6
cpu MHz : 1861.533
cache size : 24576 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc pni ssse3 cx16 sse4_1 sse4_2 popcnt lahf_lm
bogomips : 3942.75
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
Code: Select all
proc]# cat meminfo
MemTotal: 12309660 kB
MemFree: 9377148 kB
Buffers: 245544 kB
Cached: 1355500 kB
SwapCached: 0 kB
Active: 1822980 kB
Inactive: 691240 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 12309660 kB
LowFree: 9377148 kB
SwapTotal: 18972656 kB
SwapFree: 18972656 kB
Dirty: 67148 kB
Writeback: 0 kB
AnonPages: 913132 kB
Mapped: 56700 kB
Slab: 347172 kB
PageTables: 29180 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
CommitLimit: 25127484 kB
Committed_AS: 1708812 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 9512 kB
VmallocChunk: 34359728847 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
Hugepagesize: 2048 kB
Code: Select all
top - 14:11:33 up 3:03, 1 user, load average: 5.87, 6.88, 7.48
Tasks: 238 total, 3 running, 235 sleeping, 0 stopped, 0 zombie
Cpu(s): 53.8%us, 1.5%sy, 0.0%ni, 44.0%id, 0.0%wa, 0.5%hi, 0.2%si, 0.0%st
Mem: 12309660k total, 2853580k used, 9456080k free, 240092k buffers
Swap: 18972656k total, 0k used, 18972656k free, 1331036k cached
In process_perfdata.pl, I raised the timeout from 10 to 15 just this morning:
Code: Select all
TIMEOUT = 15
Code: Select all
load_threshold = 40.0
Code: Select all
tmpfs 75M 75M 8.0K 100% /var/nagiosramdisk
Currently, I have these systems set to respond every 1 min. I'm going to guess that I might need to raise that to at least a couple of minutes to help decrease the load?
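For anyone wanting to watch how fast the RAM disk fills, here is a minimal sketch using only standard `df`, `awk`, and `find`. The mount point is the one from the `df` line above; the function names `use_percent` and `pending_files` are made up for illustration and are not part of Nagios.

```shell
#!/bin/sh
# Mount point taken from the df output in this post; override as needed.
RAMDISK=${RAMDISK:-/var/nagiosramdisk}

# Print the "Capacity" column (percent used, without the % sign) for the
# filesystem that holds the given path. -P forces one line per filesystem.
use_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Count regular files under a directory, e.g. results still waiting
# to be processed.
pending_files() {
    find "$1" -type f 2>/dev/null | wc -l | tr -d ' '
}

# Only report if the RAM disk actually exists on this machine.
if [ -d "$RAMDISK" ]; then
    echo "$RAMDISK: $(use_percent "$RAMDISK")% used, $(pending_files "$RAMDISK") files"
fi
```

Running this from `watch` or a short loop should show whether the file count keeps climbing between processing runs.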
Re: Lots of NRDS checks and RAM disk filling up
jbennett wrote: Currently, I have these systems set to respond every 1 min. I'm going to guess that I might need to raise that to at least a couple of minutes to help decrease the load?
Wouldn't hurt.
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Re: Lots of NRDS checks and RAM disk filling up
I will have to change this manually by running ./installnrds {hostname} {time} on 140 different machines.
This is possible, but in the meantime, I'm wondering whether increasing the size of the RAM disk might help as a temporary fix?
I'm guessing that what's happening here is that the system cannot process these checks within the 1-minute window. If I increase the check interval to something like 3 minutes, it should have more time to work through all of the checks and process them before new ones come in, correct?
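A rough back-of-the-envelope for that reasoning, as a sketch only: it assumes every passive check reports exactly once per interval, which may not match how the clients are really configured. The counts are the passive host and service totals from the first post; `results_per_minute` is a made-up helper name.

```shell
#!/bin/sh
# Approximate inbound passive results per minute.
# $1 = number of passive checks, $2 = reporting interval in minutes.
# Integer division, so the result is rounded down.
results_per_minute() {
    echo $(( $1 / $2 ))
}

# 149 passive host checks + 9584 passive service checks = 9733
passive_total=$(( 149 + 9584 ))

echo "1-minute interval: $(results_per_minute "$passive_total" 1) results/min"   # 9733
echo "3-minute interval: $(results_per_minute "$passive_total" 3) results/min"   # 3244
```

Under that assumption, moving from 1 to 3 minutes cuts the incoming volume to about a third, which gives the perfdata processing more time per result.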
Re: Lots of NRDS checks and RAM disk filling up
I have changed the cron to be every 3 minutes instead of every 1 minute on most of the machines and I have removed the service & host-perfdata files and restarted Nagios.
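In case it helps someone doing the same on many machines, the cron change could be scripted roughly like this. It is a sketch under assumptions: the crontab line shown is a hypothetical example, not copied from a real NRDS install, and `set_cron_minutes` is a made-up helper that only rewrites the minute field of matching, non-comment lines.

```shell
#!/bin/sh
# Read crontab text on stdin and set the minute field to "*/N" on every
# non-comment line containing the given text. Other lines pass through.
# $1 = interval in minutes, $2 = text to look for in the line.
# Note: awk rebuilds a changed line with single spaces between fields.
set_cron_minutes() {
    awk -v n="$1" -v pat="$2" '
        index($0, pat) && $0 !~ /^#/ { $1 = "*/" n }
        { print }
    '
}

# Hypothetical crontab entry, for illustration only.
printf '%s\n' '* * * * * /opt/nrds/send_nrds.pl' | set_cron_minutes 3 send_nrds
# prints: */3 * * * * /opt/nrds/send_nrds.pl
```

On a real client this would be fed from `crontab -l` for the relevant user and the result reviewed before loading it back with `crontab -`.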
I have also deactivated a number of service checks in the meantime.
Am I correct in seeing that I can resize the ram disk with the following command:
Code: Select all
mount -o remount,size=150M tmpfs /tmp/nagiosramdisk
-
sreinhardt
- -fno-stack-protector
- Posts: 4366
- Joined: Mon Nov 19, 2012 12:10 pm
Re: Lots of NRDS checks and RAM disk filling up
Yes I do believe that should work. You might want to copy off the data currently there to be sure it doesn't get lost unless you are not worried about it.
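A cautious way to put the two steps together (copy the data off, then remount) might look like the sketch below. It is dry-run by default and only prints the commands; setting `APPLY=1` would run them and needs root. The mount point is a parameter because the `df` output earlier in the thread shows /var/nagiosramdisk while the command above uses /tmp/nagiosramdisk, so check which one applies. The backup directory and the `resize_ramdisk` and `run` names are illustrative assumptions.

```shell
#!/bin/sh
# Print a command, or execute it when APPLY=1 is set in the environment.
run() {
    if [ "${APPLY:-0}" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

# $1 = tmpfs mount point, $2 = new size (e.g. 150M), $3 = backup directory
resize_ramdisk() {
    run mkdir -p "$3"
    # Copy the current contents off first, preserving attributes.
    run cp -a "$1/." "$3/"
    # Same remount form as the command quoted in this thread.
    run mount -o "remount,size=$2" tmpfs "$1"
    # Confirm the new size.
    run df -h "$1"
}

# Dry run: shows what would be executed.
resize_ramdisk /var/nagiosramdisk 150M /root/nagiosramdisk-backup
```

A remount like this lasts until the next reboot or remount; whatever creates the RAM disk at boot would need the same size to make the change permanent.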
Re: Lots of NRDS checks and RAM disk filling up
Backing the checks off to every 3 minutes has helped prevent the RAM disk from filling up.
Re: Lots of NRDS checks and RAM disk filling up
Sounds good - let us know if you have any more issues.
Be sure to check out our Knowledgebase for helpful articles and solutions!