Monitoring engine not working

Posted: Tue Feb 05, 2013 4:15 am
by wiproltdwiv
Hi Team,

After we apply a configuration, the monitoring engine stops, and external commands also stop automatically. As a result, checks do not run and Nagios looks as if it is on standby. We are currently monitoring approximately 1300 hosts and 6000 services.

Please find attached the error screenshot and the Nagios system profile.

Re: Monitoring engine not working

Posted: Tue Feb 05, 2013 10:07 am
by yancy
wiproltdwiv,

Is there any error message when you apply configuration?

-Yancy

Re: Monitoring engine not working

Posted: Tue Feb 05, 2013 10:55 pm
by wiproltdwiv
No, it shows that the configuration was applied successfully.

Re: Monitoring engine not working

Posted: Wed Feb 06, 2013 10:32 am
by scottwilkerson
Can you post your nagios.cfg?

Also, at what frequency are you monitoring the 1300 hosts and 6000 services?

What are the specs of your server (CPUs, RAM)?

Re: Monitoring engine not working

Posted: Thu Feb 07, 2013 7:43 am
by wiproltdwiv
We have assigned a check interval of 10 to 15 for all hosts and services. I have attached nagios.cfg, and the hardware details are below.

[root@EMSNagios1 etc]# cat /proc/cpuinfo |more
processor : 23
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU E5645 @ 2.40GHz
stepping : 2
cpu MHz : 1600.000
cache size : 12288 KB
physical id : 1
siblings : 12
core id : 10
cpu cores : 6
apicid : 53
initial apicid : 53
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt aes lahf_lm ida arat epb dts tpr_shadow vnmi flexpriority ept vpid
bogomips : 4799.89
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:

[root@EMSNagios1 etc]# uname -a
Linux EMSNagios1.co-opbank.co.in 2.6.32-279.2.1.el6.x86_64 #1 SMP Thu Jul 5 21:08:58 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux

[root@EMSNagios1 etc]# free -g
             total       used       free     shared    buffers     cached
Mem:            31          6         25          0          0          1
-/+ buffers/cache:          4         26
Swap:           15          0         15
[root@EMSNagios1 etc]#
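
For a rough sense of scale, the figures in this thread (1300 hosts, 6000 services, a check interval of 10 to 15) can be turned into an average check rate. This is only a minimal sketch: it assumes the interval is in minutes (the Nagios default when interval_length is 60), that every host and service is checked exactly once per interval, and that checks are spread evenly. The function name checks_per_second is made up for illustration.

```python
def checks_per_second(hosts: int, services: int, interval_minutes: float) -> float:
    """Average active checks per second.

    Assumes each host and each service is checked exactly once per
    interval and that the checks are spread evenly over that interval.
    """
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    total_checks = hosts + services
    return total_checks / (interval_minutes * 60)


if __name__ == "__main__":
    # Figures quoted in the thread: 1300 hosts, 6000 services,
    # check interval of 10 to 15 (assumed to be minutes).
    for minutes in (10, 15):
        rate = checks_per_second(1300, 6000, minutes)
        print(f"{minutes}-minute interval: {rate:.1f} checks/s")
```

Under those assumptions this works out to roughly 8 to 12 checks per second on average. Retries, uneven scheduling and slow plugins can push the real figure higher.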

Re: Monitoring engine not working

Posted: Thu Feb 07, 2013 10:59 am
by yancy
wiproltdwiv,

Are you able to roll back to a working snapshot?

Core Config Manager->Configuration Snapshots


Thanks,

-Yancy

Re: Monitoring engine not working

Posted: Mon Feb 11, 2013 12:40 am
by wiproltdwiv
I can, but I am not getting any error snapshots. All of them are marked as successful.

Re: Monitoring engine not working

Posted: Mon Feb 11, 2013 10:20 am
by mguthrie
It seems like your system could be overtaxed, either with disk I/O or CPU load. The following document might be worth a read.
http://assets.nagios.com/downloads/nagi ... rmance.pdf
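
As a quick first check on the CPU side, the load average can be compared with the number of logical CPUs. The sketch below is only an illustration, not something taken from the linked document: load_per_cpu is a made-up helper, and the 1.0-per-CPU threshold is a common rule of thumb rather than a Nagios limit. It does not measure disk I/O directly, although on Linux processes waiting on disk also count toward the load average.

```python
import os


def load_per_cpu(load_average: float, cpu_count: int) -> float:
    """Load average normalised by the number of logical CPUs."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load_average / cpu_count


if __name__ == "__main__":
    # os.getloadavg() is available on Unix-like systems only.
    one_min, five_min, fifteen_min = os.getloadavg()
    cpus = os.cpu_count() or 1  # cpu_count() may return None

    for label, value in (("1 min", one_min), ("5 min", five_min), ("15 min", fifteen_min)):
        ratio = load_per_cpu(value, cpus)
        # Rule-of-thumb threshold (an assumption): sustained values
        # above 1.0 per CPU suggest the machine is saturated.
        status = "possibly overloaded" if ratio > 1.0 else "ok"
        print(f"{label}: load {value:.2f} on {cpus} CPUs -> {ratio:.2f} per CPU ({status})")
```

Running this on the Nagios server while the monitoring engine is active, and again right after applying a configuration, would show whether the load changes around the time the engine stops.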

Re: Monitoring engine not working

Posted: Wed Feb 13, 2013 5:05 am
by wiproltdwiv
I went through the above document, but given our system's hardware configuration and the attached system status, I don't think we have that much load. Please check and suggest what we need to change.

Re: Monitoring engine not working

Posted: Wed Feb 13, 2013 10:47 am
by slansing
Have you tried rolling back to a previous snapshot, regardless of whether it is stamped with an error or is clean?