Nagios stops working

bushi
Posts: 9
Joined: Wed May 26, 2010 10:36 am

Nagios stops working

Post by bushi »

Hi guys,

we have a problem with one of our Nagios instances.
The behavior is weird: after Nagios is started everything works fine, but after some time no service checks are executed anymore. The Nagios process is still running and uses 100% CPU on one core.
So we enabled debugging and looked deeper into this. As soon as the behavior starts, Nagios does nothing but run the "Event Check Loop", and it does so constantly. The debug log grows by about 100 MB per minute.

In this particular example the last service check was executed on 10-02-2016 23:43:26, and I captured the following entries on Thu Feb 11 09:44:01 2016:
[1455180237.632369] [008.1] [pid=5503] ** Event Check Loop
[1455180237.632373] [008.1] [pid=5503] Next High Priority Event Time: Thu Feb 11 09:44:01 2016
[1455180237.632377] [008.1] [pid=5503] Next Low Priority Event Time: Wed Feb 10 23:42:46 2016
[1455180237.632380] [008.1] [pid=5503] Current/Max Service Checks: 0/0
[1455180237.632382] [024.1] [pid=5503] We're not executing host checks right now, so we'll skip this event.
[1455180237.632384] [001.0] [pid=5503] remove_event()
[1455180237.632386] [064.1] [pid=5503] Making callbacks (type 8)...
[1455180237.632388] [001.0] [pid=5503] reschedule_event()
[1455180237.632390] [001.0] [pid=5503] add_event()
[1455180237.632392] [064.1] [pid=5503] Making callbacks (type 8)...
[1455180237.632400] [064.1] [pid=5503] Making callbacks (type 19)...
So the "Next Low Priority Event Time" is always in the past and shows the time at which Nagios stopped executing service checks.
These entries repeat continuously and are the only content of the debug file; there are no other messages in it.
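
The symptom described above can be checked mechanically. The following is a minimal sketch, not part of Nagios: it pairs each "Next High Priority Event Time" line with the "Next Low Priority Event Time" line that follows it and reports how far the low priority time lags behind. The function name `low_priority_lag` is made up for illustration, and the date parsing assumes an English/C locale, since the day and month names in the log are English.

```python
import re
from datetime import datetime

# Matches the "Next High/Low Priority Event Time" lines of one event loop pass.
EVENT_TIME = re.compile(
    r"Next (High|Low) Priority Event Time:\s+"
    r"(\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2} \d{4})"
)


def low_priority_lag(lines):
    """Yield (high_time, low_time, lag_seconds) for each event loop pass."""
    high = None
    for line in lines:
        match = EVENT_TIME.search(line)
        if not match:
            continue
        when = datetime.strptime(match.group(2), "%a %b %d %H:%M:%S %Y")
        if match.group(1) == "High":
            high = when
        elif high is not None:
            # Both times are printed in local time, so no timezone math is needed.
            yield high, when, (high - when).total_seconds()
            high = None


# Two lines copied from the debug output above.
sample = [
    "[1455180237.632373] [008.1] [pid=5503] Next High Priority Event Time: Thu Feb 11 09:44:01 2016",
    "[1455180237.632377] [008.1] [pid=5503] Next Low Priority Event Time: Wed Feb 10 23:42:46 2016",
]
for high, low, lag in low_priority_lag(sample):
    print(f"low priority queue lags by {lag:.0f} s")
```

For the two sample lines the lag is 36075 seconds, roughly ten hours. A lag that keeps growing while the low priority time never changes matches the stuck state described here; what threshold counts as "stuck" is a judgment call.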

I searched Google but found only one message about the same problem: http://permalink.gmane.org/gmane.networ ... user/73941
That message got no answer.

We are trying to capture a log that contains some entries from before the problem occurs (which is not so easy because of the amount of data being written). Maybe we will find the cause there.
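
One way to keep only the interesting part of such a large log is a ring buffer: stream the file, remember the last few thousand lines, and stop once the loop pattern shows up. The sketch below is an illustration under assumptions, not a tested tool. The function name `context_before_stall`, the buffer size, and the rule "three consecutive event loop passes that contain the skip message" are all made up for this example; the two marker strings are taken from the debug output in this thread.

```python
from collections import deque

LOOP_HEADER = "** Event Check Loop"
SKIP_MARKER = "We're not executing host checks right now"


def context_before_stall(lines, keep=5000, threshold=3):
    """Return the last `keep` lines seen once `threshold` consecutive
    event loop passes contained the skip message, or None if that never happens."""
    recent = deque(maxlen=keep)   # ring buffer, old lines fall out automatically
    consecutive = 0               # consecutive passes that contained the skip message
    skipped = False               # did the current pass contain the skip message?
    for line in lines:
        if LOOP_HEADER in line:
            # A new pass starts, so evaluate the pass that just ended.
            consecutive = consecutive + 1 if skipped else 0
            skipped = False
            if consecutive >= threshold:
                return list(recent)
        elif SKIP_MARKER in line:
            skipped = True
        recent.append(line)
    return None
```

Because a file object is iterable line by line, something like `context_before_stall(open(path, errors="replace"))` would process the log without loading it into memory, where `path` stands for wherever the debug file is written.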

Do you have any idea how to fix this problem?

Some information about this Nagios instance:
SLES 11 SP4 (Linux xxx 3.0.101-68-default #1 SMP Tue Dec 1 16:21:37 UTC 2015 (ed01a9f) x86_64 x86_64 x86_64 GNU/Linux)
Nagios Stats 3.5.1
Copyright (c) 2003-2008 Ethan Galstad (http://www.nagios.org)
Last Modified: 08-30-2013
License: GPL

CURRENT STATUS DATA
------------------------------------------------------
Status File: /dev/shm/status.dat
Status File Age: 0d 0h 0m 4s
Status File Version: 3.5.1

Program Running Time: 0d 0h 21m 34s
Nagios PID: 56643
Used/High/Total Command Buffers: 0 / 0 / 4096

Total Services: 2280
Services Checked: 2237
Services Scheduled: 2191
Services Actively Checked: 2280
Services Passively Checked: 0
Total Service State Change: 0.000 / 26.640 / 0.080 %
Active Service Latency: 0.000 / 0.419 / 0.133 sec
Active Service Execution Time: 0.000 / 123.864 / 1.444 sec
Active Service State Change: 0.000 / 26.640 / 0.080 %
Active Services Last 1/5/15/60 min: 355 / 1598 / 1987 / 2140
Passive Service Latency: 0.000 / 0.000 / 0.000 sec
Passive Service State Change: 0.000 / 0.000 / 0.000 %
Passive Services Last 1/5/15/60 min: 0 / 0 / 0 / 0
Services Ok/Warn/Unk/Crit: 2196 / 30 / 2 / 52
Services Flapping: 0
Services In Downtime: 0

Total Hosts: 285
Hosts Checked: 285
Hosts Scheduled: 0
Hosts Actively Checked: 284
Host Passively Checked: 1
Total Host State Change: 0.000 / 10.260 / 0.036 %
Active Host Latency: 0.000 / 16.771 / 0.199 sec
Active Host Execution Time: 0.133 / 29.189 / 0.826 sec
Active Host State Change: 0.000 / 10.260 / 0.036 %
Active Hosts Last 1/5/15/60 min: 1 / 1 / 3 / 11
Passive Host Latency: 0.519 / 0.519 / 0.519 sec
Passive Host State Change: 0.000 / 0.000 / 0.000 %
Passive Hosts Last 1/5/15/60 min: 0 / 0 / 0 / 0
Hosts Up/Down/Unreach: 284 / 1 / 0
Hosts Flapping: 0
Hosts In Downtime: 0

Active Host Checks Last 1/5/15 min: 4 / 45 / 168
Scheduled: 0 / 0 / 0
On-demand: 4 / 45 / 168
Parallel: 0 / 1 / 5
Serial: 0 / 0 / 0
Cached: 3 / 44 / 163
Passive Host Checks Last 1/5/15 min: 0 / 0 / 0
Active Service Checks Last 1/5/15 min: 358 / 1668 / 5070
Scheduled: 358 / 1668 / 5070
On-demand: 0 / 0 / 0
Cached: 0 / 0 / 0
Passive Service Checks Last 1/5/15 min: 0 / 0 / 0

External Commands Last 1/5/15 min: 0 / 0 / 0
[1455187515] Nagios 3.5.1 starting... (PID=56642)
[1455187515] Local time is Thu Feb 11 11:45:15 CET 2016
[1455187515] LOG VERSION: 2.0
[1455187515] livestatus: Livestatus 1.2.4p4 by Mathias Kettner. Socket: '/tmp/.watchit.livestatus'
[1455187515] livestatus: Please visit us at http://mathias-kettner.de/
[1455187515] livestatus: Hint: please try out OMD - the Open Monitoring Distribution
[1455187515] livestatus: Please visit OMD at http://omdistro.org
[1455187515] livestatus: Finished initialization. Further log messages go to /opt/nagios/var/livestatus.log
[1455187515] Event broker module '/opt/nagios/lib/mk-livestatus/livestatus.o' initialized successfully.
[1455187515] Finished daemonizing... (New PID=56643)
hsmith
Agent Smith
Posts: 3539
Joined: Thu Jul 30, 2015 11:09 am
Location: 127.0.0.1

Re: Nagios stops working

Post by hsmith »

Are you running Nagios, or a fork of Nagios? The support we can provide for forks is limited. It also would appear you're running Livestatus, which is another piece of software we're not able to provide support for. I looked around for some information about the bug you're experiencing, and a site for one of the forks of Nagios came up, so it may be a problem with their implementation if you're running that. Let us know, thanks!
bushi
Posts: 9
Joined: Wed May 26, 2010 10:36 am

Re: Nagios stops working

Post by bushi »

Thanks for your answer!

No, we are not using a forked Nagios version, just standard Nagios 3.5.1. Nothing was modified in the code or the web interface.
I managed to capture a full debug log (70 GB) that includes the data from before the looping starts, and I will start analyzing it now.

I have now removed objects.cache and retention.dat; let's see if that helps.
My next step will be disabling Livestatus.
bushi
Posts: 9
Joined: Wed May 26, 2010 10:36 am

Re: Nagios stops working

Post by bushi »

Here is the debug output from exactly the time when the Event Check Loop starts.
I also included the last service check entry; the weird behavior starts at "1455206156.157168".

[1455206156.155726] [008.1] [pid=63705] ** Event Check Loop
[1455206156.155736] [008.1] [pid=63705] Next High Priority Event Time: Thu Feb 11 16:55:56 2016
[1455206156.155742] [008.1] [pid=63705] Next Low Priority Event Time:  Thu Feb 11 16:55:56 2016
[1455206156.155745] [008.1] [pid=63705] Current/Max Service Checks: 2/0
[1455206156.155751] [001.0] [pid=63705] handle_timed_event() start
[1455206156.155753] [064.1] [pid=63705] Making callbacks (type 8)...
[1455206156.155756] [064.2] [pid=63705] Callback #1 (type 8) return code = 0
[1455206156.155761] [008.0] [pid=63705] ** Timed Event ** Type: EVENT_COMMAND_CHECK, Run Time: Thu Feb 11 16:55:56 2016
[1455206156.155765] [008.0] [pid=63705] ** External Command Check Event
[1455206156.155768] [001.0] [pid=63705] check_for_external_commands()
[1455206156.155771] [064.1] [pid=63705] Making callbacks (type 18)...
[1455206156.155774] [001.0] [pid=63705] handle_timed_event() end
[1455206156.155777] [001.0] [pid=63705] reschedule_event()
[1455206156.155779] [001.0] [pid=63705] add_event()
[1455206156.155782] [064.1] [pid=63705] Making callbacks (type 8)...
[1455206156.155785] [064.2] [pid=63705] Callback #1 (type 8) return code = 0
[1455206156.155787] [008.1] [pid=63705] ** Event Check Loop
[1455206156.155792] [008.1] [pid=63705] Next High Priority Event Time: Thu Feb 11 16:56:01 2016
[1455206156.155797] [008.1] [pid=63705] Next Low Priority Event Time:  Thu Feb 11 16:55:56 2016
[1455206156.155800] [008.1] [pid=63705] Current/Max Service Checks: 2/0
[1455206156.155806] [008.1] [pid=63705] Running event...
[1455206156.155809] [001.0] [pid=63705] handle_timed_event() start
[1455206156.155812] [064.1] [pid=63705] Making callbacks (type 8)...
[1455206156.155815] [064.2] [pid=63705] Callback #1 (type 8) return code = 0
[1455206156.155820] [008.0] [pid=63705] ** Timed Event ** Type: EVENT_SERVICE_CHECK, Run Time: Thu Feb 11 16:55:56 2016
[1455206156.155830] [008.0] [pid=63705] ** Service Check Event ==> Host: 'hostname', Service: 'system-info', Options: 0, Latency: 0.155000 sec
[1455206156.155835] [001.0] [pid=63705] run_scheduled_service_check() start
[1455206156.155838] [016.0] [pid=63705] Attempting to run scheduled check of service 'system-info' on host 'hostname': check options=0, latency=0.155000
[1455206156.155845] [001.0] [pid=63705] run_async_service_check()
[1455206156.155849] [001.0] [pid=63705] check_service_check_viability()
[1455206156.155865] [001.0] [pid=63705] check_time_against_period()
[1455206156.155871] [001.0] [pid=63705] check_service_dependencies()
[1455206156.155876] [064.1] [pid=63705] Making callbacks (type 13)...
[1455206156.155880] [064.2] [pid=63705] Callback #1 (type 13) return code = 0
[1455206156.155883] [016.0] [pid=63705] Checking service 'system-info' on host 'hostname'...
[1455206156.155893] [001.0] [pid=63705] get_raw_command_line_r()
[1455206156.155897] [2320.2] [pid=63705] Raw Command Input: /opt/nagios/libexec/plugin -h '$HOSTNAME$'  -s '$SERVICEDESC$' -Cmd  'snmpget -Oqvt $HOSTADDRESS$ -v 1 -c $_HOSTCOMMUNITY$ .1.3.6.1.2.1.1.1.0 | head -1 && snmpget -Oqvt $HOSTADDRESS$ -v 1 -c $_HOSTCOMMUNITY$ .1.3.6.1.2.1.1.3.0' -q "n.a." -Text '@1<br>uptime @sprintf ("%.2f days",@2/(24*3600*100))' -CsvFields '@sprintf ("%.2f",@2/(24*3600*100))'
[1455206156.155901] [2320.2] [pid=63705] Expanded Command Output: /opt/nagios/libexec/plugin -h '$HOSTNAME$'  -s '$SERVICEDESC$' -Cmd  'snmpget -Oqvt $HOSTADDRESS$ -v 1 -c $_HOSTCOMMUNITY$ .1.3.6.1.2.1.1.1.0 | head -1 && snmpget -Oqvt $HOSTADDRESS$ -v 1 -c $_HOSTCOMMUNITY$ .1.3.6.1.2.1.1.3.0' -q "n.a." -Text '@1<br>uptime @sprintf ("%.2f days",@2/(24*3600*100))' -CsvFields '@sprintf ("%.2f",@2/(24*3600*100))'
[1455206156.155904] [001.0] [pid=63705] process_macros_r()
[1455206156.155907] [2048.1] [pid=63705] **** BEGIN MACRO PROCESSING ***********
[1455206156.155910] [2048.1] [pid=63705] Processing: '/opt/nagios/libexec/plugin -h '$HOSTNAME$'  -s '$SERVICEDESC$' -Cmd  'snmpget -Oqvt $HOSTADDRESS$ -v 1 -c $_HOSTCOMMUNITY$ .1.3.6.1.2.1.1.1.0 | head -1 && snmpget -Oqvt $HOSTADDRESS$ -v 1 -c $_HOSTCOMMUNITY$ .1.3.6.1.2.1.1.3.0' -q "n.a." -Text '@1<br>uptime @sprintf ("%.2f days",@2/(24*3600*100))' -CsvFields '@sprintf ("%.2f",@2/(24*3600*100))''
[1455206156.155913] [2048.2] [pid=63705]   Processing part: '/opt/nagios/libexec/plugin -h ''
[1455206156.155918] [2048.2] [pid=63705]   Not currently in macro.  Running output (48): '/opt/nagios/libexec/plugin -h ''
[1455206156.155921] [2048.2] [pid=63705]   Processing part: 'HOSTNAME'
[1455206156.155926] [2048.2] [pid=63705]   macros[0] (HOSTNAME) match.
[1455206156.155931] [2048.2] [pid=63705]   Processed 'HOSTNAME', Clean Options: 0, Free: 1
[1455206156.155934] [2048.2] [pid=63705]   Processed 'HOSTNAME', Clean Options: 0, Free: 1
[1455206156.155936] [2048.2] [pid=63705]   Cleaning options: global=0, local=0, effective=0
[1455206156.155940] [2048.2] [pid=63705]   Uncleaned macro.  Running output (63): '/opt/nagios/libexec/plugin -h 'hostname'
[1455206156.155943] [2048.2] [pid=63705]   Just finished macro.  Running output (63): '/opt/nagios/libexec/plugin -h 'hostname'
[1455206156.155946] [2048.2] [pid=63705]   Processing part: ''  -s ''
[1455206156.155948] [2048.2] [pid=63705]   Not currently in macro.  Running output (70): '/opt/nagios/libexec/plugin -h 'hostname'  -s ''
[1455206156.155951] [2048.2] [pid=63705]   Processing part: 'SERVICEDESC'
[1455206156.155955] [2048.2] [pid=63705]   macros[3] (SERVICEDESC) match.
[1455206156.155958] [2048.2] [pid=63705]   Processed 'SERVICEDESC', Clean Options: 0, Free: 1
[1455206156.155961] [2048.2] [pid=63705]   Processed 'SERVICEDESC', Clean Options: 0, Free: 1
[1455206156.155964] [2048.2] [pid=63705]   Cleaning options: global=0, local=0, effective=0
[1455206156.155967] [2048.2] [pid=63705]   Uncleaned macro.  Running output (81): '/opt/nagios/libexec/plugin -h 'hostname'  -s 'system-info'
[1455206156.155970] [2048.2] [pid=63705]   Just finished macro.  Running output (81): '/opt/nagios/libexec/plugin -h 'hostname'  -s 'system-info'
[1455206156.155972] [2048.2] [pid=63705]   Processing part: '' -Cmd  'snmpget -Oqvt '
[1455206156.155975] [2048.2] [pid=63705]   Not currently in macro.  Running output (104): '/opt/nagios/libexec/plugin -h 'hostname'  -s 'system-info' -Cmd  'snmpget -Oqvt '
[1455206156.155985] [2048.2] [pid=63705]   Processing part: 'HOSTADDRESS'
[1455206156.155988] [2048.2] [pid=63705]   Processed 'HOSTADDRESS', Clean Options: 0, Free: 0
[1455206156.155991] [2048.2] [pid=63705]   Processed 'HOSTADDRESS', Clean Options: 0, Free: 0
[1455206156.155994] [2048.2] [pid=63705]   Cleaning options: global=0, local=0, effective=0
[1455206156.155997] [2048.2] [pid=63705]   Uncleaned macro.  Running output (119): '/opt/nagios/libexec/plugin -h 'hostname'  -s 'system-info' -Cmd  'snmpget -Oqvt hostname'
[1455206156.156000] [2048.2] [pid=63705]   Just finished macro.  Running output (119): '/opt/nagios/libexec/plugin -h 'hostname'  -s 'system-info' -Cmd  'snmpget -Oqvt hostname'
[1455206156.156003] [2048.2] [pid=63705]   Processing part: ' -v 1 -c '
[1455206156.156005] [2048.2] [pid=63705]   Not currently in macro.  Running output (128): '/opt/nagios/libexec/plugin -h 'hostname'  -s 'system-info' -Cmd  'snmpget -Oqvt hostname -v 1 -c '
[1455206156.156008] [2048.2] [pid=63705]   Processing part: '_HOSTCOMMUNITY'
[1455206156.156012] [2048.2] [pid=63705]   Processed '_HOSTCOMMUNITY', Clean Options: 0, Free: 1
[1455206156.156015] [2048.2] [pid=63705]   Processed '_HOSTCOMMUNITY', Clean Options: 0, Free: 1
[1455206156.156018] [2048.2] [pid=63705]   Cleaning options: global=0, local=0, effective=0
[1455206156.156021] [2048.2] [pid=63705]   Uncleaned macro.  Running output (141): '/opt/nagios/libexec/plugin -h 'hostname'  -s 'system-info' -Cmd  'snmpget -Oqvt hostname -v 1 -c public'
[1455206156.156024] [2048.2] [pid=63705]   Just finished macro.  Running output (141): '/opt/nagios/libexec/plugin -h 'hostname'  -s 'system-info' -Cmd  'snmpget -Oqvt hostname -v 1 -c public'
[1455206156.156026] [2048.2] [pid=63705]   Processing part: ' .1.3.6.1.2.1.1.1.0 | head -1 && snmpget -Oqvt '
[1455206156.156030] [2048.2] [pid=63705]   Not currently in macro.  Running output (188): '/opt/nagios/libexec/plugin -h 'hostname'  -s 'system-info' -Cmd  'snmpget -Oqvt hostname -v 1 -c public .1.3.6.1.2.1.1.1.0 | head -1 && snmpget -Oqvt '
[1455206156.156033] [2048.2] [pid=63705]   Processing part: 'HOSTADDRESS'
[1455206156.156035] [2048.2] [pid=63705]   Processed 'HOSTADDRESS', Clean Options: 0, Free: 0
[1455206156.156038] [2048.2] [pid=63705]   Processed 'HOSTADDRESS', Clean Options: 0, Free: 0
[1455206156.156041] [2048.2] [pid=63705]   Cleaning options: global=0, local=0, effective=0
[1455206156.156044] [2048.2] [pid=63705]   Uncleaned macro.  Running output (203): '/opt/nagios/libexec/plugin -h 'hostname'  -s 'system-info' -Cmd  'snmpget -Oqvt hostname -v 1 -c public .1.3.6.1.2.1.1.1.0 | head -1 && snmpget -Oqvt hostname'
[1455206156.156047] [2048.2] [pid=63705]   Just finished macro.  Running output (203): '/opt/nagios/libexec/plugin -h 'hostname'  -s 'system-info' -Cmd  'snmpget -Oqvt hostname -v 1 -c public .1.3.6.1.2.1.1.1.0 | head -1 && snmpget -Oqvt hostname'
[1455206156.156049] [2048.2] [pid=63705]   Processing part: ' -v 1 -c '
[1455206156.156052] [2048.2] [pid=63705]   Not currently in macro.  Running output (212): '/opt/nagios/libexec/plugin -h 'hostname'  -s 'system-info' -Cmd  'snmpget -Oqvt hostname -v 1 -c public .1.3.6.1.2.1.1.1.0 | head -1 && snmpget -Oqvt hostname -v 1 -c '
[1455206156.156055] [2048.2] [pid=63705]   Processing part: '_HOSTCOMMUNITY'
[1455206156.156058] [2048.2] [pid=63705]   Processed '_HOSTCOMMUNITY', Clean Options: 0, Free: 1
[1455206156.156061] [2048.2] [pid=63705]   Processed '_HOSTCOMMUNITY', Clean Options: 0, Free: 1
[1455206156.156063] [2048.2] [pid=63705]   Cleaning options: global=0, local=0, effective=0
[1455206156.156066] [2048.2] [pid=63705]   Uncleaned macro.  Running output (225): '/opt/nagios/libexec/plugin -h 'hostname'  -s 'system-info' -Cmd  'snmpget -Oqvt hostname -v 1 -c public .1.3.6.1.2.1.1.1.0 | head -1 && snmpget -Oqvt hostname -v 1 -c public'
[1455206156.156075] [2048.2] [pid=63705]   Just finished macro.  Running output (225): '/opt/nagios/libexec/plugin -h 'hostname'  -s 'system-info' -Cmd  'snmpget -Oqvt hostname -v 1 -c public .1.3.6.1.2.1.1.1.0 | head -1 && snmpget -Oqvt hostname -v 1 -c public'
[1455206156.156078] [2048.2] [pid=63705]   Processing part: ' .1.3.6.1.2.1.1.3.0' -q "n.a." -Text '@1<br>uptime @sprintf ("%.2f days",@2/(24*3600*100))' -CsvFields '@sprintf ("%.2f",@2/(24*3600*100))''
[1455206156.156082] [2048.2] [pid=63705]   Not currently in macro.  Running output (364): '/opt/nagios/libexec/plugin -h 'hostname'  -s 'system-info' -Cmd  'snmpget -Oqvt hostname -v 1 -c public .1.3.6.1.2.1.1.1.0 | head -1 && snmpget -Oqvt hostname -v 1 -c public .1.3.6.1.2.1.1.3.0' -q "n.a." -Text '@1<br>uptime @sprintf ("%.2f days",@2/(24*3600*100))' -CsvFields '@sprintf ("%.2f",@2/(24*3600*100))''
[1455206156.156085] [2048.1] [pid=63705]   Done.  Final output: '/opt/nagios/libexec/plugin -h 'hostname'  -s 'system-info' -Cmd  'snmpget -Oqvt hostname -v 1 -c public .1.3.6.1.2.1.1.1.0 | head -1 && snmpget -Oqvt hostname -v 1 -c public .1.3.6.1.2.1.1.3.0' -q "n.a." -Text '@1<br>uptime @sprintf ("%.2f days",@2/(24*3600*100))' -CsvFields '@sprintf ("%.2f",@2/(24*3600*100))''
[1455206156.156088] [2048.1] [pid=63705] **** END MACRO PROCESSING *************
[1455206156.156091] [064.1] [pid=63705] Making callbacks (type 13)...
[1455206156.156094] [064.2] [pid=63705] Callback #1 (type 13) return code = 0
[1455206156.156111] [016.1] [pid=63705] Check result output will be written to '/dev/shm/checkLBqvPm' (fd=8)
[1455206156.157052] [016.2] [pid=63705] Service check is executing in child process (pid=20103)
[1455206156.157091] [001.0] [pid=63705] handle_timed_event() end
[1455206156.157098] [008.1] [pid=63705] ** Event Check Loop
[1455206156.157111] [008.1] [pid=63705] Next High Priority Event Time: Thu Feb 11 16:56:01 2016
[1455206156.157117] [008.1] [pid=63705] Next Low Priority Event Time:  Thu Feb 11 16:55:56 2016
[1455206156.157120] [008.1] [pid=63705] Current/Max Service Checks: 3/0
[1455206156.157125] [024.1] [pid=63705] We're not executing host checks right now, so we'll skip this event.
[1455206156.157128] [001.0] [pid=63705] remove_event()
[1455206156.157131] [064.1] [pid=63705] Making callbacks (type 8)...
[1455206156.157139] [064.2] [pid=63705] Callback #1 (type 8) return code = 0
[1455206156.157150] [001.0] [pid=63705] reschedule_event()
[1455206156.157153] [001.0] [pid=63705] add_event()
[1455206156.157156] [064.1] [pid=63705] Making callbacks (type 8)...
[1455206156.157159] [064.2] [pid=63705] Callback #1 (type 8) return code = 0
[1455206156.157162] [064.1] [pid=63705] Making callbacks (type 19)...
[1455206156.157165] [064.2] [pid=63705] Callback #1 (type 19) return code = 0
[1455206156.157168] [008.1] [pid=63705] ** Event Check Loop
[1455206156.157173] [008.1] [pid=63705] Next High Priority Event Time: Thu Feb 11 16:56:01 2016
[1455206156.157178] [008.1] [pid=63705] Next Low Priority Event Time:  Thu Feb 11 16:55:56 2016
[1455206156.157181] [008.1] [pid=63705] Current/Max Service Checks: 3/0
[1455206156.157185] [024.1] [pid=63705] We're not executing host checks right now, so we'll skip this event.
[1455206156.157188] [001.0] [pid=63705] remove_event()
[1455206156.157190] [064.1] [pid=63705] Making callbacks (type 8)...
[1455206156.157193] [064.2] [pid=63705] Callback #1 (type 8) return code = 0
[1455206156.157196] [001.0] [pid=63705] reschedule_event()
[1455206156.157198] [001.0] [pid=63705] add_event()
[1455206156.157201] [064.1] [pid=63705] Making callbacks (type 8)...
[1455206156.157204] [064.2] [pid=63705] Callback #1 (type 8) return code = 0
[1455206156.157207] [064.1] [pid=63705] Making callbacks (type 19)...
[1455206156.157218] [064.2] [pid=63705] Callback #1 (type 19) return code = 0
[1455206156.157221] [008.1] [pid=63705] ** Event Check Loop
[1455206156.157226] [008.1] [pid=63705] Next High Priority Event Time: Thu Feb 11 16:56:01 2016
[1455206156.157231] [008.1] [pid=63705] Next Low Priority Event Time:  Thu Feb 11 16:55:56 2016
[1455206156.157234] [008.1] [pid=63705] Current/Max Service Checks: 3/0
[1455206156.157238] [024.1] [pid=63705] We're not executing host checks right now, so we'll skip this event.
[1455206156.157241] [001.0] [pid=63705] remove_event()
[1455206156.157250] [064.1] [pid=63705] Making callbacks (type 8)...
[1455206156.157252] [064.2] [pid=63705] Callback #1 (type 8) return code = 0
[1455206156.157255] [001.0] [pid=63705] reschedule_event()
[1455206156.157258] [001.0] [pid=63705] add_event()
[1455206156.157260] [064.1] [pid=63705] Making callbacks (type 8)...
[1455206156.157263] [064.2] [pid=63705] Callback #1 (type 8) return code = 0
[1455206156.157266] [064.1] [pid=63705] Making callbacks (type 19)...
[1455206156.157268] [064.2] [pid=63705] Callback #1 (type 19) return code = 0
[1455206156.157272] [008.1] [pid=63705] ** Event Check Loop
[1455206156.157278] [008.1] [pid=63705] Next High Priority Event Time: Thu Feb 11 16:56:01 2016
[1455206156.157283] [008.1] [pid=63705] Next Low Priority Event Time:  Thu Feb 11 16:55:56 2016
[1455206156.157286] [008.1] [pid=63705] Current/Max Service Checks: 3/0
[1455206156.157289] [024.1] [pid=63705] We're not executing host checks right now, so we'll skip this event.
[1455206156.157292] [001.0] [pid=63705] remove_event()
[1455206156.157294] [064.1] [pid=63705] Making callbacks (type 8)...
[1455206156.157297] [064.2] [pid=63705] Callback #1 (type 8) return code = 0
[1455206156.157300] [001.0] [pid=63705] reschedule_event()
[1455206156.157302] [001.0] [pid=63705] add_event()
[1455206156.157305] [064.1] [pid=63705] Making callbacks (type 8)...
[1455206156.157307] [064.2] [pid=63705] Callback #1 (type 8) return code = 0
[1455206156.157310] [064.1] [pid=63705] Making callbacks (type 19)...
[1455206156.157313] [064.2] [pid=63705] Callback #1 (type 19) return code = 0
[1455206156.157315] [008.1] [pid=63705] ** Event Check Loop
[1455206156.157320] [008.1] [pid=63705] Next High Priority Event Time: Thu Feb 11 16:56:01 2016
[1455206156.157325] [008.1] [pid=63705] Next Low Priority Event Time:  Thu Feb 11 16:55:56 2016
[1455206156.157328] [008.1] [pid=63705] Current/Max Service Checks: 3/0
[1455206156.157331] [024.1] [pid=63705] We're not executing host checks right now, so we'll skip this event.
[1455206156.157334] [001.0] [pid=63705] remove_event()
[1455206156.157337] [064.1] [pid=63705] Making callbacks (type 8)...
[1455206156.157339] [064.2] [pid=63705] Callback #1 (type 8) return code = 0
[1455206156.157342] [001.0] [pid=63705] reschedule_event()
[1455206156.157344] [001.0] [pid=63705] add_event()
[1455206156.157347] [064.1] [pid=63705] Making callbacks (type 8)...
[1455206156.157350] [064.2] [pid=63705] Callback #1 (type 8) return code = 0
[1455206156.157352] [064.1] [pid=63705] Making callbacks (type 19)...
[1455206156.157355] [064.2] [pid=63705] Callback #1 (type 19) return code = 0
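
For anyone reading the output above: the second bracket of each line, such as [024.1] or [2320.2], appears to be the debug area bitmask followed by the verbosity. The sketch below decodes it using the bit values documented for the `debug_level` option in nagios.cfg. The function name `decode_tag` is made up for illustration, and reading the field as "bitmask.verbosity" is my interpretation of the log format rather than something stated in this thread.

```python
import re

# Bit values of the debug_level option as documented for nagios.cfg.
DEBUG_AREAS = {
    1: "function enter/exit",
    2: "configuration",
    4: "process",
    8: "scheduled events",
    16: "host/service checks",
    32: "notifications",
    64: "event broker",
    128: "external commands",
    256: "commands",
    512: "scheduled downtime",
    1024: "comments",
    2048: "macros",
}

# "[epoch.micros] [level.verbosity] [pid=N]" at the start of a debug line.
TAG = re.compile(r"^\[(\d+)\.(\d+)\] \[(\d+)\.(\d)\] \[pid=(\d+)\]")


def decode_tag(line):
    """Return (list of debug area names, verbosity) or None if the line has no tag."""
    match = TAG.match(line)
    if match is None:
        return None
    level = int(match.group(3))
    areas = [name for bit, name in DEBUG_AREAS.items() if level & bit]
    return areas, int(match.group(4))


print(decode_tag("[1455206156.157125] [024.1] [pid=63705] We're not executing host checks right now, so we'll skip this event."))
```

Read this way, the repeating "We're not executing host checks right now" line belongs to the scheduled events and host/service checks areas (8 + 16 = 24), and the long command lines tagged 2320 combine macros, commands and checks (2048 + 256 + 16).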
Last edited by hsmith on Fri Feb 12, 2016 10:43 am, edited 1 time in total.
Reason: Changed [quote] to [code] for readability.
hsmith
Agent Smith
Posts: 3539
Joined: Thu Jul 30, 2015 11:09 am
Location: 127.0.0.1

Re: Nagios stops working

Post by hsmith »

If you're able to disable Livestatus, that would be a good test to run. I also found an interesting email chain about this that you may want to take a look at: https://www.mail-archive.com/nagios-use ... 23059.html
bushi
Posts: 9
Joined: Wed May 26, 2010 10:36 am

Re: Nagios stops working

Post by bushi »

Nagios ran through the whole weekend without problems.
So deleting objects.cache and the retention file solved the problem for me. As far as I remember, I already saw this behavior some years ago. I'll make a note to delete those files when making big configuration changes.
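
For reference, the cleanup can be scripted so the old files are renamed instead of deleted, which leaves a way back. This is only a sketch: the helper `set_aside` is made up for illustration, and the two paths in the usage comment are assumptions, since the real locations are whatever `object_cache_file` and `state_retention_file` point to in nagios.cfg. It presumably only makes sense with Nagios stopped, because the retention file would otherwise be written again on shutdown.

```python
import shutil
import time
from pathlib import Path


def set_aside(paths, suffix=None):
    """Rename each existing file to <name>.<suffix> and return the new paths."""
    suffix = suffix or time.strftime("%Y%m%d-%H%M%S")
    moved = []
    for path in map(Path, paths):
        if path.exists():
            target = path.with_name(f"{path.name}.{suffix}")
            shutil.move(str(path), str(target))
            moved.append(target)
    return moved


# Hypothetical usage, with Nagios stopped and paths adjusted to the local setup:
# set_aside(["/opt/nagios/var/objects.cache", "/opt/nagios/var/retention.dat"])
```

Files that do not exist are skipped silently, so the call is safe to repeat.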

Thanks for your help.
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Nagios stops working

Post by tmcdonald »

I'll be closing this thread now, but feel free to open another if you need anything in the future!