"Last Check" column on web-interface

Posted: Fri Nov 07, 2014 1:31 pm
by vvz
Hello!
Nagios has been running for about a year and everything has been fine; it does the job.
A few days ago I noticed that the values in the "Last Check" column of the web interface are quite old.
If the current time is 15:15, about 80% of the services show a "Last Check" time of around 14:26 (some even older).
But if I open the Service State Information window for a service, the "Last Update:" value is very close to the current time.
The "Performance Data:" line is empty for third-party plugins run through NRPE; for the plugins that came with Nagios everything is as it should be.

I already had this problem a few months ago and solved it this way: I stopped Nagios, deleted all files in the checkresults folder, started Nagios, and for about 10 minutes kept deleting every file that appeared in the checkresults folder.
I don't know whether that was the right way to solve it or I was just lucky, but the problem went away.

Can you suggest any other way?

I also noticed that this problem arises when I restart Nagios very often (I did that for some tests last week).
Thank you for your help.
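
For reference, the workaround described in this post (stop Nagios, empty the checkresults folder, start Nagios) could be sketched roughly as below. This is only an illustration, not an official procedure: the helper names `checkresult_path` and `clear_checkresults` are made up for this example, and the config path and `service` commands in the usage comment are assumptions that depend on how Nagios was installed.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Nagios itself.

# Print the check_result_path value configured in the given nagios.cfg.
checkresult_path() {
    sed -n 's/^[[:space:]]*check_result_path[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}

# Delete leftover check result files directly inside the spool directory.
# Only run this while Nagios is stopped.
clear_checkresults() {
    spool_dir=$1
    [ -d "$spool_dir" ] || { echo "no such directory: $spool_dir" >&2; return 1; }
    find "$spool_dir" -maxdepth 1 -type f -exec rm -f {} +
}

# Example usage (paths and init commands are assumptions; adjust to your install):
#   service nagios stop
#   clear_checkresults "$(checkresult_path /etc/nagios/nagios.cfg)"
#   service nagios start
```

Reading the directory from nagios.cfg avoids guessing the spool location, which differs between packaged and source installs.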

Re: "Last Check" column on web-interface

Posted: Fri Nov 07, 2014 3:18 pm
by abrist
What version of nagios are you running?

Re: "Last Check" column on web-interface

Posted: Fri Nov 07, 2014 3:23 pm
by vvz
nagios.x86_64 3.5.1-1.el6

Re: "Last Check" column on web-interface

Posted: Fri Nov 07, 2014 3:37 pm
by abrist
Can you post a large tail of your Nagios log in code wraps?

Code:

tail -n 50 /usr/local/nagios/var/nagios.log

Re: "Last Check" column on web-interface

Posted: Fri Nov 07, 2014 3:41 pm
by vvz

Code:

[1415392591] Warning: The check of service 'Processors condition' on host 'smg2-condor-site' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the service...
[1415392591] Warning: The check of service 'TEMPERATURE condition' on host 'smg2-condor-site' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the service...
[1415392591] Warning: The check of service 'check-host-alive-or-not' on host 'smg2-condor-site' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the service...
[1415392591] Warning: The check of service 'check-smg-process-on-vnode' on host 'vnode2-concert-site' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the service...
[1415392624] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;vnode1-concert-site;check-vnode-log-for-errors;0;no ERROR's
[1415392625] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;vnode2-concert-site;check-vnode-log-for-errors;0;no ERROR's
[1415392628] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;smg1-condor-site;concert-probe-alive;0;There are 1386 crt-probe2 messages in log file
[1415392630] PASSIVE SERVICE CHECK: vnode1-concert-site;check-vnode-log-for-errors;0;no ERROR's
[1415392630] PASSIVE SERVICE CHECK: vnode2-concert-site;check-vnode-log-for-errors;0;no ERROR's
[1415392630] PASSIVE SERVICE CHECK: smg1-condor-site;concert-probe-alive;0;There are 1386 crt-probe2 messages in log file
[1415392632] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;smg1-condor-site;check-smg-out-passive;0;smg sent 4921 messages
[1415392638] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;smg1-condor-site;condor-probe-alive;0;There are 1689 dor-probe2 messages in log file
[1415392640] PASSIVE SERVICE CHECK: smg1-condor-site;check-smg-out-passive;0;smg sent 4921 messages
[1415392640] PASSIVE SERVICE CHECK: smg1-condor-site;condor-probe-alive;0;There are 1689 dor-probe2 messages in log file
[1415392648] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;smg1-condor-site;elsalto-probe-alive;0;There are 4218 tmx-probe2 messages in log file
[1415392650] PASSIVE SERVICE CHECK: smg1-condor-site;elsalto-probe-alive;0;There are 4218 tmx-probe2 messages in log file
[1415392651] Warning: The check of service 'TEMPERATURE condition' on host 'billing-concert-nagios-server' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the service...
[1415392651] Warning: The check of service 'FANs condition' on host 'nagios-server' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the service...
[1415392651] Warning: The check of service 'PowerSupply condition' on host 'probe1-concert-site' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the service...
[1415392651] Warning: The check of service 'Root Partition' on host 'probe1-concert-site' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the service...
[1415392651] Warning: The check of service 'Root Partition' on host 'probe1-condor-site' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the service...
[1415392651] Warning: The check of service 'DIMM condition' on host 'probe2-concert-site' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the service...
[1415392651] Warning: The check of service 'PowerSupply condition' on host 'probe2-concert-site' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the service...
[1415392651] Warning: The check of service 'Current Load' on host 'smg1-condor-site' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the service...
[1415392651] Warning: The check of service 'Processors condition' on host 'smg1-condor-site' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the service...
[1415392684] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;vnode1-concert-site;check-vnode-log-for-errors;0;no ERROR's
[1415392685] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;vnode2-concert-site;check-vnode-log-for-errors;0;no ERROR's
[1415392688] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;smg1-condor-site;concert-probe-alive;0;There are 1291 crt-probe2 messages in log file
[1415392690] PASSIVE SERVICE CHECK: vnode1-concert-site;check-vnode-log-for-errors;0;no ERROR's
[1415392690] PASSIVE SERVICE CHECK: vnode2-concert-site;check-vnode-log-for-errors;0;no ERROR's
[1415392690] PASSIVE SERVICE CHECK: smg1-condor-site;concert-probe-alive;0;There are 1291 crt-probe2 messages in log file
[1415392692] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;smg1-condor-site;check-smg-out-passive;0;smg sent 4861 messages
[1415392698] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;smg1-condor-site;condor-probe-alive;0;There are 1702 dor-probe2 messages in log file
[1415392700] PASSIVE SERVICE CHECK: smg1-condor-site;check-smg-out-passive;0;smg sent 4861 messages
[1415392700] PASSIVE SERVICE CHECK: smg1-condor-site;condor-probe-alive;0;There are 1702 dor-probe2 messages in log file
[1415392708] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;smg1-condor-site;elsalto-probe-alive;0;There are 4210 tmx-probe2 messages in log file
[1415392710] PASSIVE SERVICE CHECK: smg1-condor-site;elsalto-probe-alive;0;There are 4210 tmx-probe2 messages in log file
[1415392744] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;vnode1-concert-site;check-vnode-log-for-errors;0;no ERROR's
[1415392745] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;vnode2-concert-site;check-vnode-log-for-errors;0;no ERROR's
[1415392748] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;smg1-condor-site;concert-probe-alive;0;There are 1319 crt-probe2 messages in log file
[1415392750] PASSIVE SERVICE CHECK: vnode1-concert-site;check-vnode-log-for-errors;0;no ERROR's
[1415392750] PASSIVE SERVICE CHECK: vnode2-concert-site;check-vnode-log-for-errors;0;no ERROR's
[1415392750] PASSIVE SERVICE CHECK: smg1-condor-site;concert-probe-alive;0;There are 1319 crt-probe2 messages in log file
[1415392752] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;smg1-condor-site;check-smg-out-passive;0;smg sent 4736 messages
[1415392758] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;smg1-condor-site;condor-probe-alive;0;There are 1760 dor-probe2 messages in log file
[1415392760] PASSIVE SERVICE CHECK: smg1-condor-site;check-smg-out-passive;0;smg sent 4736 messages
[1415392760] PASSIVE SERVICE CHECK: smg1-condor-site;condor-probe-alive;0;There are 1760 dor-probe2 messages in log file
[1415392768] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;smg1-condor-site;elsalto-probe-alive;0;There are 4148 tmx-probe2 messages in log file
[1415392770] PASSIVE SERVICE CHECK: smg1-condor-site;elsalto-probe-alive;0;There are 4148 tmx-probe2 messages in log file
[1415392771] Warning: The check of service 'Current Load' on host 'billing-condor-site' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the service...
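
To see which hosts are affected most, the orphaned-check warnings in a log like the one above can be tallied per host. The following is a minimal sketch; the function name `count_orphans_by_host` is made up for this example, and it assumes that service and host names contain no apostrophes, since it splits each line on the single quotes around them.

```shell
#!/bin/sh
# Hypothetical helper: count "orphaned" warnings per host in a nagios.log file.
# Splitting on single quotes makes field 2 the service name and field 4 the host.
count_orphans_by_host() {
    awk -F"'" '/looks like it was orphaned/ { n[$4]++ }
               END { for (h in n) print n[h], h }' "$1" |
        sort -k1,1nr -k2,2
}

# Example usage (the log path depends on the install):
#   count_orphans_by_host /usr/local/nagios/var/nagios.log
```

Each output line is a count followed by a host name, with the most affected hosts first.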



Re: "Last Check" column on web-interface

Posted: Fri Nov 07, 2014 3:48 pm
by vvz
I found one suggestion on Google Groups. This is what they say:
"Next time this happens, try setting use_retained_scheduling_info to 0 and restart Nagios, then set it back to 1.
I have a Nagios installation where I sometimes get orphaned checks as well, with 24k services.
It seems to me Nagios gets lost in the scheduling info cache with this number of services."
Could that be a permanent solution?

Re: "Last Check" column on web-interface

Posted: Fri Nov 07, 2014 3:52 pm
by vvz
I've already tried setting use_retained_scheduling_info to 0 and, yes, now all the times are updated to the current one.
But as soon as I set it back to 1, the old values are back in the "Last Check" column.

Actually I don't need the stats, so I could leave this parameter at 0.
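
For anyone trying the same thing, flipping the option in nagios.cfg could look roughly like the sketch below. The function name `set_retained_scheduling` is made up for this example, and the config path and restart command in the usage comment are assumptions about the install. With the option at 0, Nagios does not restore the retained scheduling information (such as next check times) at startup and schedules the checks afresh.

```shell
#!/bin/sh
# Hypothetical helper: set use_retained_scheduling_info to 0 or 1 in a nagios.cfg.
# It rewrites the file contents in place so ownership and permissions are kept.
set_retained_scheduling() {
    cfg_file=$1
    new_value=$2
    case "$new_value" in
        0|1) ;;
        *) echo "value must be 0 or 1" >&2; return 1 ;;
    esac
    tmp_file=$(mktemp) || return 1
    sed "s/^[[:space:]]*use_retained_scheduling_info[[:space:]]*=.*/use_retained_scheduling_info=$new_value/" \
        "$cfg_file" > "$tmp_file" && cat "$tmp_file" > "$cfg_file"
    status=$?
    rm -f "$tmp_file"
    return $status
}

# Example usage (paths and init command are assumptions; adjust to your install):
#   set_retained_scheduling /etc/nagios/nagios.cfg 0
#   nagios -v /etc/nagios/nagios.cfg && service nagios restart
```

Running `nagios -v` on the config before restarting catches a broken edit before it takes the daemon down.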

Re: "Last Check" column on web-interface

Posted: Mon Nov 10, 2014 3:41 pm
by abrist
Orphaned checks may also be caused by kernel limitations. Even though you are using Nagios Core, the following XI FAQ still applies:
http://support.nagios.com/wiki/index.ph ... g_Orphaned

Re: "Last Check" column on web-interface

Posted: Mon Nov 10, 2014 3:48 pm
by vvz
Thank you. I'll let you know how it goes.

Re: "Last Check" column on web-interface

Posted: Mon Nov 10, 2014 3:51 pm
by abrist
Great, keep us posted.