check files in /tmp

Posted: Tue Jun 10, 2014 7:41 am
by iivanyi
Hello,

I haven't found anything 100% relevant, so I'm posting this issue. Apologies if I missed an existing topic:

# ls /tmp| wc
750507 750507 9006143


/tmp is full of check files. I noticed the other day that all checks had stopped, so I cleaned up and adjusted tmpwatch to remove files one day old and over, but this is still a pretty big hit.
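
For anyone wanting to track this over time, here is a minimal sketch of how the leftover files could be counted and the stale ones listed. It assumes the files match the pattern check* and sit directly in /tmp, and that find supports -maxdepth (GNU and BSD find do; it is not POSIX). The function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Print the number of regular files named check* directly inside a
# directory (no recursion). The directory defaults to /tmp.
count_check_files() {
    dir="${1:-/tmp}"
    find "$dir" -maxdepth 1 -type f -name 'check*' | wc -l | tr -d ' '
}

# List check* files last modified at least one full day ago.
# find rounds the age down to whole days, so "-mtime +0" means
# "more than 24 hours old".
list_stale_check_files() {
    dir="${1:-/tmp}"
    find "$dir" -maxdepth 1 -type f -name 'check*' -mtime +0
}
```

Running count_check_files every few minutes (from cron, for example) should show whether the number keeps climbing or only jumps at particular times.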


Where can I look to see why these aren't being cleaned up?

Regards

Re: check files in /tmp

Posted: Tue Jun 10, 2014 7:46 am
by iivanyi
Actually, I'm guessing I need to try the instructions here first:
http://support.nagios.com/wiki/index.ph ... g_Orphaned

Re: check files in /tmp

Posted: Tue Jun 10, 2014 1:34 pm
by slansing
Actually, these files are left behind when Nagios runs a check, and they should be removed when the check returns. Are you noticing oddities in Nagios, such as checks not completing or not coming back? What does nagios.log show? It is possible you are seeing orphaned checks:

tail -50 /usr/local/nagios/var/nagios.log

Re: check files in /tmp

Posted: Wed Jun 11, 2014 4:18 am
by iivanyi
Yes, that's exactly why I was looking at this link.

$ grep -i orphan /var/log/messages | wc
230176 6744091 51654124
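
The raw count does not show which checks are being orphaned. A rough sketch of a per-service breakdown follows, assuming the log lines contain the wording "The check of service '...' on host '...' looks like it was orphaned". The function name and the default log path are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: summarise orphaned-check warnings as
# "count host: service", most frequent first.
orphan_summary() {
    log="${1:-/var/log/messages}"
    grep -i 'looks like it was orphaned' "$log" |
        sed -n "s/.*check of service '\([^']*\)' on host '\([^']*\)'.*/\2: \1/p" |
        sort | uniq -c | sort -rn
}
```

If a handful of services dominate the list, those checks are probably the ones timing out or hanging.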

I've implemented the other changes, but I'm slightly hesitant about the Perl changes. Can I estimate the performance hit?

I've been monitoring /tmp, and while there are only a few hundred files right now, I note that most of them are from last night at the same time:

-rw------- 1 nagios nagios 282 Jun 11 02:41 checklWk4b3
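
To confirm that the files really do cluster around one time, something like the following could group them by modification minute. It is only a sketch: it relies on the -printf option of GNU find, and the function name is made up.

```shell
#!/bin/sh
# Hypothetical helper: count check* files per modification minute
# (local time) and print the busiest minutes first.
# $1 = directory (default /tmp), $2 = number of rows (default 10).
check_file_bursts() {
    dir="${1:-/tmp}"
    find "$dir" -maxdepth 1 -type f -name 'check*' \
        -printf '%TY-%Tm-%Td %TH:%TM\n' |
        sort | uniq -c | sort -rn | head -n "${2:-10}"
}
```

A single minute with hundreds of files points at one event (a restart or reload, for instance) rather than a steady leak.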

However, I haven't seen any orphaned-check messages since I made these changes:
Edit /etc/security/limits.conf

* hard memlock 128 #locked memory
* soft memlock 128

* soft nofile 4096 #open files
* hard nofile 4096

* hard nproc 4096 #max user processes
* soft nproc 4096

* hard stack 20480 #stack size
* soft stack 20480
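
One way to check whether the running daemon actually picked these limits up is to read /proc/<pid>/limits on Linux. The sketch below assumes that file's usual layout (limit name, soft limit, hard limit, units); the function name is hypothetical. Note that limits.conf takes memlock and stack in KB while /proc reports bytes, so 20480 would show up as 20971520. Also, limits.conf is applied through PAM sessions, so a daemon started at boot may not inherit the values.

```shell
#!/bin/sh
# Hypothetical helper: print "soft hard" for one limit.
# $1 = limit name exactly as shown in /proc/<pid>/limits
#      (e.g. "Max open files"), $2 = path to a limits file.
proc_limit() {
    awk -v name="$1" 'index($0, name) == 1 { sub(name, ""); print $1, $2 }' "$2"
}

# Example (assumes the daemon process is named "nagios"):
# proc_limit 'Max open files' "/proc/$(pgrep -o -x nagios)/limits"
```

If the values printed still show the old defaults, the daemon needs to be restarted from a session where the new limits apply.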


These are the last orphaned-check messages in the log:
2014-06-10T17:11:13.130213+02:00 xxxxxxxxxxxxx nagios: Warning: The check of service 'OS Kernel Version' on host 'xxxxxxxxxxxx' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
2014-06-10T17:11:13.130395+02:00 xxxxxxxxxxxxx nagios: Warning: The check of service 'Solaris fmadm check' on host 'xxxxxxxxxxx' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...

They were logged just before I made these changes and restarted.

Re: check files in /tmp

Posted: Wed Jun 11, 2014 12:35 pm
by lmiltchev
It's possible that Nagios is not exiting in a timely manner. This can leave a bunch of temp files in the /tmp directory. Please read our FAQ wiki post on the issue here:

http://support.nagios.com/wiki/index.ph ... ely_manner

Re: check files in /tmp

Posted: Wed Jun 11, 2014 12:49 pm
by citcosa
I'll have to double-check. I did put in a sleep somewhere there.

Thing is what would cause a restart other than from command-line or gui?

The last audit log entry was about 30 minutes before the 02:40 accumulation of check files.

Re: check files in /tmp

Posted: Thu Jun 12, 2014 9:09 am
by slansing
Not quite sure what you mean by this:
"Thing is what would cause a restart other than from command-line or gui?"
Sounds like you just needed to make the orphaning changes mentioned on that page. Let us know if you need further help, or if we are good to lock this thread.

Re: check files in /tmp

Posted: Fri Jun 13, 2014 2:28 am
by iivanyi
Yeah, thanks. It's not completely resolved but certainly well under control.

Regards

Re: check files in /tmp

Posted: Fri Jun 13, 2014 9:34 am
by tmcdonald
I'll keep this thread open for a bit longer in case this gets worse.