Hello,
I'm not sure I've seen anything 100% relevant, so I'm posting this issue; apologies if I've missed an existing topic:
# ls /tmp| wc
750507 750507 9006143
/tmp is full of check files. The other day I noticed that all checks had stopped, so I cleaned up and adjusted tmpwatch to remove files one day old and over, but this is still a pretty big hit.
Where can I look to see why these aren't being cleaned up?
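A rough way to size the problem without running ls over the whole directory is to let find do the counting. This is only a sketch: it assumes GNU find, assumes the leftover files are regular files whose names start with "check", and the function names are made up for illustration.

```shell
#!/bin/sh
# Count leftover check files directly under a directory (default: /tmp).
# Assumption: the files are named "check" plus a random suffix.
count_check_files() {
    dir=${1:-/tmp}
    find "$dir" -maxdepth 1 -type f -name 'check*' | wc -l | tr -d ' '
}

# Count only those last modified more than N minutes ago (default: 1440, one day).
count_stale_check_files() {
    dir=${1:-/tmp}
    minutes=${2:-1440}
    find "$dir" -maxdepth 1 -type f -name 'check*' -mmin +"$minutes" | wc -l | tr -d ' '
}
```

Running count_check_files /tmp and count_stale_check_files /tmp 1440 a few times over a day should show whether the files are still piling up or only the old backlog remains.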
Regards
check files in /tmp
Re: check files in /tmp
actually I'm guessing I need to try the instructions here first
http://support.nagios.com/wiki/index.ph ... g_Orphaned
-
slansing
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: check files in /tmp
Actually, these are left over when Nagios runs a check, and they should be removed when the check returns. Are you noticing oddities in Nagios, such as checks not completing or coming back? What does nagios.log show? It is possible you are seeing orphaned checks:
tail -50 /usr/local/nagios/var/nagios.log
Re: check files in /tmp
yes, exactly why I was looking at this link
$ grep -i orphan /var/log/messages | wc
230176 6744091 51654124
I've implemented the other changes, but I'm slightly hesitant about the Perl changes. Is there a way to estimate the performance hit?
I've been monitoring /tmp and while I have only a few hundred right now, I note that most of them are from last night at the same time
-rw------- 1 nagios nagios 282 Jun 11 02:41 checklWk4b3
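To see whether the leftovers really cluster around one moment, the files can be grouped by modification minute. A sketch only, assuming GNU find (for -printf) and the same "check" name prefix; check_files_by_minute is an illustrative name.

```shell
#!/bin/sh
# Print "count date hour:minute" for check files, busiest minute first.
# Assumption: GNU find; files are named "check" plus a random suffix.
check_files_by_minute() {
    dir=${1:-/tmp}
    find "$dir" -maxdepth 1 -type f -name 'check*' \
        -printf '%TY-%Tm-%Td %TH:%TM\n' \
        | sort | uniq -c | sort -rn
}
```

A single minute dominating the output would point at one event (a restart or reload, say) rather than a steady leak.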
I don't however see any orphaned messages since I made these changes:
Edit /etc/security/limits.conf
* hard memlock 128 #locked memory
* soft memlock 128
* soft nofile 4096 #open files
* hard nofile 4096
* hard nproc 4096 #max user processes
* soft nproc 4096
* hard stack 20480 #stack size
* soft stack 20480
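It may be worth confirming that these limits actually reached the running daemon, since whether limits.conf applies can depend on how the process is started. A sketch, assuming Linux with /proc mounted; show_limits is an illustrative name and the process name "nagios" in the example is an assumption.

```shell
#!/bin/sh
# Show the effective limits of a running process for the items set above.
# Assumption: Linux, where /proc/<pid>/limits is available.
show_limits() {
    pid=$1
    grep -E '^(Limit|Max (open files|processes|locked memory|stack size))' \
        "/proc/$pid/limits"
}

# Example (assumes the daemon's process name is exactly "nagios"):
#   show_limits "$(pgrep -o -x nagios)"
```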
these are the last ones:
2014-06-10T17:11:13.130213+02:00 xxxxxxxxxxxxx nagios: Warning: The check of service 'OS Kernel Version' on host 'xxxxxxxxxxxx' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
2014-06-10T17:11:13.130395+02:00 xxxxxxxxxxxxx nagios: Warning: The check of service 'Solaris fmadm check' on host 'xxxxxxxxxxx' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
just before I made these changes, and restarted
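For anyone wanting to check the same thing, the orphan warnings can be summarised per day and per service rather than just counted. A sketch, assuming the syslog line format shown above (ISO timestamp at the start of the line, service name in single quotes); the function names are illustrative.

```shell
#!/bin/sh
# Orphan warnings per day, printed as "YYYY-MM-DD count".
# Assumption: each line starts with an ISO date, as in the samples above.
orphans_per_day() {
    grep -i 'looks like it was orphaned' "$1" \
        | cut -c1-10 | sort | uniq -c | awk '{print $2, $1}'
}

# Orphan warnings per service, most frequent first.
orphans_per_service() {
    grep -i 'looks like it was orphaned' "$1" \
        | sed -n "s/.*check of service '\([^']*\)' on host.*/\1/p" \
        | sort | uniq -c | sort -rn
}
```

Called as orphans_per_day /var/log/messages, a count that drops to nothing after the date of the change would back up the observation that the orphaning has stopped.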
Re: check files in /tmp
It's possible that Nagios is not exiting in a timely manner. This can leave a bunch of temporary files in the "/tmp" directory. Please read our FAQ wiki post on the issue here:
http://support.nagios.com/wiki/index.ph ... ely_manner
Re: check files in /tmp
I'll have to double check, I did put in a sleep somewhere there.
Thing is what would cause a restart other than from command-line or gui?
The last audit log entry was about 30 minutes before the 02:40 accumulation of check files.
-
slansing
Re: check files in /tmp
Not quite sure what you mean by this:
"Thing is what would cause a restart other than from command-line or gui?"
Sounds like you just needed to make the orphaning changes mentioned on that page. Let us know if you need further help, or if we are good to lock this thread.
Re: check files in /tmp
Yeah, thanks. It's not completely resolved but certainly well under control.
Regards
Re: check files in /tmp
I'll keep this thread open for a bit longer in case this gets worse.
Former Nagios employee