Super large cleaner.log file made my server crash
Posted: Mon Jul 07, 2014 3:14 am
Hello,
Our Nagios production server apparently crashed this weekend because a file named cleaner.log in /usr/local/nagiosxi/var took up all available disk space. After I deleted the file, the disk space was not freed automatically. We tried restarting the httpd, postgresql, mysqld, initd and nagios services, but the used disk space did not become available, so I had to reboot the server (which did seem to free up the disk space) and then run the mysql repair script to get Nagios XI working again.
Could I please get some help finding out why this file grew so excessively? See the screenshot for more details.
df -h after reboot:
Code: Select all
df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                       34G   14G   19G  43% /
tmpfs                 1.9G     0  1.9G   0% /dev/shm
/dev/sda1              97M   28M   65M  31% /boot
I'm not 100% sure, but it might have something to do with the backup script, as I had to make some changes to it.
When we migrated the backend storage, the location I rsync the backups to changed. I used to do
Code: Select all
rsync --remove-source-files -azv /store/backups/nagiosxi /var/Digipolis/Backup
But this no longer worked, as the newly mounted filesystem (NetApp) works a bit differently than the old one (EMC Celerra): although the Nagios server has write permissions, rsync was not able to set the owner and permissions, so the new command became:
Code: Select all
rsync --remove-source-files --no-perms -r --no-o --no-g --inplace /store/backups/nagiosxi /var/Digipolis/Backup
I did some test runs of the backup on Friday during the day and these all seemed to work fine, so I'm not sure what's going on. I've been trying to move the backups with ftp, but this does not seem to work as expected. I'll make a new thread for that, as the ftp problem is not related.
EDIT 1:
OK, in the meantime it seems some php process is using 100% CPU. I saw this same process using 100% CPU this morning. I attached a screenshot. What could be causing this process to use 100% CPU?
EDIT 2:
OK, in the meantime I discovered that /usr/local/nagiosxi/cron/cleaner.php is the evil command using up all the server resources (CPU + disk ^^)
Code: Select all
ps -eo pcpu,pid,user,args | sort -k 1 -r | head -25
%CPU PID USER COMMAND
96.8 3418 nagios /usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php
8.3 1438 root [flush-253:1]
3.4 8777 mysql /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock
2.1 24163 nagios /usr/bin/php -q /usr/local/nagiosxi/cron/eventman.php
2.0 11126 apache /usr/sbin/httpd
1.9 25224 apache /usr/sbin/httpd
1.7 25226 apache /usr/sbin/httpd
1.7 25225 apache /usr/sbin/httpd
1.6 25 root [ksoftirqd/5]
1.6 25250 apache /usr/sbin/httpd
1.6 25227 apache /usr/sbin/httpd
1.6 24161 nagios /usr/bin/php -q /usr/local/nagiosxi/cron/perfdataproc.php
1.5 32662 nagios /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
1.5 25248 apache /usr/sbin/httpd
1.5 25245 apache /usr/sbin/httpd
1.4 25244 apache /usr/sbin/httpd
1.4 25243 apache /usr/sbin/httpd
1.4 21237 apache /usr/sbin/httpd
1.3 24162 nagios /usr/bin/php -q /usr/local/nagiosxi/cron/cmdsubsys.php
1.3 24157 nagios /usr/bin/php -q /usr/local/nagiosxi/cron/feedproc.php
1.2 3424 apache /usr/sbin/httpd
1.2 24155 nagios /usr/bin/php -q /usr/local/nagiosxi/cron/sysstat.php
1.1 3188 apache /usr/sbin/httpd
1.1 3187 apache /usr/sbin/httpd
So what is this /usr/bin/php -q /usr/local/nagiosxi/cron/cleaner.php doing, and how do I stop it from making my server crash again?
EDIT 3:
It seems the cleaner.log file is again 759575031549 bytes! I opened the file and it's full of
Code: Select all
PHP Warning: readdir() expects parameter 1 to be resource, boolean given in /usr/local/nagiosxi/html/includes/components/scheduledbackups/scheduledbackups.inc.php on line 429
PHP Warning: readdir() expects parameter 1 to be resource, boolean given in /usr/local/nagiosxi/html/includes/components/scheduledbackups/scheduledbackups.inc.php on line 429
PHP Warning: readdir() expects parameter 1 to be resource, boolean given in /usr/local/nagiosxi/html/includes/components/scheduledbackups/scheduledbackups.inc.php on line 429
PHP Warning: readdir() expects parameter 1 to be resource, boolean given in /usr/local/nagiosxi/html/includes/components/scheduledbackups/scheduledbackups.inc.php on line 429
PHP Warning: readdir() expects parameter 1 to be resource, boolean given in /usr/local/nagiosxi/html/includes/components/scheduledbackups/scheduledbackups.inc.php on line 429
PHP Warning: readdir() expects parameter 1 to be resource, boolean given in /usr/local/nagiosxi/html/includes/components/scheduledbackups/scheduledbackups.inc.php on line 429
Local scheduled backups are disabled; only ftp backups are enabled. What can I do to make this stop? Kill the process?
Thanks,
Willem
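On the point that deleting cleaner.log freed no space until the reboot: a likely explanation, assuming the still-running cleaner.php process had the log open for writing, is that Linux only releases a deleted file's blocks once the last open descriptor on it is closed. The sketch below reproduces that behaviour with a throwaway temp file. The temp file, the descriptor number 3 and the sample log line are illustrative stand-ins, and nothing in it touches the real cleaner.log. It relies on the Linux /proc filesystem.

```shell
#!/bin/sh
# Sketch: a deleted file that is still open keeps its data (and disk space)
# until it is truncated or the descriptor is closed.

demo_log=$(mktemp)                      # hypothetical stand-in for cleaner.log
exec 3>>"$demo_log"                     # hold it open for appending, like a running writer
printf '%s\n' "PHP Warning: readdir() ..." >&3

rm "$demo_log"                          # the name is gone from the directory...

# ...but the open descriptor still pins the data; /proc marks the target "(deleted)".
link_target=$(readlink "/proc/$$/fd/3")
held_bytes=$(wc -c < "/proc/$$/fd/3")

# Truncating through the descriptor releases the data without stopping the writer.
: > "/proc/$$/fd/3"
after_bytes=$(wc -c < "/proc/$$/fd/3")

exec 3>&-                               # closing the last descriptor frees the inode itself

echo "target:      $link_target"
echo "held bytes:  $held_bytes"
echo "after trunc: $after_bytes"
```

On a real system, `lsof +L1` lists open files that have already been unlinked, which would show whether a process is still holding a deleted cleaner.log. Emptying the log in place (`: > /usr/local/nagiosxi/var/cleaner.log`) instead of deleting it typically gives the space back without a reboot. Neither step addresses whatever makes cleaner.php loop on the readdir() warning, so the file would presumably start growing again.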