No space left on device
Posted: Fri Jan 12, 2024 6:05 am
We have had an issue with our Nagios server since last year.
It stopped monitoring services, and the last-checked date was far in the past. After we restarted the services, the GUI was no longer accessible. When we checked the service, it showed the error "No space left on device".
In the end we found a lot of old log files in the folder /usr/local/nagios/var/archives and removed them. After another restart the services were OK and the GUI worked again.
Unfortunately the joy was short-lived, because the services are now not being monitored again. I checked the archives folder again, but not enough new files had been created to take up all the disk space. I removed a few and restarted the services, but I still get the same error, shown below.
Can you help me further with this? What else can I check or clean up here?
Nagios Core 4.4.9
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 2022-11-16
License: GPL
Website: https://www.nagios.org
Reading configuration data...
Read main config file okay...
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The retry_check_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
Read object config files okay...
Running pre-flight check on configuration data...
Checking objects...
Checked 526 services.
Checked 241 hosts.
Checked 16 host groups.
Checked 0 service groups.
Checked 4 contacts.
Checked 1 contact groups.
Checked 57 commands.
Checked 5 time periods.
Checked 0 host escalations.
Checked 0 service escalations.
Checking for circular paths...
Checked 241 hosts
Checked 0 service dependencies
Checked 0 host dependencies
Checked 5 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Error: Unable to write to temp_path ('/tmp') - No space left on device
Error: Unable to write to check_result_path ('/usr/local/nagios/var/spool/checkresults') - No space left on device
Total Warnings: 0
Total Errors: 2
***> One or more problems was encountered while running the pre-flight check...
Check your configuration file(s) to ensure that they contain valid
directives and data definitions. If you are upgrading from a previous
version of Nagios, you should be aware that some variables/definitions
may have been removed or modified in this version. Make sure to read
the HTML documentation regarding the config files, as well as the
'Whats New' section to find out what has changed.
This is the result of the df command:
ubuntu@RSB-VWA-T-MON:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
tmpfs           781M  3.1M  778M   1% /run
/dev/mmcblk0p2   59G   50G  6.8G  88% /
tmpfs           3.9G     0  3.9G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
/dev/mmcblk0p1  253M  122M  131M  49% /boot/firmware
tmpfs           781M  4.0K  781M   1% /run/user/1000
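
Note that df -h only reports block usage. "No space left on device" can also be raised when a filesystem has run out of inodes, which the output above does not show (df -i would show it). Whether that is the cause here is not confirmed by anything in this post. As a minimal sketch for checking it, assuming Python 3 is available on the server: the two Nagios paths are the ones from the error messages, and the helper names inode_usage and count_entries are made up for illustration.

```python
import os


def inode_usage(path):
    """Return (total, free, percent_used) inodes for the filesystem holding path."""
    st = os.statvfs(path)
    total, free = st.f_files, st.f_ffree
    # Some filesystems report 0 total inodes; avoid dividing by zero.
    pct = 100.0 * (total - free) / total if total else 0.0
    return total, free, pct


def count_entries(path):
    """Count the entries directly under path (not recursive)."""
    with os.scandir(path) as it:
        return sum(1 for _ in it)


if __name__ == "__main__":
    # Paths taken from the pre-flight errors, plus the root filesystem.
    for p in ("/", "/tmp", "/usr/local/nagios/var/spool/checkresults"):
        try:
            total, free, pct = inode_usage(p)
            print(f"{p}: {free} of {total} inodes free ({pct:.1f}% used), "
                  f"{count_entries(p)} entries")
        except OSError as exc:
            print(f"{p}: {exc}")
```

If the inode percentage is near 100 while df -h still shows free space, a directory holding a very large number of small files would be the thing to look for; the entry counts give a first hint.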