No space left on device

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
ptran
Posts: 35
Joined: Fri Jan 12, 2024 5:50 am

No space left on device

Post by ptran »

We have had an issue with our Nagios server since last year.

It stopped monitoring services, and the last-checked date was far in the past. After restarting the services, the GUI was no longer accessible. When we checked the service, it showed the error "No space left on device".

In the end we found a lot of old log files in the folder /usr/local/nagios/var/archives and removed them. After a restart the services were OK and the GUI worked again.

Unfortunately the joy was short-lived, because the services are no longer being monitored again. I checked the archives folder again, but not enough new files had been created to take up all the disk space. I removed a few and restarted the services, but I still get the same error, shown below.

Can you help me further with this? What can I check or clean up more here?


Nagios Core 4.4.9
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 2022-11-16
License: GPL

Website: https://www.nagios.org
Reading configuration data...
Read main config file okay...
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The retry_check_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
Read object config files okay...

Running pre-flight check on configuration data...

Checking objects...
Checked 526 services.
Checked 241 hosts.
Checked 16 host groups.
Checked 0 service groups.
Checked 4 contacts.
Checked 1 contact groups.
Checked 57 commands.
Checked 5 time periods.
Checked 0 host escalations.
Checked 0 service escalations.
Checking for circular paths...
Checked 241 hosts
Checked 0 service dependencies
Checked 0 host dependencies
Checked 5 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Error: Unable to write to temp_path ('/tmp') - No space left on device
Error: Unable to write to check_result_path ('/usr/local/nagios/var/spool/checkresults') - No space left on device

Total Warnings: 0
Total Errors: 2

***> One or more problems was encountered while running the pre-flight check...

Check your configuration file(s) to ensure that they contain valid
directives and data definitions. If you are upgrading from a previous
version of Nagios, you should be aware that some variables/definitions
may have been removed or modified in this version. Make sure to read
the HTML documentation regarding the config files, as well as the
'Whats New' section to find out what has changed.




This is the result of the df command:

ubuntu@RSB-VWA-T-MON:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
tmpfs           781M  3.1M  778M   1% /run
/dev/mmcblk0p2   59G   50G  6.8G  88% /
tmpfs           3.9G     0  3.9G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
/dev/mmcblk0p1  253M  122M  131M  49% /boot/firmware
tmpfs           781M  4.0K  781M   1% /run/user/1000
cnorell
Developer
Posts: 65
Joined: Mon Nov 27, 2017 3:08 pm

Re: No space left on device

Post by cnorell »

ptran,

The more hosts and services being monitored the more logs XI will create, unfortunately. One option would be to create a cron job that will clean up the directories in question - so long as you don't care about the logs.

Are you taking scheduled backups? If you are, these can take up a significant amount of disk space as well.

Here's an article that goes over this issue with some solutions: https://nagiosenterprises.my.site.com/s ... r-202120ec

Best Regards,

Cory Norell
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
ptran
Posts: 35
Joined: Fri Jan 12, 2024 5:50 am

Re: No space left on device

Post by ptran »

This is a legacy system set up by our previous system admin, who has left the company, and there is not much information about it.

I don't think there is a scheduled backup of the system, as I cannot find any /store folder in the file system.

I already deleted all the files except the latest one from the folder "/usr/local/nagios/var/archives/", but the system still complains that there is not enough disk space. What else can I clean up?

Are there other places where log files are kept? Is there a command I can use to list new log files being created somewhere else? The folder "/usr/local/nagios/var/spool/checkresults" is already empty, and the folder "/tmp" contains an old folder from last year, but I am not able to view its contents or delete it.
cnorell
Developer
Posts: 65
Joined: Mon Nov 27, 2017 3:08 pm

Re: No space left on device

Post by cnorell »

ptran,

Here's a document on log file locations: https://assets.nagios.com/downloads/nag ... ptions.pdf

You can check the scheduled backup setting by navigating to Admin > System Backups > Scheduled Backups

Here's a document on scheduled backups: https://support.nagios.com/kb/article.p ... %20removed.

As another option, you could try to find the highest disk usage directories on your server by running the following command:

du -a /home | sort -n -r | head -n 5
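
A slightly more targeted variant of that command, as a rough sketch. It assumes GNU du and sort (as shipped on Ubuntu), uses -x so du does not cross into other mounted filesystems, and reports sizes in KiB. The function name largest_entries is made up for illustration and is not part of Nagios.

```shell
# largest_entries DIR [COUNT]
# Hypothetical helper: print the COUNT largest files and directories
# under DIR, sizes in KiB, largest first.
largest_entries() {
    dir="${1:-/home}"      # directory to scan (default mirrors the command above)
    count="${2:-5}"        # number of lines to show
    # -a: include files as well as directories
    # -x: stay on the filesystem that DIR lives on
    # -k: report sizes in 1 KiB blocks
    # Read errors (e.g. permission denied) are discarded.
    du -a -x -k "$dir" 2>/dev/null | sort -n -r | head -n "$count"
}
```

For example, `largest_entries /home 20` would show the twenty largest entries under /home. Directories you cannot read are skipped silently, so the totals may be incomplete unless this is run as root.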
Hopefully some of that helps.

Best Regards,

Cory Norell
ptran
Posts: 35
Joined: Fri Jan 12, 2024 5:50 am

Re: No space left on device

Post by ptran »

Thank you for your answer.

I used the command "du -a /home" and found that the folder /home/nagios/gammu/sent contained a lot of old SMS messages. I removed the files in this folder, and now the Nagios services start without a space issue.

Thanks for your help.
ptran
Posts: 35
Joined: Fri Jan 12, 2024 5:50 am

Re: No space left on device

Post by ptran »

The Nagios services ran OK for a day, and now we have the same disk space issue again.

I ran the du command again and it lists different folders. I checked them but cannot find much to clean up this time. What is eating up the free disk space that I clear each time?

ubuntu@RSB-VWA-T-MON:~$ du -a /home | sort -n -r | head -n 5
59452 /home
57444 /home/ubuntu
15988 /home/ubuntu/.cache
15980 /home/ubuntu/.cache/go-build
15600 /home/ubuntu/gorepo
jmichaelson
Posts: 117
Joined: Wed Aug 23, 2023 1:02 pm

Re: No space left on device

Post by jmichaelson »

Let's try that du command again, but from the root directory, and go a few lines deeper in the output:

du -a / | sort -n -r | head -n 20
Please let us know if you have any other questions or concerns.

-Jason
ptran
Posts: 35
Joined: Fri Jan 12, 2024 5:50 am

Re: No space left on device

Post by ptran »

Do you mean the command below? This is what I get.


ubuntu@RSB-VWA-T-MON:~$ du -a / | sort -n -r | head -n 5
du: cannot read directory '/sys/kernel/tracing': Permission denied
du: cannot read directory '/sys/kernel/debug': Permission denied
du: cannot read directory '/sys/fs/pstore': Permission denied
du: cannot read directory '/sys/fs/bpf': Permission denied
du: cannot read directory '/root': Permission denied
sort: cannot create temporary file in '/tmp': No space left on device
kg2857
Posts: 237
Joined: Wed Apr 12, 2023 5:48 pm

Re: No space left on device

Post by kg2857 »

Filling up a filespace isn't really a nagios issue and the following message isn't very hard to understand.
sort: cannot create temporary file in '/tmp': No space left on device
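
That error comes from sort itself: with large input it spills to temporary files, by default under /tmp, which is on the full root filesystem here. One possible workaround is to point sort at a different scratch directory with its -T option. The sketch below assumes GNU sort and uses /dev/shm as the default scratch location only because the df output earlier in the thread shows it with 3.9G free; the function name is hypothetical.

```shell
# largest_entries_scratch DIR [COUNT] [SCRATCH]
# Hypothetical helper: like `du -a DIR | sort -n -r | head`, but sort
# writes its temporary files to SCRATCH instead of /tmp.
largest_entries_scratch() {
    dir="${1:-/}"              # directory to scan
    count="${2:-20}"           # number of lines to show
    scratch="${3:-/dev/shm}"   # assumption: writable and has free space
    # -x keeps du on one filesystem, so /sys and /proc are not walked
    # when DIR is /. Read errors are discarded.
    du -a -x -k "$dir" 2>/dev/null \
        | sort -n -r -T "$scratch" \
        | head -n "$count"
}
```

Defining and running this in a root shell would also avoid the "Permission denied" line for /root, so that whatever is stored there is counted.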
ptran
Posts: 35
Joined: Fri Jan 12, 2024 5:50 am

Re: No space left on device

Post by ptran »

Nagios is the only thing that runs on this appliance, so why is this not a Nagios issue? I need to know where the system is creating new files, and which files are clogging up the system after I have cleared out some old ones.
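
Two checks that might help narrow this down, offered as a sketch rather than a diagnosis. The first lists files modified within a recent time window, which is one way to see where new data is being written. The second shows inode usage, since a filesystem can report "No space left on device" when it runs out of inodes even though df -h still shows free space. Both assume GNU find and df as found on Ubuntu; the function names are hypothetical.

```shell
# recent_files DIR [MINUTES]
# Hypothetical helper: print size in bytes and path of every regular file
# under DIR that was modified within the last MINUTES minutes.
recent_files() {
    dir="${1:-/}"
    minutes="${2:-1440}"   # default: the last 24 hours
    # -xdev: do not descend into other mounted filesystems
    # -printf is a GNU find extension: %s = size in bytes, %p = path
    find "$dir" -xdev -type f -mmin "-$minutes" -printf '%s\t%p\n' 2>/dev/null
}

# inode_usage [PATH]
# Hypothetical helper: show inode usage for the filesystem holding PATH.
inode_usage() {
    df -i "${1:-/}"
}
```

For example, `recent_files / 60` run as root shortly after the disk fills up would list everything written to the root filesystem in the last hour, and an IUse% close to 100% from `inode_usage /` would point to a very large number of small files rather than a few big ones.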