Nagios log rotation quit working
Posted: Wed Dec 02, 2020 3:22 pm
I recently noticed that our nagios.log rotation had stopped working. My first clue came when I tried to view the alert history on a service check: when I clicked the back arrow ("Latest Archive"), there were errors such as:
Error: Cannot open log file '/var/log/nagios/archives/nagios-12-02-2020-00.log' for reading!
I looked in the /var/log/nagios/archives directory, and, sure enough, the most recent log file was from September. The /var/log/nagios/nagios.log file was much larger than any of those in the archive directory, and I found that it contained entries from just after the most recent file in the archive directory.
Aha! It seems we have a log rotation issue!
I checked the /var/log/nagios/nagios.log and the /var/log/messages files and found errors similar to this each day at precisely midnight:
Error: Unable to rename file '/var/log/nagios/nagios.log' to '/var/log/nagios/archives/nagios-12-02-2020-00.log': Permission denied
After reading some online forums citing similar issues, I changed the permissions on the /var/log/nagios/archives directory to drwxrwxrwx (with the original owner and the group both being nagios). I rebooted and waited a day to see if that fixed the issue. It didn't.
Some configuration information:
1. We are allowing Nagios to perform the log rotation (rather than creating our own cron/logrotate jobs to do so). From our nagios.cfg file:
Code: Select all
log_file=/var/log/nagios/nagios.log
log_rotation_method=d
log_archive_path=/var/log/nagios/archives
2. Members of the nagios group are the nagios (973) and apache users. From /etc/group:
Code: Select all
nagios:x:973:apache
Other troubleshooting steps I have performed:
1. I tried changing to the nagios user and manually calling the logrotate command (pointing to the default /etc/logrotate.d/nagios file after uncommenting the commands). This produced the same "permission denied" error that I found in the nagios.log and messages log files (see above). I tried this with the nagios service both stopped and running.
2. Thinking that maybe the current large nagios.log file was corrupted somehow, I manually moved it into the archive directory (using sudo) and re-created the /var/log/nagios/nagios.log with its original permissions. I then rebooted and waited a day to see if log rotation took place. It didn't. Same errors in the logs.
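For anyone trying to reproduce this, here is a minimal sketch of a check for the "Unable to rename file ... Permission denied" error. A rename needs write and search (execute) permission on both the directory holding nagios.log and the archive directory, so the sketch tests both rather than only the archives directory. The function name `rename_blockers` is hypothetical, and the paths are the ones from the nagios.cfg excerpt above. It only reports what `os.access` sees for the user running it, so it would need to be run as the nagios user to be meaningful.

```python
import os
import stat


def rename_blockers(src_file, dest_dir):
    """List reasons the current user could not rename src_file into dest_dir.

    Hypothetical helper: returns an empty list when no obstacle is found.
    """
    problems = []
    src_dir = os.path.dirname(os.path.abspath(src_file))
    checks = (("source", src_dir), ("destination", os.path.abspath(dest_dir)))
    for label, directory in checks:
        if not os.path.isdir(directory):
            problems.append(f"{label} directory {directory} does not exist")
            continue
        # A rename edits directory entries, so both parent directories
        # must be writable and searchable by the calling user.
        if not os.access(directory, os.W_OK | os.X_OK):
            mode = stat.filemode(os.stat(directory).st_mode)
            problems.append(
                f"{label} directory {directory} ({mode}) is not writable "
                "and searchable by this user"
            )
    return problems


if __name__ == "__main__":
    # Paths taken from the nagios.cfg settings quoted in this post.
    found = rename_blockers("/var/log/nagios/nagios.log",
                            "/var/log/nagios/archives")
    for line in found or ["no permission obstacle found by this check"]:
        print(line)
```

If the source directory shows up in the output, that would suggest looking at the ownership and mode of /var/log/nagios itself and not only the archives subdirectory.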
Does anyone have any ideas on how I can troubleshoot this further, or--better yet--have a solution to this problem?
Thanks in advance!