cannot update mailbox /var/mail/nagios full

monit_burb
Posts: 52
Joined: Fri Sep 23, 2016 3:00 am

Post by monit_burb »

Hi,

Yesterday we started to get hundreds of notifications even though everything seemed OK in the Nagios XI GUI. After checking the logs I saw this message:

cannot update mailbox /var/mail/nagios for user nagios. error writing message: File too large

I cleared the file with the following command and restarted Postfix:

>/var/mail/nagios

That error also appears in the logs a few weeks before yesterday's incident, so I do not know whether it is the cause.

Restarting the service didn't seem to do the trick, so I restarted the whole Nagios server. After about an hour and hundreds more emails, the flood finally stopped and mail started working normally again.

I really have no clue why this happened. The server has been working fine for over a year, but some configuration changes were recently made to make the server CIS compliant, and those might have caused the issue.

Is it normal for /var/mail/nagios to just keep growing and growing? Or did some underlying issue prevent it from being cleared?

Here is an extract of the maillog where I think the root cause is. Emails get bounced because of that unknown user.

Nov  8 08:13:44 ESBARLMONAPP06 postfix/pickup[28434]: EBC73BB: uid=996 from=<nagios>
Nov  8 08:13:44 ESBARLMONAPP06 postfix/cleanup[6775]: EBC73BB: message-id=<[email protected]>
Nov  8 08:13:44 ESBARLMONAPP06 postfix/qmgr[1750]: EBC73BB: from=<[email protected]>, size=779, nrcpt=1 (queue active)
Nov  8 08:13:44 ESBARLMONAPP06 postfix/local[6780]: EBC73BB: to=<[email protected]>, orig_to=<$>, relay=local, delay=0.01, delays=0.01/0/0/0, dsn=5.1.1, status=bounced (unknown user: "$")
Nov  8 08:13:44 ESBARLMONAPP06 postfix/cleanup[6775]: EE0774F2: message-id=<[email protected]>
Nov  8 08:13:44 ESBARLMONAPP06 postfix/bounce[6781]: EBC73BB: sender non-delivery notification: EE0774F2
Nov  8 08:13:44 ESBARLMONAPP06 postfix/qmgr[1750]: EE0774F2: from=<>, size=2712, nrcpt=1 (queue active)
Nov  8 08:13:44 ESBARLMONAPP06 postfix/qmgr[1750]: EBC73BB: removed
Nov  8 08:13:44 ESBARLMONAPP06 postfix/local[6780]: EE0774F2: to=<[email protected]>, relay=local, delay=0.01, delays=0/0/0/0, dsn=2.0.0, status=sent (delivered to mailbox)
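
For anyone triaging a similar log, here is a minimal Python sketch (not part of the original post) that tallies bounced deliveries by original recipient and bounce reason. The regular expressions assume only the line format shown in the extract above; `count_bounces` and `SAMPLE` are hypothetical names used for illustration.

```python
import re
from collections import Counter

# Patterns assumed from the log extract above, not from a Postfix specification.
ORIG_TO_RE = re.compile(r"orig_to=<([^>]*)>")
BOUNCED_RE = re.compile(r"status=bounced \((.*)\)\s*$")


def count_bounces(lines):
    """Count bounced deliveries per (original recipient, reason) pair."""
    counts = Counter()
    for line in lines:
        bounced = BOUNCED_RE.search(line)
        if not bounced:
            continue  # not a bounced delivery line
        orig = ORIG_TO_RE.search(line)
        recipient = orig.group(1) if orig else "(no orig_to)"
        counts[(recipient, bounced.group(1))] += 1
    return counts


# Two delivery lines copied from the extract above (wrapped lines joined).
SAMPLE = [
    'Nov  8 08:13:44 ESBARLMONAPP06 postfix/local[6780]: EBC73BB: '
    'to=<[email protected]>, orig_to=<$>, relay=local, delay=0.01, '
    'delays=0.01/0/0/0, dsn=5.1.1, status=bounced (unknown user: "$")',
    'Nov  8 08:13:44 ESBARLMONAPP06 postfix/local[6780]: EE0774F2: '
    'to=<[email protected]>, relay=local, delay=0.01, delays=0/0/0/0, '
    'dsn=2.0.0, status=sent (delivered to mailbox)',
]

for (recipient, reason), n in count_bounces(SAMPLE).most_common():
    print(f"{n:6d}  orig_to={recipient!r}  reason={reason}")
```

To run it against a real log, pass an open file object to `count_bounces`; the log location (often /var/log/maillog) is an assumption and varies by distribution. A high count for a recipient such as `$` would suggest that something is addressing mail to a literal `$`, which may be worth checking against the notification commands and contact definitions.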
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: cannot update mailbox /var/mail/nagios full

Post by npolovenko »

Hello, @monit_burb. A flood of emails for no apparent reason often points to database corruption. Please run the following command to check and repair the MySQL tables:
mysqlcheck -r -f -uroot -pnagiosxi --all-databases
This command works great to truncate outgoing mail entries in the database:
echo "truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;" | mysql -uroot -pnagiosxi nagiosxi
Also, please check disk usage with df -h and make sure that no partition has run out of space.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
monit_burb

Re: cannot update mailbox /var/mail/nagios full

Post by monit_burb »

npolovenko wrote:Hello, @monit_burb. Flood of emails without a reason often points to a database corruption. Please run the following command to check mysql tables:
mysqlcheck -r -f -uroot -pnagiosxi --all-databases
This command reported OK for all tables.

This command works great to truncate outgoing mail entries in the database:
echo "truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;" | mysql -uroot -pnagiosxi nagiosxi
I got no output from running that command today, but the issue occurred days ago.

Also, please check the hard drive with df -h and make sure that partitions are not maxed out in space.
We have plenty of free space in all partitions of the servers.

It is also worth mentioning that we had a network issue that day, and the Nagios server was most likely unable to connect to the SMTP servers, as it sits at a different site. The mailing issue started about 3 or 4 hours after the network problems were sorted out.

Any idea what else I can check?
npolovenko
Support Tech

Re: cannot update mailbox /var/mail/nagios full

Post by npolovenko »

@monit_burb, if the Nagios server couldn't communicate with the SMTP server for some time, I can see how emails would have been spooled up. When the connection was re-established, all the spooled emails were sent out at once, causing the flooding you described.
As for the "cannot update mailbox /var/mail/nagios" message, I think the nagios mailbox filled up after many services triggered notifications as a result of the network issues. You could increase the size limit of the nagios mailbox, but I think this was a one-time case, and I don't see how a full local mailbox would cause flooding or problems receiving email on regular email accounts.
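
To watch for the mailbox filling up again, a minimal Python sketch (an illustration, not an official Nagios or Postfix tool) can compare the mailbox file size with Postfix's `mailbox_size_limit`. The "File too large" error is consistent with hitting that limit, whose documented default is 51200000 bytes, with 0 meaning no limit. The function names and the 80% warning threshold are assumptions; the limit actually configured on a given server should be confirmed with `postconf mailbox_size_limit`.

```python
import os

# Postfix's documented default for mailbox_size_limit, in bytes.
# Assumption: the server still uses the default; verify with postconf.
DEFAULT_MAILBOX_SIZE_LIMIT = 51_200_000


def mailbox_usage(size_bytes, limit_bytes=DEFAULT_MAILBOX_SIZE_LIMIT):
    """Return the mailbox size as a fraction of the limit (0 means no limit)."""
    if limit_bytes == 0:
        return 0.0
    return size_bytes / limit_bytes


def check_mailbox(path, limit_bytes=DEFAULT_MAILBOX_SIZE_LIMIT, warn_at=0.8):
    """Return (size, usage fraction, warning flag) for the mailbox at path.

    warn_at is a hypothetical threshold chosen for illustration.
    """
    size = os.path.getsize(path)
    usage = mailbox_usage(size, limit_bytes)
    return size, usage, usage >= warn_at
```

Calling `check_mailbox("/var/mail/nagios")` from a cron job or a custom check would flag the file before local delivery starts failing. Raising the limit itself is done in the Postfix configuration; note that Postfix requires `mailbox_size_limit` to be no smaller than `message_size_limit`.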