server was down, now UP but many gzip processes

Post by **zaji_nms** » Tue Dec 31, 2019 7:23 am

Dear Expert

Our server was down for 4 days. Now it is up, but the CPU load is shooting up because of gzip (we fixed it with killall -9 gzip, but had to repeat this command many times). I think NagiosXI is trying to run all the backups from the last 4 days, which were pending while the server was down. Just think of the worst case, where the server is down for more than a week or a month.

I think you should fix this issue. NagiosXI should only show a WARNING message on the web interface (http), the same way it does when the database has crashed. It should not try to run all the backups from the last 4 days (that is meaningless).

Please do the same lab test, version XI 5.x.x

Regards

Post by **mbellerue** » Thu Jan 02, 2020 11:54 am

That does sound pretty bad. What exact version of XI are you running that had this problem?

Post by **zaji_nms** » Sun Jan 05, 2020 1:07 am

Happy New Year to Nagios Team and readers

We experienced the same issue on NagiosXI 5.2.3 and NagiosXI 5.2.7.
It would be better if you did a lab test on your latest XI 5.6.9.

Regards

Post by **lmiltchev** » Mon Jan 06, 2020 11:24 am

I was not able to recreate the issue in-house. After powering on a server with a scheduled daily backup, only one backup (tarball) was created, not several. I didn't want to wait for 4 days, so I just changed the date/time of the server to 4 days in the future.

Also, I only had a "local" backup set up.

Are you sure that your Nagios XI server tried to run backups for the last 4 days? Is it possible that you have different backup types that were run, e.g. Local, SSH, and FTP?

It is hard to imagine that gzip would crash the server or cause such a high CPU load. We haven't experienced this during our tests, and haven't heard of any other users reporting a similar (or the same) issue. I will do some more digging into this, but so far, I haven't had any luck reproducing the problem.
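One way to check whether the gzip processes really come from the backup jobs is to group them by parent process. The snippet below is only a rough sketch: `gzip_by_parent` is a hypothetical helper name, and it assumes a `ps` that accepts `-eo ppid=,comm=` (standard on Linux).

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios XI): count gzip processes per parent PID.
# Reads "PPID COMMAND" pairs on stdin, as produced by: ps -eo ppid=,comm=
gzip_by_parent() {
    # Tally lines whose command name is exactly "gzip", keyed by parent PID,
    # then print "PPID COUNT" sorted numerically by PPID.
    awk '$2 == "gzip" { n[$1]++ } END { for (p in n) print p, n[p] }' | sort -n
}

# Show which parent PIDs currently own gzip processes (prints nothing if none).
ps -eo ppid=,comm= | gzip_by_parent
```

Each output line is a parent PID followed by the number of gzip children it has. Running `ps -o pid,etime,args -p <PPID>` on one of those PIDs should show whether the parent is a backup script started by cron or something else.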

Post by **zaji_nms** » Tue Jan 07, 2020 1:07 am

Attn : lmiltchev

Yes, we have Local & SSH backups. (SSH on remote server).

Please note, we had this bad experience two times. I posted to benefit any other user who may face this problem in the future.

Try to simulate this in your lab (a real environment with a big data file, 3GB+). If the backup file is very small, you may not notice it.

If your digging does not find any issue, you can close this post.

Regards

Post by **lmiltchev** » Tue Jan 07, 2020 10:15 am

zaji_nms wrote: "Try to simulate this in your lab (a real environment with a big data file, 3GB+). If the backup file is very small, you may not notice it."

I would like to test this, and file a bug report if it is a bug. This could be an "edge case", and sometimes it is difficult to reproduce the issue when this happens. It would help if we had more information, which can help us lab this in-house.

1. When you say "server was down", do you mean your server (or VM) was powered off? The reason I am asking is that it is possible that apache was not running and you were not able to log in to the web interface, but in that case nagios would still be running in the background. I just want to find out exactly what happened.

2. What is the "Backup Limit" that you are currently using?

3. Do you have any "partial" backups (folders) in your backup location(s)? Can you list your backups so we can see what the "normal" size of the tarballs is?

Example:

```shell
ls -la /store/backups/nagiosxi/
```

4. Do you have anything unusual in "/usr/local/nagiosxi/var/components/scheduledbackups.log" that can point us in the right direction?

5. Have you enabled debugging for scheduled backups in order to troubleshoot the issue?

https://support.nagios.com/kb/article/n ... l-578.html

6. Were the cron jobs running when you were having this issue?

```shell
ps -ef | grep cron | grep -v grep
```

Once I have this info, I will try testing the scheduled backups again in a large environment. Thanks!
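To save some back-and-forth, the checks in the list above could be collected in one pass. This is a minimal sketch, not an official Nagios tool: `collect_backup_info` is a hypothetical function name, the default paths are the ones quoted in this thread, and the 50-line tail is an arbitrary choice.

```shell
#!/bin/sh
# Sketch only: gather the backup diagnostics requested above into one report.
# Arguments (both optional): backup directory, scheduled backups log file.
collect_backup_info() {
    backup_dir=${1:-/store/backups/nagiosxi}
    log_file=${2:-/usr/local/nagiosxi/var/components/scheduledbackups.log}

    echo "== Backups in $backup_dir =="
    # Errors (for example a missing directory) go into the report as well.
    ls -la "$backup_dir" 2>&1

    echo "== Last 50 lines of $log_file =="
    if [ -r "$log_file" ]; then
        tail -n 50 "$log_file"
    else
        echo "(log file not readable)"
    fi

    echo "== cron processes =="
    ps -ef | grep cron | grep -v grep || echo "(none found)"

    echo "== gzip processes =="
    ps -ef | grep gzip | grep -v grep || echo "(none found)"
}

# Run with the defaults, or pass a different backup directory and log file.
collect_backup_info "$@"
```

Redirecting the output to a file (for example `sh collect.sh > xi-backup-report.txt`, where the script name is up to you) gives a single text file that can be attached to a reply.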

Nagios Support Forum
