Re: Lookback period issue regression in 1.4

Posted: Tue Jan 26, 2016 4:23 pm
by weveland
Well, Stallman is a silly hipster hippie, so there!

Umm no, plenty of disk space.

Code:

[root@nagiosls ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
rootfs           99G   58G   40G  60% /
devtmpfs         16G  156K   16G   1% /dev
tmpfs            16G     0   16G   0% /dev/shm
/dev/sda1        99G   58G   40G  60% /
/dev/sdb       1008G  106G  852G  12% /logdump
/dev/sdc        739G   47G  655G   7% /backups

(Can't we just get a fixed-width font here, for the love of all that is holy???)

Re: Lookback period issue regression in 1.4

Posted: Tue Jan 26, 2016 4:28 pm
by weveland
I know what I said was tantamount to blasphemy. But I said it. So there!

Re: Lookback period issue regression in 1.4

Posted: Wed Jan 27, 2016 12:56 pm
by weveland
Oh come on guys. I didn't upset you that much did I?

Re: Lookback period issue regression in 1.4

Posted: Wed Jan 27, 2016 1:04 pm
by jolson
weveland wrote: "Oh come on guys. I didn't upset you that much did I?"

It hurt so much that I switched from vim to emacs :(

I don't know what's wrong with the Administration page - there are no obvious problems. My best guess is that the kibana-int database is somehow different than it was before this happened - was there a particular event that caused this to begin failing?

I'd like you to back up your config backups, just in case we wind up needing to restore from one of them:

Code:

cp /store/backups/nagioslogserver/* ~

Could I have you open a second thread for this issue? We can tackle the lookback regression here, and then we can tackle the disappearing Administration screen in the second thread. Thanks Wayne!
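
If you'd rather not dump everything straight into your home directory, here is a minimal sketch of the same copy into a timestamped folder. The function name backup_copy and the folder naming are made up for illustration and are not part of Nagios Log Server; only the source path comes from the command above.

```shell
# Sketch only: copy a backup directory into a new timestamped folder.
# backup_copy and the destination naming are hypothetical helpers.
backup_copy() {
    src="$1"        # directory to copy, e.g. /store/backups/nagioslogserver
    dest_root="$2"  # where the timestamped folder is created, e.g. $HOME
    dest="$dest_root/nagioslogserver-backups-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest" || return 1
    # -a recurses into subdirectories and preserves timestamps and permissions
    cp -a "$src"/. "$dest"/ || return 1
    # print the folder that was created so it can be checked afterwards
    printf '%s\n' "$dest"
}
```

Run it as: backup_copy /store/backups/nagioslogserver "$HOME". It prints the folder it created, so you can confirm the files are there before we change anything.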

Re: Lookback period issue regression in 1.4

Posted: Wed Jan 27, 2016 1:09 pm
by weveland
emacs, bloody hell. I'm sorry but we can't be friends anymore.

What's next? Windows??

Sigh..

I will open a new topic.

Re: Lookback period issue regression in 1.4

Posted: Wed Jan 27, 2016 2:02 pm
by jolson
Regarding the alert misses, I think that it would be a good idea to disable your backup system for a few days (set the interval to 3 days or so under Administration -> Command Subsystem). After the backups have been paused, I'm interested in seeing if the alert subsystem continues to misfire. I am wondering if the new backup system interferes with the alert subsystem.

Thanks Wayne!

Re: Lookback period issue regression in 1.4

Posted: Wed Jan 27, 2016 2:06 pm
by weveland
I'd love to. But the Administration panel is missing, remember?

-W

Re: Lookback period issue regression in 1.4

Posted: Wed Jan 27, 2016 2:08 pm
by jolson
I do - so we'll get that fixed before proceeding with the lookback regression. Now that we have a theory I'll see if I can't replicate the problem in the lab while we work on the Administration panel.

Re: Lookback period issue regression in 1.4

Posted: Fri Jan 29, 2016 11:10 am
by weveland
Gentlemen,

Changing the PHP max RAM allocation didn't fix the issue. The alert still fired this morning at around the same time.

Down 6:45 AM - Recovery 7:00 AM

Re: Lookback period issue regression in 1.4

Posted: Fri Jan 29, 2016 3:08 pm
by jolson
Alright, in that case could you temporarily stop your backup system from firing, for perhaps a day or so, to see whether the backup system is impacting the alert subsystem? To do this, go to the command subsystem and schedule the backup_maintenance command a few days out. If that doesn't help, we'll investigate why. Thanks!