Lookback period issue regression in 1.4

weveland
Posts: 125
Joined: Tue Aug 11, 2015 4:10 pm
Location: cat /dev/urandom > /dev/sda

Lookback period issue regression in 1.4

Post by weveland »

Sorry for the clickbait-style title, but it looks like the lookback period issue I reported with the previous release may have returned. After upgrading to 1.4, I'm getting alerts for checks with lengthy lookback periods. I can view the alert in the dashboard and see the matching results there, but the check itself returns 0 events, hence the alarm.

--
Wayne
weveland
Posts: 125
Joined: Tue Aug 11, 2015 4:10 pm
Location: cat /dev/urandom > /dev/sda

Re: Lookback period issue regression in 1.4

Post by weveland »

OK, deactivating and reactivating the alert got it working properly, so my initial assessment may have been incorrect. Any ideas what could have happened here?
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Lookback period issue regression in 1.4

Post by rkennedy »

Something may have changed with the configuration. Are you still seeing any alerts, or is it functioning as expected?
Former Nagios Employee
weveland
Posts: 125
Joined: Tue Aug 11, 2015 4:10 pm
Location: cat /dev/urandom > /dev/sda

Re: Lookback period issue regression in 1.4

Post by weveland »

It seems to be fine at the moment.
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Lookback period issue regression in 1.4

Post by jolson »

weveland wrote: "Any ideas what could have happened here?"
Not sure - is it possible that your single alert has been misbehaving since before the fix was put in place? The following fix was implemented:
Fixed alert run end time slight offset on slow systems

The above bug was caused by an inconsistency between the time that alerts were scheduled to run and the time they actually ran. An alert could therefore skip a slice of time, which may result in a missed alert here and there. Credit to @Jklre for pointing this out.
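To make the offset concrete, here is a minimal sketch of how a late run can leave a slice of time unqueried. It is not Nagios Log Server's actual code: the function names, the dates, the 20-second delay, and the choice of a lookback equal to the check interval are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def query_windows(run_times, lookback):
    """Return the (start, end) time window each run would query."""
    return [(t - lookback, t) for t in run_times]


def uncovered_gaps(windows):
    """Return (gap_start, gap_end) spans that no window covers."""
    gaps = []
    covered_until = windows[0][1]
    for start, end in windows[1:]:
        if start > covered_until:
            gaps.append((covered_until, start))
        covered_until = max(covered_until, end)
    return gaps


# Hypothetical settings: lookback equal to the check interval.
interval = timedelta(minutes=5)
lookback = timedelta(minutes=5)
first = datetime(2016, 1, 1, 6, 0, 0)  # arbitrary example date

scheduled = [first + i * interval for i in range(4)]

# Hypothetical slow system: the third run starts 20 seconds late.
delays = [timedelta(0), timedelta(0), timedelta(seconds=20), timedelta(0)]
actual = [s + d for s, d in zip(scheduled, delays)]

# Windows anchored on the scheduled times tile the timeline with no gaps.
gaps_scheduled = uncovered_gaps(query_windows(scheduled, lookback))

# Windows anchored on the actual run times skip 06:05:00 to 06:05:20.
gaps_actual = uncovered_gaps(query_windows(actual, lookback))

print(gaps_scheduled)
print(gaps_actual)
```

Anchoring each window on the scheduled time rather than the actual start time is one way to avoid the gap. When the lookback is longer than the check interval, consecutive windows overlap and this sketch reports no gaps at all.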

After the alert subsystem was upgraded, something must have happened to cause your alerts to regress as they did - have you noticed any sort of inconsistency since making this post?
weveland
Posts: 125
Joined: Tue Aug 11, 2015 4:10 pm
Location: cat /dev/urandom > /dev/sda

Re: Lookback period issue regression in 1.4

Post by weveland »

Not so far. Before disabling and re-enabling the alert I did try manually running it a few times, so maybe it was modified in the config files. Then, when I deactivated and re-activated it, it was pulled from the database and written out correctly.

That's my line of thought.
hsmith
Agent Smith
Posts: 3539
Joined: Thu Jul 30, 2015 11:09 am
Location: 127.0.0.1

Re: Lookback period issue regression in 1.4

Post by hsmith »

Let's monitor it for a couple of days to see if it comes back.
Former Nagios Employee.
weveland
Posts: 125
Joined: Tue Aug 11, 2015 4:10 pm
Location: cat /dev/urandom > /dev/sda

Re: Lookback period issue regression in 1.4

Post by weveland »

Unfortunately, the same issue occurred again this morning, with the alert firing at a slightly different time.

Yesterday: 6:31 AM
Had to deactivate and reactivate alarm to clear and return OK.

Today: 6:54 AM
Just had to re-run check manually and alarm returned OK.
hsmith
Agent Smith
Posts: 3539
Joined: Thu Jul 30, 2015 11:09 am
Location: 127.0.0.1

Re: Lookback period issue regression in 1.4

Post by hsmith »

How long are your check intervals/lookback set to? I want to test this on my end.
Former Nagios Employee.
weveland
Posts: 125
Joined: Tue Aug 11, 2015 4:10 pm
Location: cat /dev/urandom > /dev/sda

Re: Lookback period issue regression in 1.4

Post by weveland »

Checks run every 5 minutes. Lookback period is 5 hours.

As a side note, I didn't get an alarm this morning.