Lookback period issue regression in 1.4

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
hsmith
Agent Smith
Posts: 3539
Joined: Thu Jul 30, 2015 11:09 am
Location: 127.0.0.1

Re: Lookback period issue regression in 1.4

Post by hsmith »

I've set up my server to match. I'll see what kind of alerts I get over the weekend and post back Monday. Jesse will also be back on Monday, so we can discuss this with him as well. It seems you two are getting to know each other pretty well!
Former Nagios Employee.
me.
weveland
Posts: 125
Joined: Tue Aug 11, 2015 4:10 pm
Location: cat /dev/urandom > /dev/sda

Re: Lookback period issue regression in 1.4

Post by weveland »

He's a bright kid. Destined for great things.

I look forward to hearing from you on Monday.
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Lookback period issue regression in 1.4

Post by jolson »

Hey Wayne, long time no see!

I spoke with the developers regarding this issue, and they'll need the following information to diagnose it:

-A screenshot of one of the problem alerts, including all of the settings that you have specifically set.

-The following files: /var/www/html/nagioslogserver/application/helpers/data_helper.php and /var/www/html/nagioslogserver/html/application/helpers/data_helper.php

-Do these alerts always trigger at a particular hour/minute, or does it seem random? I'm guessing rather random based on your output in this thread so far.

-I'm wondering if the missed triggers relate to a timezone offset in some way?

Let us know the results of the above and we'll get back to you. Thanks!
Twits Blog
Show me a man who lives alone and has a perpetually clean kitchen, and 8 times out of 9 I'll show you a man with detestable spiritual qualities.
weveland
Posts: 125
Joined: Tue Aug 11, 2015 4:10 pm
Location: cat /dev/urandom > /dev/sda

Re: Lookback period issue regression in 1.4

Post by weveland »

Mr. Olson & Agent Smith,

Sorry I didn't get back to you yesterday; I was out of the office. The decryption phrase will be in your PMs, Mr. Olson.
jolson wrote: -A screenshot of one of the problem alerts, including all of the settings that you have specifically set.
You will find the screenshots in the attached archive.
jolson wrote: -The following files: /var/www/html/nagioslogserver/application/helpers/data_helper.php and /var/www/html/nagioslogserver/html/application/helpers/data_helper.php
The first file exists and is in the archive; the second file does not exist.
jolson wrote: -Do these alerts always trigger at a particular hour/minute, or does it seem random? I'm guessing rather random based on your output in this thread so far.
It appears to happen in the time leading up to 7:00 AM EST. Strangely, that is also when the daily backup runs (at 7:00 AM). I've included a chronological list of the alerts in the archive so you can compare times.
jolson wrote: -I'm wondering if the missed triggers relate to a timezone offset in some way?
Anything is possible.
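
If an offset is involved, it may help to line the local times up against UTC, since Logstash stores @timestamp in UTC. A rough Python sketch of that conversion follows; it assumes a fixed EST offset of UTC-5, and the date and the helper name `to_utc` are only illustrative.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset EST (UTC-5); assumes no daylight saving is in effect.
EST = timezone(timedelta(hours=-5), "EST")

def to_utc(local_dt):
    """Convert a timezone-aware datetime to UTC."""
    return local_dt.astimezone(timezone.utc)

# 7:00 AM EST backup time; the calendar date is illustrative only.
backup_start = datetime(2016, 1, 26, 7, 0, tzinfo=EST)
print(to_utc(backup_start).isoformat())  # 2016-01-26T12:00:00+00:00
```

So alerts clustering just before 7:00 AM EST would show up just before 12:00 in UTC-based timestamps.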

-W
nagsupport.zip
weveland
Posts: 125
Joined: Tue Aug 11, 2015 4:10 pm
Location: cat /dev/urandom > /dev/sda

Re: Lookback period issue regression in 1.4

Post by weveland »

Another side note, and a more urgent one: the admin panel is now just a blank page. Restarting the individual components (httpd, logstash, elasticsearch) didn't help, so I restarted the whole server.
Still the same effect: Administration => blank page.
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Lookback period issue regression in 1.4

Post by jolson »

The kibana-int database is in control of the loading of the Administration panel, and it's possible that it has an unassigned shard or similar - try the following command out:

curl -s 'localhost:9200/_cluster/health?level=indices&pretty'| grep kibana -A9
If the kibana-int database is healthy, check the apache logs after clicking 'Administration' - anything relevant in those logs?
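
If grep's fixed nine-line window gets awkward, the same check can be sketched in Python. This is only a rough sketch: `fetch_health` and `kibana_indices` are names made up for this example, and the sample dictionary is a trimmed-down, hypothetical stand-in for what the health endpoint returns with level=indices.

```python
import json
from urllib.request import urlopen

# Same endpoint as the curl command above.
HEALTH_URL = "http://localhost:9200/_cluster/health?level=indices"

def fetch_health(url=HEALTH_URL):
    """Fetch cluster health as a dict (needs a reachable Elasticsearch node)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)

def kibana_indices(health):
    """Return {index_name: index_health} for indices whose name contains 'kibana'."""
    return {
        name: info
        for name, info in health.get("indices", {}).items()
        if "kibana" in name
    }

# Hypothetical, trimmed-down sample so the sketch runs without a server.
sample = {
    "status": "yellow",
    "indices": {
        "kibana-int": {"status": "yellow", "unassigned_shards": 5},
        "logstash-2016.01.26": {"status": "yellow", "unassigned_shards": 5},
    },
}

for name, info in kibana_indices(sample).items():
    print(name, info["status"], "unassigned:", info["unassigned_shards"])
```

Against a live node, `kibana_indices(fetch_health())` would take the place of the sample.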

Also, there are noted problems with Internet Explorer + Nagios Log Server - be sure you're using Firefox or Google Chrome.

Thanks Wayne! I'm still working on the original problem as related to this thread. I thought we had reproduced it in our lab, but unfortunately it was just a mis-firing script that caused our alert process to fail. I'll be coming back with more information when I've discussed the problem further with our developers.
weveland
Posts: 125
Joined: Tue Aug 11, 2015 4:10 pm
Location: cat /dev/urandom > /dev/sda

Re: Lookback period issue regression in 1.4

Post by weveland »

People still use Internet Explorer/Microsoft Edge?

"kibana-int" : {
  "status" : "yellow",
  "number_of_shards" : 5,
  "number_of_replicas" : 1,
  "active_primary_shards" : 5,
  "active_shards" : 5,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 5
},

Status of system has been yellow since day 1 because it's a single host and not a cluster.
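
The numbers above are consistent with that: all 5 primaries are active, and the 5 unassigned shards match the 5 replica copies (5 shards x 1 replica) that have no second node to live on. A minimal Python sketch of that arithmetic, using the values pasted above; the function name is made up for illustration.

```python
def yellow_is_replicas_only(index_health):
    """Return True when every primary shard is active and the only
    unassigned shards are replica copies (the usual single-node case)."""
    primaries_ok = (
        index_health["active_primary_shards"] == index_health["number_of_shards"]
    )
    expected_replicas = (
        index_health["number_of_shards"] * index_health["number_of_replicas"]
    )
    return primaries_ok and index_health["unassigned_shards"] == expected_replicas

# Values copied from the kibana-int output above.
kibana_int = {
    "status": "yellow",
    "number_of_shards": 5,
    "number_of_replicas": 1,
    "active_primary_shards": 5,
    "active_shards": 5,
    "relocating_shards": 0,
    "initializing_shards": 0,
    "unassigned_shards": 5,
}

print(yellow_is_replicas_only(kibana_int))  # True
```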
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Lookback period issue regression in 1.4

Post by jolson »

weveland wrote: People still use Internet Explorer/Microsoft Edge?
I am of the opinion that everyone should use elinks.
weveland wrote: Status of system has been yellow since day 1 because it's a single host and not a cluster.
Yup, the kibana-int database seems fine. Anything notable in the httpd logs?

tail -n50 /var/log/httpd/*log
Does the behavior change between HTTP/HTTPS?
weveland
Posts: 125
Joined: Tue Aug 11, 2015 4:10 pm
Location: cat /dev/urandom > /dev/sda

Re: Lookback period issue regression in 1.4

Post by weveland »

Nobody uses elinks!

==> /var/log/httpd/ssl_access_log <==
172.16.140.254 - - [26/Jan/2016:16:01:44 -0500] "GET /nagioslogserver/admin HTTP/1.1" 500 -

==> /var/log/httpd/ssl_request_log <==
[26/Jan/2016:16:01:44 -0500] 172.16.140.254 TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 "GET /nagioslogserver/admin HTTP/1.1" -

==> /var/log/httpd/access_log <==
172.16.140.254 - - [26/Jan/2016:16:01:49 -0500] "GET /nagioslogserver/admin HTTP/1.1" 500 - "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:43.0) Gecko/20100101 Firefox/43.0"
172.16.140.254 - - [26/Jan/2016:16:01:56 -0500] "GET /nagioslogserver/admin HTTP/1.1" 500 - "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:43.0) Gecko/20100101 Firefox/43.0"
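
To pull only the failing requests out of a longer tail, something like the following Python sketch could be used. It assumes the usual common/combined access-log layout seen in the lines above; `server_errors` is a made-up helper name, and lines in other formats (such as the ssl_request_log entry) are simply skipped.

```python
import re

# host ident user [timestamp] "request" status ...
ACCESS_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) '
)

def server_errors(lines):
    """Return (timestamp, request, status) for every 5xx access-log line."""
    hits = []
    for line in lines:
        match = ACCESS_LINE.match(line)
        if match and match.group("status").startswith("5"):
            hits.append(
                (match.group("time"), match.group("request"), int(match.group("status")))
            )
    return hits

# Lines copied from the output above.
sample = [
    '==> /var/log/httpd/ssl_access_log <==',
    '172.16.140.254 - - [26/Jan/2016:16:01:44 -0500] "GET /nagioslogserver/admin HTTP/1.1" 500 -',
    '[26/Jan/2016:16:01:44 -0500] 172.16.140.254 TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 "GET /nagioslogserver/admin HTTP/1.1" -',
    '172.16.140.254 - - [26/Jan/2016:16:01:49 -0500] "GET /nagioslogserver/admin HTTP/1.1" 500 - "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:43.0) Gecko/20100101 Firefox/43.0"',
    '172.16.140.254 - - [26/Jan/2016:16:01:56 -0500] "GET /nagioslogserver/admin HTTP/1.1" 500 - "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:43.0) Gecko/20100101 Firefox/43.0"',
]

for when, request, status in server_errors(sample):
    print(status, when, request)
```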
hsmith
Agent Smith
Posts: 3539
Joined: Thu Jul 30, 2015 11:09 am
Location: 127.0.0.1

Re: Lookback period issue regression in 1.4

Post by hsmith »

weveland wrote: Nobody uses elinks!
Stallman uses elinks.

This is probably an obvious question, but you didn't run out of disk space, did you?
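
A quick way to check from Python, as a rough sketch: `percent_used` is a made-up helper, and "/" is only a placeholder, since which filesystem actually holds the Log Server data on your box is an assumption here.

```python
import shutil

def percent_used(path="/"):
    """Return how full the filesystem containing `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total

# "/" is a placeholder; point this at the filesystems that matter on your host.
for mount in ("/",):
    print(f"{mount}: {percent_used(mount):.1f}% used")
```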