host freshness alert – Nagios LS 2.1.7

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
dariusz.nalazek
Posts: 39
Joined: Thu Nov 16, 2017 6:46 am

host freshness alert – Nagios LS 2.1.7

Post by dariusz.nalazek »

Hello.

Nagios LS 2.1.7 has started reporting host freshness alerts, which is very nice (the earlier version did not recognize host freshness).

But…
The data attached to the alert mail looks random, or at least not relevant to the alert.

CRITICAL: 4 non-sending hosts found |hosts=4;0;0
See the last 24h in the Nagios Log Server dashboard.

1) As a test I shut down 2 hosts, and during the test I also had 1 real (unplanned) case. LS reported 4 hosts not sending data instead of 3. After I fixed the 3 hosts (2 test + 1 real), the system returned to the “green” state and reported 0 hosts not sending.



2) I was trying to figure out which 1-2 servers were not sending data (the real case). The link in the alert mail (Nagios Log Server dashboard) points to a JSON dictionary with random data, so it is useless for finding the source of the alert. Luckily the host report page works well and has correct data, so it was possible to find the malfunctioning host. However, the alert reported 4 servers while the host report page showed 3.

Data under the link "Nagios Log Server dashboard." in the alert mail:
(JSON data from the mail, header only)

Code:

{
    "took": 36,
    "timed_out": false,
    "_shards": {
        "total": 10,
        "successful": 10,
        "failed": 0
    },
    "hits": {
        "total": 12684468,
        "max_score": 1,
        "hits": [
            {….
2+1 non-sending hosts… the JSON says “10”, and none of the 10 entries/dictionaries has even one field pointing to a correct, non-sending server.
The JSON looks like random data, not relevant to the alert :(

DarekN.
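
A note on reading the header quoted above: the field names suggest a standard Elasticsearch search response. If that assumption holds, `_shards.total` is the number of index shards that were queried and `hits.total` is the number of log events that matched, so neither value is a host count. The sketch below only illustrates that reading. The `summarize` helper, the single hit, and its `host` field are invented for the example and are not taken from the alert mail.

```python
import json

# Trimmed, hypothetical response shaped like the header quoted above.
# The single hit and its "host" field are invented for illustration.
raw = """
{
    "took": 36,
    "timed_out": false,
    "_shards": {"total": 10, "successful": 10, "failed": 0},
    "hits": {
        "total": 12684468,
        "max_score": 1,
        "hits": [
            {"_source": {"host": "192.0.2.10", "message": "example event"}}
        ]
    }
}
"""


def summarize(response):
    """Return the header fields of a search response as a small dict."""
    return {
        "shards_queried": response["_shards"]["total"],
        "matching_events": response["hits"]["total"],
        "events_returned": len(response["hits"]["hits"]),
    }


summary = summarize(json.loads(raw))
for name, value in summary.items():
    print(f"{name}: {value}")
```

Read this way, the header describes a search that matched about 12.7 million events across 10 shards. That is consistent with the link returning ordinary log events rather than a list of silent hosts.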
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: host freshness alert – Nagios LS 2.1.7

Post by cdienger »

Was the JSON message cut off? If so, can you provide the full message?

Are you able to reliably reproduce this? I'll have to look into it and it may be necessary to try and replicate.

Please PM me a profile from the system. It can be gathered under Admin > System > System Status > Download System Profile or from the command line with:

Code:

/usr/local/nagioslogserver/scripts/profile.sh
This will create /tmp/system-profile.tar.gz.

Note that this file can be very large and may not be able to be uploaded through the system. This is usually due to the logs in the Logstash and/or Elasticsearch directories found in it. If it is too large, please open the profile, extract these directories/files and send them separately.

I'd also like to get a copy of the current settings index. This can be gathered by running:

Code:

curl -XPOST http://localhost:9200/nagioslogserver/_export?path=/tmp/nagioslogserver.tar.gz
The file it creates and that we'd like to see is /tmp/nagioslogserver.tar.gz.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
dariusz.nalazek
Posts: 39
Joined: Thu Nov 16, 2017 6:46 am

Re: host freshness alert – Nagios LS 2.1.7

Post by dariusz.nalazek »

Yes, I'll send the files by PM.

Yesterday I ran the test again.
This time the number of missing hosts was correct (2 of 2).
The JSON data in the mail looks like random data again.

The alert check runs, and sends the alert mail, once per hour.
The report of non-sending hosts covers 24h...
So if I want to identify an unknown host that is not sending data to LS, I have to wait up to 24h for the daily report to find the missing host.
As an alternative I could check each host's log stream one by one, but that is inefficient and really time consuming (for now I have 200+ hosts, but after the first implementation phase of Nagios LS there will be at least 10 times more).

DarekN.
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: host freshness alert – Nagios LS 2.1.7

Post by cdienger »

Thanks for the data. Since the behavior occurs after 24 hours I don't have an update at this time but will be looking into it.
dariusz.nalazek
Posts: 39
Joined: Thu Nov 16, 2017 6:46 am

Re: host freshness alert – Nagios LS 2.1.7

Post by dariusz.nalazek »

cdienger wrote: Since the behavior occurs after 24 hours
Nagios LS recognizes missing hosts after (up to) 1h and sends an alert mail, but the mail links to JSON data that is not relevant to the alert (some 10 random log events in JSON format, e.g. https://nagios_ls_ip_/nagioslogserver/alerts/show_custom/AXOrgn8pIZZit06_iL-b).
After 24h the missing hosts are shown in https://nagios_ls_ip_/nagioslogserver/reports/hosts

It would be great to be able to find the missing hosts right after receiving the alert mail...

For example:
5:10 AM - some host stops sending logs.
Host freshness is checked every hour, so at 6:00 AM or at 7:00 AM I should receive an alert mail (with a link to some random JSON data).
Every following hour, at 8:00, 9:00, 10:00, and so on, I'll receive a similar alert mail with random JSON data.
After 24h, and after receiving xx alert emails, the missing host will show up in https://nagios_ls_ip_/nagioslogserver/reports/hosts, so I'll be able to determine which host is the source of the problem.

DarekN.
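
To make the request concrete, here is a minimal sketch of the check being asked for: given the time of the newest event per host, list the hosts that have been silent longer than a chosen window. It is only an illustration and does not show how Nagios Log Server implements host freshness. The `non_sending_hosts` function, the host names, and the timestamps are all hypothetical.

```python
from datetime import datetime, timedelta


def non_sending_hosts(last_seen, now, max_silence):
    """Return hosts whose newest event is older than max_silence, sorted by name."""
    cutoff = now - max_silence
    return sorted(host for host, seen in last_seen.items() if seen < cutoff)


# Hypothetical data: timestamp of the newest event received from each host.
now = datetime(2020, 8, 17, 7, 0)
last_seen = {
    "web01": datetime(2020, 8, 17, 6, 58),
    "db01": datetime(2020, 8, 17, 5, 10),  # stopped sending at 5:10 AM
    "app01": datetime(2020, 8, 17, 6, 59),
}

print(non_sending_hosts(last_seen, now, timedelta(hours=1)))
```

With a one-hour window evaluated at 7:00 AM, only `db01` is listed. Evaluated at 6:00 AM the same host would not be flagged yet, because it had been silent for only 50 minutes, which matches the "6:00 AM or 7:00 AM" uncertainty in the example above.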
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: host freshness alert – Nagios LS 2.1.7

Post by benjaminsmith »

Hi Darius,

Can you post screenshots, or cut and paste the full messages you're getting with the non-relevant JSON data? (I cannot access the links.) I know you have provided some of this, but it would be helpful to have more data for comparison.

If this contains sensitive information, please send it over in a private message. Thanks, Benjamin
Be sure to check out our Knowledgebase for helpful articles and solutions!
dariusz.nalazek
Posts: 39
Joined: Thu Nov 16, 2017 6:46 am

Re: host freshness alert – Nagios LS 2.1.7

Post by dariusz.nalazek »

Support data sent by PM.

The links above only show where I was looking for information (standard LS installation).
I can run one more test (e.g. 48h, over the weekend) and collect data for some more hosts.
Then I'll send the JSON data from the link in the mail plus a screenshot of the LS hosts report, or some more info about it.

DarekN.
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: host freshness alert – Nagios LS 2.1.7

Post by benjaminsmith »

Hi DarekN,
dariusz.nalazek wrote: I can run one more test (e.g. 48h, over the weekend) and collect data for some more hosts.
Then I'll send the JSON data from the link in the mail plus a screenshot of the LS hosts report, or some more info about it.
Thank you, that will be very helpful.

We'll wait for your update.
dariusz.nalazek
Posts: 39
Joined: Thu Nov 16, 2017 6:46 am

Re: host freshness alert – Nagios LS 2.1.7

Post by dariusz.nalazek »

Hello.

I have run an additional test. The result is the same as before, so the behavior looks repeatable.
1) I powered off 4 hosts.
2) After 1h I got an alert mail like the one below:
SERVER_LOG_GONE returned with a CRITICAL state at Mon, 17 Aug 2020 01:32:51 +0200
The alert was processed with the following thresholds:
• Lookback period: 24h
• Warning: 0
• Critical: 0
Here is the full alert output:
CRITICAL: 4 non-sending hosts found |hosts=4;0;0
See the last 24h in the Nagios Log Server dashboard.
Nagios Log Server
3) Every following hour I got an alert mail similar to the one above.
4) I wasn’t able to find out which host is not sending data using the link in the alert mail. I also have no idea how to find the “dead” host other than checking the hosts one by one and looking for a gap in the data, which is annoying when you have hundreds of hosts to check. The link in the alert mail points to JSON data that is irrelevant to the alert.
5) After 24h and xx alert mails, the missing hosts show up in the hosts report, so I could finally verify which hosts had stopped sending logs.
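
For reference, the alert output line in the mail follows the usual Nagios plugin convention: status text, then a `|`, then performance data in the form `label=value;warning;critical`. The sketch below splits the line from the mail along those lines. The `parse_alert_output` helper is hypothetical and only handles the simple shape shown here, not units or min/max fields.

```python
def parse_alert_output(line):
    """Split Nagios-style plugin output into status text and performance data."""
    text, _, perfdata = line.partition("|")
    metrics = {}
    for item in perfdata.split():
        label, _, values = item.partition("=")
        fields = values.split(";")
        fields += [""] * (3 - len(fields))  # pad missing warning/critical fields
        metrics[label] = {
            "value": fields[0],
            "warning": fields[1],
            "critical": fields[2],
        }
    return text.strip(), metrics


text, metrics = parse_alert_output("CRITICAL: 4 non-sending hosts found |hosts=4;0;0")
print(text)
print(metrics)
```

The parsed result carries a count (`hosts` = 4) and the two thresholds of 0 listed in the mail, but no host names, so the output line alone cannot identify which hosts went silent.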
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: host freshness alert – Nagios LS 2.1.7

Post by benjaminsmith »

Hi,

Got it now, that's very helpful, thank you! We're going to file a bug report for the issues here and it will be corrected in the next maintenance release, 2.1.8.

Benjamin