Some alerts not firing
Re: Some alerts not firing
Also I want to mention that other alerts were firing during this time period.
Re: Some alerts not firing
I want you to make the following change.
Edit one of our PHP files (this one helps the alert subsystem out):
Code:
vi /var/www/html/nagioslogserver/application/helpers/data_helper.php
Change this (line 44):
Code:
$range[] = "logstash-" . date('Y.m.d', $start);
To:
Code:
$range[] = "logstash-" . gmdate('Y.m.d', $start);
After the change, you will not need to restart anything. Let me know if your alert consistency improves after performing the above.
Re: Some alerts not firing
I went ahead and made this change. Any ideas about what could be happening, or anything else we can check or look at? Having a 6-hour gap without receiving alerts is a major showstopper and has basically brought our implementation of this product to a complete stop until this is resolved.
Re: Some alerts not firing
This is a high priority issue, and you're not the only person experiencing it - with that said, we're certainly working on resolving it. I appreciate your patience.
After making the above change, did the behavior of your cluster change at all? In a few cases the above change resolved the alert problem entirely, but in some cases it did nothing. I would like to know your experience so that we can further track this bug down.
Re: Some alerts not firing
I'm in the process of validating the alerts from last night, after the change was made. I'll let you know what we find. If you guys need any other information or want us to test something, let us know.
Re: Some alerts not firing
I'm looking forward to whatever you find out! I will let you know if we need you to do any testing. I have had trouble reproducing this bug; is there any chance you have reliable steps to do so?
Re: Some alerts not firing
I've been having trouble replicating this too. I have noticed that more of these happen in the afternoon, around 4:00pm - 6:00pm.
It looks like we are still having these issues after the change. I found another skipped alert on 10/6/15, and I'm still sorting through the week's data. Have you guys made any progress on this? We were planning on rolling this tool out into production, but we can't continue until this is resolved.
Audit logs
Code:
2015-10-06T06:01:03.839-07:00 ALERT Alert ID YPMyn56UTsSPNi9toCW23A returned OK: 0 matching entries found |logs=0;0;0
Field     Value
_id       AVA9Pa3gXZbcqN-U9p3K
_index    nagioslogserver_log
_type     ALERT
created   1444136463839
message   Alert ID YPMyn56UTsSPNi9toCW23A returned OK: 0 matching entries found |logs=0;0;0
source    Nagios Log Server
type      ALERT

2015-10-06T05:45:23.707-07:00 ALERT Alert ID YPMyn56UTsSPNi9toCW23A returned OK: 0 matching entries found |logs=0;0;0
Field     Value
_id       AVA9L1V7XZbcqN-U9pa5
_index    nagioslogserver_log
_type     ALERT
created   1444135523707
message   Alert ID YPMyn56UTsSPNi9toCW23A returned OK: 0 matching entries found |logs=0;0;0
source    Nagios Log Server
type      ALERT
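The `created` values in the two audit entries above are epoch milliseconds, so the spacing between the two alert runs can be worked out directly. A minimal Python sketch of that arithmetic follows; the helper name `run_gap` is made up for illustration and nothing here is Nagios Log Server code.

```python
from datetime import timedelta

def run_gap(earlier_ms, later_ms):
    """Time between two audit entries, given their `created` epoch-millisecond values."""
    return timedelta(milliseconds=later_ms - earlier_ms)

# `created` values copied from the two audit log entries above.
gap = run_gap(1444135523707, 1444136463839)
print(gap)  # 0:15:40.132000
```

If the alert's lookback period were shorter than this gap, entries arriving between runs could in principle go unchecked. The thread does not state the configured period, so this is only a figure to compare against.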
Re: Some alerts not firing
Here's another example, on 10/5 around 5:00pm.
Audit logs
Code:
2015-10-05T17:04:22.643-07:00 ALERT Alert ID AU7VqkIgosxmGFOd5nSZ returned OK: 0 matching entries found |logs=0;0;0
Field     Value
_id       AVA6dpnzXZbcqN-U9f04
_index    nagioslogserver_log
_type     ALERT
created   1444089862643
message   Alert ID AU7VqkIgosxmGFOd5nSZ returned OK: 0 matching entries found |logs=0;0;0
source    Nagios Log Server
type      ALERT

2015-10-05T16:48:45.377-07:00 ALERT Alert ID AU7VqkIgosxmGFOd5nSZ returned OK: 0 matching entries found |logs=0;0;0
Field     Value
_id       AVA6aEzCm6Hshcn6i6Yt
_index    nagioslogserver_log
_type     ALERT
created   1444088925377
message   Alert ID AU7VqkIgosxmGFOd5nSZ returned OK: 0 matching entries found |logs=0;0;0
source    Nagios Log Server
type      ALERT
Dashboard of alert: [attachment]
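The 17:04 entry above falls just after midnight UTC, which is where a local-time date and a UTC date disagree. The Python sketch below illustrates this using the `created` values from the two entries. It assumes the server offset is the -07:00 shown in the audit timestamps and that daily Logstash indices are named by UTC date; `index_names` is a hypothetical helper, not part of the product.

```python
from datetime import datetime, timedelta, timezone

# Assumption: server offset taken from the -07:00 in the audit log timestamps.
SERVER_TZ = timezone(timedelta(hours=-7))

def index_names(created_ms):
    """Return (name built from the local date, name built from the UTC date)."""
    instant = datetime.fromtimestamp(created_ms / 1000, tz=timezone.utc)
    local_name = "logstash-" + instant.astimezone(SERVER_TZ).strftime("%Y.%m.%d")
    utc_name = "logstash-" + instant.strftime("%Y.%m.%d")
    return local_name, utc_name

print(index_names(1444088925377))  # ('logstash-2015.10.05', 'logstash-2015.10.05')
print(index_names(1444089862643))  # ('logstash-2015.10.05', 'logstash-2015.10.06')
```

The second run is the one where the two names differ, which would be consistent with skipped alerts clustering in the late afternoon at a -07:00 offset. It would not by itself explain the 10/6 entries at around 06:00 local time, where both dates agree.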
Re: Some alerts not firing
A new version of Nagios Log Server was released today that could very well deal with the problem you're experiencing.
You can download it here:
https://assets.nagios.com/downloads/nag ... 3.0.tar.gz
Upgrade instructions:
https://assets.nagios.com/downloads/nag ... Server.pdf
Please let me know if your problems persist after the upgrade. In addition to making alerting system fixes, we've refined the backup process further.
Changelog:
- Added ability to re-order table view -SW
- Added "Inspect" icon when using quick search -SW
- Change Audit Log to report Alert Name instead of ID -SW
- Fixed some missing translations -SW
- Fixed problem where index didn't exist before adding it to a query -SW
- Fixed bug where maintenance jobs were not run sequentially, possibly causing indexes to be deleted or closed before being backed up -SW
- Fixed bug where IE was not redirecting window.location properly -SW
- Fixed bug where backup and maintenance process would not always complete all steps by re-ordering steps -SW
- Fixed bug causing incorrect index to be selected for alerts, specifically a problem when server timezone is offset from UTC -SW
- Fixed issue where logrotate had Windows line endings and was giving errors -JO
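The changelog entry about the incorrect index being selected when the server timezone is offset from UTC matches the `date()` to `gmdate()` change suggested earlier in the thread. Only line 44 of `data_helper.php` appears in this thread, so the following is a rough Python sketch of how a list of daily index names for a query window might be built, not the product's actual logic. The name `index_range`, the 15-minute window, and the -07:00 offset are all assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

def index_range(start_ts, end_ts, tz):
    """List daily index names covering [start_ts, end_ts], with dates taken in tz."""
    day = datetime.fromtimestamp(start_ts, tz=tz).date()
    last = datetime.fromtimestamp(end_ts, tz=tz).date()
    names = []
    while day <= last:
        names.append("logstash-" + day.strftime("%Y.%m.%d"))
        day += timedelta(days=1)
    return names

# Hypothetical 15-minute window ending at 2015-10-05T17:04:22-07:00 (00:04:22 UTC on 10/6).
end_ts = 1444089862
start_ts = end_ts - 15 * 60

local_tz = timezone(timedelta(hours=-7))  # assumed server offset
print(index_range(start_ts, end_ts, local_tz))      # ['logstash-2015.10.05']
print(index_range(start_ts, end_ts, timezone.utc))  # ['logstash-2015.10.05', 'logstash-2015.10.06']
```

With local dates the window maps to a single index, while with UTC dates it spans two. If the indices are named by UTC date, a query built from local dates would skip the newer index for the hours between UTC midnight and local midnight.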
Re: Some alerts not firing
Woohoo! I'll install this right away and let you guys know what I find.