We have been having trouble with the job that triggers alerts: it keeps failing. We set everything up to validate the alerts that came in over the weekend against our current systems, and after about an hour the jobs that run the alerts stopped. I reset all jobs at 4:30 pm, and they failed again roughly an hour later. We have 1866 alerts that should be running. Is there anywhere I can look to try to figure this out? Thank you.
Jobs failing.
Re: Jobs failing.
It's possible that all of your alerts take longer than 20 seconds to run in full. Try changing the frequency to every 1 minute and see if that makes a difference. Thank you!
Re: Jobs failing.
jolson wrote: It's possible that all of your alerts take longer than 20 seconds to run in full. Try changing the frequency to every 1 minute and see if that makes a difference. Thank you!
I'm setting it to 1 minute. We will see if it stays stable. Thank you.
Re: Jobs failing.
jolson wrote: That sounds good - let us know if it helps.
Looks like the alert jobs failed again. I set it to run every 3 minutes. I noticed the next run jumped to some random time: it shows 21:48:24, but it is currently 18:53:00.
Re: Jobs failing.
Jklre wrote: Looks like the alert jobs failed again. I set it to run every 3 minutes. I noticed the next run jumped to some random time: it shows 21:48:24, but it is currently 18:53:00.
Weird, I think the NTP time is messed up on these boxes. I just reset the jobs and it's stating the time is 16:42:51.
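For reference, the size of the jump reported above can be worked out with a small shell sketch. The function names `hms_to_seconds` and `seconds_until` are made up for illustration and are not part of Nagios Log Server; the sketch assumes both times fall on the same day and are zero-padded HH:MM:SS.

```shell
# Convert a zero-padded HH:MM:SS string to seconds since midnight.
# The body runs in a subshell, so the IFS change does not leak out.
hms_to_seconds() (
  IFS=:
  set -- $1   # split into $1=HH, $2=MM, $3=SS
  # Strip one leading zero so "08" is not rejected as invalid octal.
  echo $(( ${1#0} * 3600 + ${2#0} * 60 + ${3#0} ))
)

# Seconds from the current time ($1) until the next run ($2), same day assumed.
seconds_until() {
  echo $(( $(hms_to_seconds "$2") - $(hms_to_seconds "$1") ))
}

# Values reported in the thread: current time 18:53:00, next run 21:48:24.
seconds_until 18:53:00 21:48:24   # prints 10524 (2 h 55 min 24 s)
```

A gap of almost three hours on a job scheduled every 3 minutes fits the suspicion that the clock or timezone on the boxes is off, rather than the alerts themselves being slow.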
Re: Jobs failing.
You can try this out to change the timezone on several services at once, if you haven't already:
Code:
cd /usr/local/nagioslogserver/scripts/
./change_timezone.sh -z America/Chicago
Feel free to replace 'America/Chicago' with a location of your choice.
Also, be sure to get NTP sync'd up properly. I would restart all of your processes after the time has been corrected.
Code:
service elasticsearch restart
service logstash restart
service crond restart
service httpd restart
Re: Jobs failing.
jolson wrote: You can try this out to change the timezone on several services at once, if you haven't already:
Code:
cd /usr/local/nagioslogserver/scripts/
./change_timezone.sh -z America/Chicago
Feel free to replace 'America/Chicago' with a location of your choice.
Also, be sure to get NTP sync'd up properly. I would restart all of your processes after the time has been corrected.
Code:
service elasticsearch restart
service logstash restart
service crond restart
service httpd restart
So far it seems stable. I set the timezones and set up NTP sync. The jobs are set to run every 3 minutes, and that seems to be good so far.
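If the restarts need to be repeated on several boxes, the four commands quoted in this thread could be wrapped in a small function. This is only a sketch: the name `restart_all` and the `DRY_RUN` switch are made up for illustration, and stopping at the first failed restart is a choice made here, not something the thread prescribes.

```shell
# Services named in the thread, in the order they were listed there.
SERVICES="elasticsearch logstash crond httpd"

# Restart each service in turn. With DRY_RUN=1 the commands are only
# printed, which makes it easy to check the list before running it.
restart_all() {
  for svc in $SERVICES; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "service $svc restart"
    else
      # Stop at the first failure so a broken service is noticed early.
      service "$svc" restart || { echo "restart of $svc failed" >&2; return 1; }
    fi
  done
}
```

Running `DRY_RUN=1 restart_all` prints the four `service ... restart` commands without executing them; a plain `restart_all` (as root, after the time has been corrected) performs the restarts.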
Re: Jobs failing.
That's great news - I'll keep the thread open in case there are any further problems.