Jobs failing.

This support forum board is for support questions relating to Nagios Log Server, our solution for managing and monitoring critical log data.
Locked
Jklre
Posts: 163
Joined: Wed May 28, 2014 1:56 pm

Jobs failing.

Post by Jklre »

The job that triggers our alerts keeps failing. We set everything up to validate the alerts that came in over the weekend against our current systems, and after about an hour the alert jobs stopped running. I reset all jobs at 4:30pm and they failed again roughly an hour later. We have 1866 alerts that should be running. Is there anywhere I can look to figure this out? Thank you.
ss.jpg
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Jobs failing.

Post by jolson »

It's possible that all of your alerts take longer than 20 seconds to run in full. Try changing the frequency to every 1 minute and see if that makes a difference. Thank you!
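To put rough numbers on that, here is a minimal sketch of the time budget each alert gets at different run frequencies. It assumes the alerts run one after another inside a single job and that the whole batch has to finish before the next run starts. That is an assumption for illustration, not a statement about how Nagios Log Server actually schedules them. The function name `alert_budget_ms` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: milliseconds available per alert if ALL alerts must
# finish within one run interval (assumes strictly sequential execution).
# Usage: alert_budget_ms ALERT_COUNT INTERVAL_SECONDS
alert_budget_ms() {
    awk -v n="$1" -v s="$2" 'BEGIN { printf "%.1f\n", (s * 1000) / n }'
}

# 1866 alerts (from the original post) at 20 s, 1 min and 3 min intervals
for interval in 20 60 180; do
    printf '%ss interval: %s ms per alert\n' \
        "$interval" "$(alert_budget_ms 1866 "$interval")"
done
```

Under that assumption, a 20 second interval leaves only about 10.7 ms per alert, while 1 minute gives roughly 32.2 ms and 3 minutes roughly 96.5 ms.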
Twits Blog
Show me a man who lives alone and has a perpetually clean kitchen, and 8 times out of 9 I'll show you a man with detestable spiritual qualities.
Jklre
Posts: 163
Joined: Wed May 28, 2014 1:56 pm

Re: Jobs failing.

Post by Jklre »

jolson wrote:It's possible that all of your alerts take longer than 20 seconds to run in full. Try changing the frequency to every 1 minute and see if that makes a difference. Thank you!
I'm setting it to 1 minute. We will see if it stays stable. Thank you.
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Jobs failing.

Post by jolson »

That sounds good - let us know if it helps.
Jklre
Posts: 163
Joined: Wed May 28, 2014 1:56 pm

Re: Jobs failing.

Post by Jklre »

jolson wrote:That sounds good - let us know if it helps.
It looks like the alert jobs failed again. I set them to run every 3 minutes. I noticed the next run time jumped to a seemingly random value: it shows 21:48:24, but it is currently 18:53:00.
ss4.jpg
Jklre
Posts: 163
Joined: Wed May 28, 2014 1:56 pm

Re: Jobs failing.

Post by Jklre »

Weird. I think the NTP time is messed up on these boxes. I just reset the jobs and now it's stating the time is 16:42:51.
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Jobs failing.

Post by jolson »

You can try this out to change the timezone on several services at once, if you haven't already:

Code:

cd /usr/local/nagioslogserver/scripts/
./change_timezone.sh -z America/Chicago
Feel free to replace 'America/Chicago' with a location of your choice.

Also, be sure to get NTP sync'd up properly. I would restart all of your processes after the time has been corrected.

Code:

service elasticsearch restart
service logstash restart
service crond restart
service httpd restart
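
As a quick sanity check on the clock problem reported earlier in the thread, the sketch below compares a job's "next run" time with the current time and flags the gap when it is larger than the configured interval. The times and the 3 minute interval are the ones quoted in the thread. The helper names `to_seconds` and `clock_gap` are hypothetical, and the comparison assumes both times fall on the same day.

```shell
#!/bin/sh
# Hypothetical helper: convert HH:MM:SS to seconds since midnight.
# awk treats fields like "08" as decimal, so leading zeros are safe.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Hypothetical helper: seconds from NOW until NEXT_RUN (same day assumed).
# Usage: clock_gap NOW NEXT_RUN
clock_gap() {
    echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

interval=180                          # jobs set to run every 3 minutes
gap=$(clock_gap 18:53:00 21:48:24)    # current time vs. reported next run
if [ "$gap" -gt "$interval" ]; then
    echo "next run is ${gap}s away, more than the ${interval}s interval: check the clock"
else
    echo "next run is ${gap}s away, within the ${interval}s interval"
fi
```

With the values from the thread the gap is 10524 seconds (almost three hours) against a 180 second interval, which fits a clock or timezone mismatch rather than a scheduling delay.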
Jklre
Posts: 163
Joined: Wed May 28, 2014 1:56 pm

Re: Jobs failing.

Post by Jklre »

So far it seems stable. I set the timezones and set up NTP sync. The jobs are set to run every 3 minutes, and that seems to be good so far.
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Jobs failing.

Post by jolson »

That's great news - I'll keep the thread open in case there are any further problems.