Alerts are not being executed automatically according to the "Check interval" parameter.
I have 3 alerts that analyze the logs for delays in email delivery.
Each alert analyzes a 1-hour period of logs and is configured to run every 5 minutes. However, the runs are not occurring. I need a run every 5 minutes so that the problem is cleared automatically in Nagios XI as the environment normalizes.
I tried other parameterizations, such as 1 hour of "Check Interval" and 1 hour of "Search Period", but the analysis only works when run manually.
Apparently an automatic routine is not working.
As shown in the pictures, the last execution was yesterday, when I was setting up the service.
Can you help me fix it?
Thank you.
Alerts are not running automatically
-
ssoliveira
- Posts: 91
- Joined: Wed Dec 07, 2016 6:02 pm
Alerts are not running automatically
Re: Alerts are not running automatically
Change the 0 to "1:" to send a warning if there are no matching lines.
https://nagios-plugins.org/doc/guidelin ... HOLDFORMAT
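For reference, the threshold format linked above treats a value such as "1:" as the range from 1 to infinity and raises an alert when the measured value falls outside that range. A minimal shell sketch of that rule follows. The function name outside_range is hypothetical, only integers are handled, and the inverted "@" form is left out.

```shell
#!/bin/sh
# Sketch of the Nagios plugin range rule for integer values.
# outside_range is a hypothetical helper, not part of any Nagios tool.
# Supported forms: "N" (0..N), "N:" (N..infinity), "~:N" (no lower bound), "N:M".
# Returns 0 (alert) when VALUE is outside the range, 1 otherwise.
outside_range() {
    value=$1
    range=$2
    case $range in
        *:*) start=${range%%:*}; end=${range#*:} ;;
        *)   start=0; end=$range ;;
    esac
    # An omitted start means 0.
    [ -z "$start" ] && start=0
    # "~" as start means no lower bound.
    if [ "$start" != "~" ] && [ "$value" -lt "$start" ]; then
        return 0
    fi
    # An empty end means no upper bound.
    if [ -n "$end" ] && [ "$value" -gt "$end" ]; then
        return 0
    fi
    return 1
}
```

With a threshold of "0", a count of 0 matching lines is inside the range and does not alert; with "1:", a count of 0 is below the range and does.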
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
-
ssoliveira
- Posts: 91
- Joined: Wed Dec 07, 2016 6:02 pm
Re: Alerts are not running automatically
Hi, the problem is not the threshold.
The problem is that the analysis is not being executed automatically; it only works if I click the manual execution icon.
It is set to 1 and still does not execute automatically according to the "Check interval" parameter.
Re: Alerts are not running automatically
What is the status of the run_all_alerts job, and what is its frequency set to, under Administration > System > Command Subsystem? Try clearing the current status with the "Reset All Jobs" button. If it's still failing after this, make sure the cron job is running with the command:
Code: Select all
ps aux | grep jobs
You should see a line like:
Code: Select all
/bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
Also, the command:
Code: Select all
tail -f /usr/local/nagioslogserver/var/jobs.log
should also be run. Every 20 seconds (or as dictated by the frequency set on the run_all_alerts job) you should see an entry like:
Code: Select all
Running command run_alerts with args ' ' for job id: run_all_alerts
If not, then the /var/log/cron log should be reviewed, as well as /etc/cron.d/nagioslogserver. /etc/cron.d/nagioslogserver should contain:
Code: Select all
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller > /usr/local/nagioslogserver/var/poller.log 2>&1
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
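The file checks in this reply can be scripted. The sketch below is only an illustration: has_jobs_entry and modified_within are hypothetical helper names, and the freshness check assumes a find that supports -mmin (GNU and BSD find do; it is not in POSIX). Because the cron entry rewrites jobs.log every minute, a stale modification time suggests cron is not starting the job.

```shell
#!/bin/sh
# Hypothetical helpers for the checks described in this reply.

# Succeeds if the cron fragment FILE has an uncommented line
# that runs "index.php jobs".
has_jobs_entry() {
    grep -Eq '^[^#]*index\.php jobs( |$)' "$1"
}

# Succeeds if FILE was modified less than MINUTES minutes ago.
# Relies on find's -mmin test (GNU and BSD find; not in POSIX).
modified_within() {
    [ -n "$(find "$1" -mmin "-$2" 2>/dev/null)" ]
}

# Example use with the paths from this thread:
#   has_jobs_entry /etc/cron.d/nagioslogserver || echo "jobs entry missing"
#   modified_within /usr/local/nagioslogserver/var/jobs.log 2 || echo "jobs.log is stale"
```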
-
ssoliveira
- Posts: 91
- Joined: Wed Dec 07, 2016 6:02 pm
Re: Alerts are not running automatically
Hello, good evening.
The run_all_alerts job was in "Running" status, with the next run scheduled for the 24th of last month.
After I clicked the "Reset All Jobs" button, the job stayed in "Waiting" status for a few seconds and then changed to "Waiting" with a "Last Run Status" of "SUCCESS".
I ran the "ps aux | grep jobs" command on the 4 servers.
3 of them returned information and 1 did not, according to the following data.
Should there be a job running on all servers? Or is there a concept of a master?
Code: Select all
[root@datalog-ugt-log1 ~]#
[root@datalog-ugt-log1 ~]#
[root@datalog-ugt-log1 ~]# ps aux | grep jobs
nagios 10493 0.0 0.0 113120 1208 ? Ss 20:15 0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var jobs.log 2>&1
nagios 10495 0.1 0.0 262028 14732 ? S 20:15 0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
root 10652 0.0 0.0 112648 968 pts/0 R+ 20:15 0:00 grep --color=auto jobs
[root@datalog-ugt-log1 ~]#
[root@datalog-ugt-log1 ~]#
[root@datalog-ugt-log1 ~]# ssh datalog-ugt-log2
Last login: Thu Aug 31 17:57:30 2017 from 172.23.3.149
[root@datalog-ugt-log2 ~]#
[root@datalog-ugt-log2 ~]# ps aux | grep jobs
nagios 4790 0.0 0.0 113120 1192 ? Ss 20:15 0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var jobs.log 2>&1
nagios 4793 0.1 0.0 240140 14088 ? S 20:15 0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
root 4947 0.0 0.0 112648 968 pts/0 S+ 20:15 0:00 grep --color=auto jobs
[root@datalog-ugt-log2 ~]#
[root@datalog-ugt-log2 ~]#
[root@datalog-ugt-log2 ~]# ssh datalog-utb-log1
Last login: Thu Aug 31 17:23:56 2017 from datalog-ugt-log1
[root@datalog-utb-log1 ~]#
[root@datalog-utb-log1 ~]# ps aux | grep jobs
root 29678 0.0 0.0 112648 968 pts/0 S+ 20:16 0:00 grep --color=auto jobs
[root@datalog-utb-log1 ~]#
[root@datalog-utb-log1 ~]#
[root@datalog-utb-log1 ~]# ssh datalog-utb-log2
Last login: Thu Aug 31 17:50:03 2017 from datalog-ugt-log2
[root@datalog-utb-log2 ~]#
[root@datalog-utb-log2 ~]#
[root@datalog-utb-log2 ~]# ps aux | grep jobs
nagios 25685 0.0 0.0 113120 1208 ? Ss 20:16 0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var jobs.log 2>&1
nagios 25687 0.1 0.0 240140 14088 ? S 20:16 0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
root 25787 0.0 0.0 112648 968 pts/0 S+ 20:16 0:00 grep --color=auto jobs
[root@datalog-utb-log2 ~]#
Running the job manually on the 4 servers:
When I executed it on the first one, datalog-ugt-log1, the command did not return any information and stayed locked (running) for minutes, until I cancelled it with "Ctrl+C".
Is this behavior expected? Does it keep running indefinitely? Or should the request on this page start and end?
Code: Select all
[root@datalog-ugt-log1 ~]#
[root@datalog-ugt-log1 ~]# /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
^C
[root@datalog-ugt-log1 ~]# tail -f /usr/local/nagioslogserver/var/jobs.log
Running command run_alerts with args ' ' for job id: run_all_alerts
SUCCESS
Contents of the /etc/cron.d/nagioslogserver file on the 4 servers:
Code: Select all
[root@datalog-ugt-log1 ~]# ssh datalog-ugt-log1 cat /etc/cron.d/nagioslogserver
# /etc/cron.d/nagioslogserver: crontab fragment for nagioslogserver
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller > /usr/local/nagioslogserver/var/poller.log 2>&1
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
[root@datalog-ugt-log1 ~]# ssh datalog-ugt-log2 cat /etc/cron.d/nagioslogserver
# /etc/cron.d/nagioslogserver: crontab fragment for nagioslogserver
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller > /usr/local/nagioslogserver/var/poller.log 2>&1
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
[root@datalog-ugt-log1 ~]# ssh datalog-utb-log1 cat /etc/cron.d/nagioslogserver
# /etc/cron.d/nagioslogserver: crontab fragment for nagioslogserver
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller > /usr/local/nagioslogserver/var/poller.log 2>&1
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
[root@datalog-ugt-log1 ~]# ssh datalog-utb-log2 cat /etc/cron.d/nagioslogserver
# /etc/cron.d/nagioslogserver: crontab fragment for nagioslogserver
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller > /usr/local/nagioslogserver/var/poller.log 2>&1
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
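The per-server comparison above can be automated. In the sketch below, count_jobs_procs is a hypothetical helper that reads "ps aux" output on standard input and counts the lines belonging to the jobs runner, skipping the "grep --color=auto jobs" line that the search itself produces. It only reports the count; whether every node is expected to run the job is still the open question in this post.

```shell
#!/bin/sh
# count_jobs_procs is a hypothetical helper, not part of Nagios Log Server.
# It reads `ps aux` output on stdin and prints the number of lines that
# run "index.php jobs", ignoring any grep process in the listing.
count_jobs_procs() {
    # grep exits non-zero when nothing matches; "|| true" keeps the
    # function's status at 0 while the printed count is still "0".
    grep -v 'grep' | grep -c 'index\.php jobs' || true
}

# Example use with the host names from this thread (needs ssh access):
#   for host in datalog-ugt-log1 datalog-ugt-log2 datalog-utb-log1 datalog-utb-log2; do
#       printf '%s: %s\n' "$host" "$(ssh "$host" ps aux | count_jobs_procs)"
#   done
```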
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
Re: Alerts are not running automatically
Click the button on the last image you uploaded that says "Reset All Jobs".
Somehow the run_all_alerts job got stuck in a running state.