Alerts are not running automatically
Posted: Tue Sep 19, 2017 1:03 pm
by ssoliveira
Alerts are not being executed automatically according to the "Check interval" parameter.
I have 3 alerts that analyze the logs for delays in email delivery.
Each alert analyzes a 1-hour period of logs and is configured to run every 5 minutes. However, the runs are not occurring. I need a run every 5 minutes so that the problem is cleared automatically in Nagios XI as the environment returns to normal.
I tried other settings, such as 1 hour for "Check Interval" and 1 hour for "Search Period", but the analysis only works when run manually.
Apparently an automatic routine is not working.
As shown in the pictures, the last execution was yesterday, when I was setting up the service.
Can you help me fix it?
Thank you.
Re: Alerts are not running automatically
Posted: Tue Sep 19, 2017 1:20 pm
by cdienger
Change the "0" to "1:" to send a warning if there are no matching lines.
https://nagios-plugins.org/doc/guidelin ... HOLDFORMAT
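To illustrate the difference between the two thresholds, here is a minimal sketch of how a Nagios-style range is evaluated. It assumes integer values and handles only the two forms discussed here: a bare "N" (alert when the value is outside 0..N) and "N:" (alert when the value is below N). The function name threshold_alerts is hypothetical and is not part of any Nagios tool.

```shell
#!/bin/sh
# threshold_alerts VALUE RANGE
# Exit status 0 means "this value triggers an alert", 1 means "no alert".
# Only the range forms "N" and "N:" are handled in this sketch.
threshold_alerts() {
    value=$1
    range=$2
    case $range in
        *:)
            # "N:" means the value is expected to be at least N
            min=${range%:}
            [ "$value" -lt "$min" ]
            ;;
        *)
            # bare "N" means the value is expected to lie in 0..N
            [ "$value" -lt 0 ] || [ "$value" -gt "$range" ]
            ;;
    esac
}

# Example: with 0 matching lines, "0" stays quiet but "1:" alerts.
# threshold_alerts 0 "0"  || echo "no alert with threshold 0"
# threshold_alerts 0 "1:" && echo "alert with threshold 1:"
```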
Re: Alerts are not running automatically
Posted: Tue Sep 19, 2017 6:33 pm
by ssoliveira
Hi, the problem is not the threshold.
The problem is that the analysis is not being executed automatically; it only works if I click the manual execution icon.
It is set to 1, and it still does not execute automatically according to the "Check interval" parameter.
Re: Alerts are not running automatically
Posted: Wed Sep 20, 2017 11:18 am
by cdienger
What is the status of the run_all_alerts job, and what is its frequency set to, under Administration > System > Command Subsystem? Try clearing the current status with the "Reset All Jobs" button. If it's still failing after this, make sure the cron job is running with the command "ps aux | grep jobs".
You should see a line like:
Code: Select all
/bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
Also, the command:
Code: Select all
tail -f /usr/local/nagioslogserver/var/jobs.log
should also be run. Every 20 seconds (or as dictated by the frequency set on the run_all_alerts job) you should see an entry like:
Code: Select all
Running command run_alerts with args ' ' for job id: run_all_alerts
If not, then the /var/log/cron log should be reviewed, as well as /etc/cron.d/nagioslogserver, which should contain:
Code: Select all
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller > /usr/local/nagioslogserver/var/poller.log 2>&1
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
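The two checks above (is the cron entry present, and is jobs.log being rewritten) could be scripted roughly as follows. This is only a sketch: the helper names cron_has_entry and log_is_fresh are hypothetical, the 120-second limit in the usage comment is an arbitrary choice, and `stat -c %Y` assumes GNU coreutils.

```shell
#!/bin/sh
# cron_has_entry FILE SUBCOMMAND
# Succeeds if FILE has an uncommented line that runs "index.php SUBCOMMAND".
cron_has_entry() {
    grep -v '^[[:space:]]*#' "$1" | grep -q "index\.php $2 "
}

# log_is_fresh FILE MAX_AGE_SECONDS
# Succeeds if FILE was modified within the last MAX_AGE_SECONDS seconds.
log_is_fresh() {
    now=$(date +%s)
    mtime=$(stat -c %Y "$1") || return 1   # GNU stat: mtime as epoch seconds
    [ $((now - mtime)) -le "$2" ]
}

# Example usage with the paths from this thread:
# cron_has_entry /etc/cron.d/nagioslogserver jobs || echo "jobs entry missing"
# log_is_fresh /usr/local/nagioslogserver/var/jobs.log 120 || echo "jobs.log is stale"
```

Because the cron entry runs every minute and redirects with ">", jobs.log should be rewritten about once a minute, so a modification time that is several minutes old suggests the cron job is not starting.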
Re: Alerts are not running automatically
Posted: Wed Sep 20, 2017 6:31 pm
by ssoliveira
Hello, good evening.
The run_all_alerts job was in "Running" status, with its next run scheduled for the 24th of last month.
After I clicked the "Reset All Jobs" button, the job stayed in "Waiting" status for a few seconds and then changed to "Waiting" with "Last Run Status" set to "SUCCESS".
I ran the "ps aux | grep jobs" command on the 4 servers.
3 of them returned a jobs process and 1 did not, as the following output shows.
Should there be a job running on all servers? Or is there a concept of a master?
Code: Select all
[root@datalog-ugt-log1 ~]#
[root@datalog-ugt-log1 ~]#
[root@datalog-ugt-log1 ~]# ps aux | grep jobs
nagios 10493 0.0 0.0 113120 1208 ? Ss 20:15 0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios 10495 0.1 0.0 262028 14732 ? S 20:15 0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
root 10652 0.0 0.0 112648 968 pts/0 R+ 20:15 0:00 grep --color=auto jobs
[root@datalog-ugt-log1 ~]#
[root@datalog-ugt-log1 ~]#
[root@datalog-ugt-log1 ~]# ssh datalog-ugt-log2
Last login: Thu Aug 31 17:57:30 2017 from 172.23.3.149
[root@datalog-ugt-log2 ~]#
[root@datalog-ugt-log2 ~]# ps aux | grep jobs
nagios 4790 0.0 0.0 113120 1192 ? Ss 20:15 0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios 4793 0.1 0.0 240140 14088 ? S 20:15 0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
root 4947 0.0 0.0 112648 968 pts/0 S+ 20:15 0:00 grep --color=auto jobs
[root@datalog-ugt-log2 ~]#
[root@datalog-ugt-log2 ~]#
[root@datalog-ugt-log2 ~]# ssh datalog-utb-log1
Last login: Thu Aug 31 17:23:56 2017 from datalog-ugt-log1
[root@datalog-utb-log1 ~]#
[root@datalog-utb-log1 ~]# ps aux | grep jobs
root 29678 0.0 0.0 112648 968 pts/0 S+ 20:16 0:00 grep --color=auto jobs
[root@datalog-utb-log1 ~]#
[root@datalog-utb-log1 ~]#
[root@datalog-utb-log1 ~]# ssh datalog-utb-log2
Last login: Thu Aug 31 17:50:03 2017 from datalog-ugt-log2
[root@datalog-utb-log2 ~]#
[root@datalog-utb-log2 ~]#
[root@datalog-utb-log2 ~]# ps aux | grep jobs
nagios 25685 0.0 0.0 113120 1208 ? Ss 20:16 0:00 /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
nagios 25687 0.1 0.0 240140 14088 ? S 20:16 0:00 /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs
root 25787 0.0 0.0 112648 968 pts/0 S+ 20:16 0:00 grep --color=auto jobs
[root@datalog-utb-log2 ~]#
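The same check can be repeated across all four servers without reading the output by eye. The sketch below assumes passwordless ssh between the hosts, as in the session above; has_jobs_process is a hypothetical helper that reads `ps aux` output on standard input and ignores the grep process itself.

```shell
#!/bin/sh
# has_jobs_process
# Reads `ps aux` output on stdin. Succeeds if any line shows the PHP jobs
# runner, after dropping lines that belong to a grep process.
has_jobs_process() {
    grep -v 'grep' | grep -q 'index\.php jobs'
}

# Example usage with the host names from this thread:
# for host in datalog-ugt-log1 datalog-ugt-log2 datalog-utb-log1 datalog-utb-log2; do
#     if ssh "$host" ps aux | has_jobs_process; then
#         echo "$host: jobs process found"
#     else
#         echo "$host: no jobs process"
#     fi
# done
```

A single snapshot can miss a process that has already exited, so a host reporting "no jobs process" once is worth rechecking before drawing conclusions.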
I ran the job manually on the 4 servers.
When I ran it on the first one, datalog-ugt-log1, the command returned no output and stayed running for minutes until I cancelled it with Ctrl+C.
Is this behavior expected? Does it keep running indefinitely? Or should the request to this page start and end?
Code: Select all
[root@datalog-ugt-log1 ~]#
[root@datalog-ugt-log1 ~]# /bin/sh -c /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
^C
[root@datalog-ugt-log1 ~]# tail -f /usr/local/nagioslogserver/var/jobs.log
Running command run_alerts with args ' ' for job id: run_all_alerts
SUCCESS
Contents of the /etc/cron.d/nagioslogserver file on the 4 servers:
Code: Select all
[root@datalog-ugt-log1 ~]# ssh datalog-ugt-log1 cat /etc/cron.d/nagioslogserver
# /etc/cron.d/nagioslogserver: crontab fragment for nagioslogserver
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller > /usr/local/nagioslogserver/var/poller.log 2>&1
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
[root@datalog-ugt-log1 ~]# ssh datalog-ugt-log2 cat /etc/cron.d/nagioslogserver
# /etc/cron.d/nagioslogserver: crontab fragment for nagioslogserver
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller > /usr/local/nagioslogserver/var/poller.log 2>&1
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
[root@datalog-ugt-log1 ~]# ssh datalog-utb-log1 cat /etc/cron.d/nagioslogserver
# /etc/cron.d/nagioslogserver: crontab fragment for nagioslogserver
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller > /usr/local/nagioslogserver/var/poller.log 2>&1
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
[root@datalog-ugt-log1 ~]# ssh datalog-utb-log2 cat /etc/cron.d/nagioslogserver
# /etc/cron.d/nagioslogserver: crontab fragment for nagioslogserver
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php poller > /usr/local/nagioslogserver/var/poller.log 2>&1
* * * * * nagios /usr/bin/php -q /var/www/html/nagioslogserver/www/index.php jobs > /usr/local/nagioslogserver/var/jobs.log 2>&1
Re: Alerts are not running automatically
Posted: Thu Sep 21, 2017 9:33 am
by scottwilkerson
Click the button that says "Reset All Jobs" in the last image you uploaded.
Somehow the run_all_alerts job got stuck in a running state.