Service check for most services are showing "Not scheduled"
Posted: Thu May 14, 2020 7:57 am
by IT-OPS-SYS
Hi team,
We have a couple of services that show their next check as "Not scheduled".
I can see that these services have notifications and active checks enabled. I'm not sure which setting controls this value. I have attached a screenshot of one of the affected services.
Re: Service check for most services are showing "Not schedul
Posted: Thu May 14, 2020 1:29 pm
by benjaminsmith
Hello @IT-OPS-SYS,
I'd like to check the configurations to make sure active checks are enabled. Please send us a fresh system profile, and let me know the names of the services reporting as not scheduled.
Thanks.
Benjamin
To send us your system profile:
Log in to the Nagios XI GUI using a web browser
Click the "Admin" > "System Profile" menu
Click the "Download Profile" button
Save the profile.zip file, share it in a private message, and then reply to this post to bring it up in the queue.
Re: Service check for most services are showing "Not schedul
Posted: Fri May 15, 2020 1:57 am
by IT-OPS-SYS
The affected service names are mostly:
RDP Session
Service - BESClient
Service - MSSQL Agent
Service - MSSQL Browser
Service - MSSQL Server
Service - MSSQL VSS Writer
Disk Usage - C:
Disk Usage - D:
Disk Usage - E:
Disk Usage - F:
All of the services associated with the hosts below:
bigfixinv
comptoolsql
corpcrmsql
corpemergesql
corprptsqlprod
corprptsqltest
corprptssisprod
corprptssistest
corprptssrsprod
lucix2013a
lucix2013b
malcmcas1
malcmps1
There are actually many more, but these should be enough to understand the issue.
Moderator's Note: The profile has been shared with the support team but has been removed from the public forum.
Re: Service check for most services are showing "Not schedul
Posted: Fri May 15, 2020 4:28 pm
by benjaminsmith
Hi,
Thanks for the profile. In most cases, this type of condition is caused by a problem with the time periods (check_period), which leaves the nagios process unable to determine when to schedule the next check.
There are quite a few errors in the messages log, but those hosts or services are different than the ones listed. Can you verify if the same condition exists for the following:
Code:
Check of service 'CIMC StorageLocalDisk' on host 'orlxen001-cimc' could not be rescheduled properly. Scheduling check for Fri May 15 02:29:00 2020#012.
host 'phlxen001-cimc' could not be rescheduled properly. Scheduling check for Fri May 15 02:29:19 2020#012..
host 'cwpcomma002-cimc' could not be rescheduled properly. Scheduling check for Fri May 15 02:29:49 2020#012.
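To get a quick list of every host or service that has logged this warning, the log can be summarized with grep and sed. This is only a rough sketch: the function name rescheduled_objects is made up for illustration, and the log file to pass in (for example /var/log/messages, or nagios.log if the same warnings are written there) is an assumption to confirm on your system.

```shell
#!/bin/bash
# Hypothetical helper: list the unique hosts/services that logged a
# "could not be rescheduled properly" warning in the given log file.
rescheduled_objects() {
    grep 'could not be rescheduled properly' "$1" \
        | sed -e 's/ could not be rescheduled properly.*//' \
              -e 's/.*Check of //' \
        | sort -u
}

# Example (log path is an assumption):
# rescheduled_objects /var/log/messages
```

Each output line names one object, so the list can be compared directly against the hosts and services showing "Not scheduled".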
Next, let's run the following tail command:
Code:
tail -f /usr/local/nagiosxi/var/cmdsubsys.log
Then force an immediate check from the Quick Actions menu on the details page for the service and post the full output from the tail command to the thread.
Thanks.
Benjamin
Re: Service check for most services are showing "Not schedul
Posted: Mon May 18, 2020 2:14 am
by IT-OPS-SYS
hi benjamin,
here is the output for 3 services which were reporting Next Check as Not scheduled:
[root@cvrmnagiosxi001 ~]# tail -f /usr/local/nagiosxi/var/cmdsubsys.log
.......................................................PROCESSING COMMAND ID 42891...
PROCESS COMMAND: CMD=1132, DATA=/usr/local/nagios/libexec/check_nrpe -H 149.24.38.171 -u -t 90 -c check_qa18mains08_ss -a datatel.admin classr00m
CMDLINE=/usr/local/nagios/libexec/check_nrpe -H 149.24.38.171 -u -t 90 -c check_qa18mains08_ss -a datatel.admin classr00m
OUTPUT=NRPE: Unable to read output
RETURNCODE=3
......
PROCESSED 1 COMMANDS
............................................................
PROCESSED 0 COMMANDS
...........................................................
PROCESSED 0 COMMANDS
..........PROCESSING COMMAND ID 42892...
PROCESS COMMAND: CMD=16, DATA={"host_name":"bigfixinv","service_name":"Service - MSSQL Agent","cmd":54,"start_time":1589785869}
COMMAND DATA: {"host_name":"bigfixinv","service_name":"Service - MSSQL Agent","cmd":54,"start_time":1589785869}
CMDARR:
Array
(
[host_name] => bigfixinv
[service_name] => Service - MSSQL Agent
[cmd] => 54
[start_time] => 1589785869
)
CORE CMD: SCHEDULE_FORCED_SVC_CHECK;bigfixinv;Service - MSSQL Agent;1589785869
SUBMITTING A NAGIOSCORE COMMAND...
.................................................PROCESSING COMMAND ID 42893...
PROCESS COMMAND: CMD=16, DATA={"host_name":"bigfixinv","service_name":"Service - MSSQL Server","cmd":54,"start_time":1589785918}
COMMAND DATA: {"host_name":"bigfixinv","service_name":"Service - MSSQL Server","cmd":54,"start_time":1589785918}
CMDARR:
Array
(
[host_name] => bigfixinv
[service_name] => Service - MSSQL Server
[cmd] => 54
[start_time] => 1589785918
)
CORE CMD: SCHEDULE_FORCED_SVC_CHECK;bigfixinv;Service - MSSQL Server;1589785918
SUBMITTING A NAGIOSCORE COMMAND...
.
PROCESSED 2 COMMANDS
................PROCESSING COMMAND ID 42894...
PROCESS COMMAND: CMD=16, DATA={"host_name":"bigfixinv","service_name":"Service - MSSQL Browser","cmd":54,"start_time":1589785935}
COMMAND DATA: {"host_name":"bigfixinv","service_name":"Service - MSSQL Browser","cmd":54,"start_time":1589785935}
CMDARR:
Array
(
[host_name] => bigfixinv
[service_name] => Service - MSSQL Browser
[cmd] => 54
[start_time] => 1589785935
)
CORE CMD: SCHEDULE_FORCED_SVC_CHECK;bigfixinv;Service - MSSQL Browser;1589785935
SUBMITTING A NAGIOSCORE COMMAND...
Re: Service check for most services are showing "Not schedul
Posted: Mon May 18, 2020 3:00 pm
by benjaminsmith
Hi,
Thanks for posting the logs. The 'OUTPUT=NRPE: Unable to read output' message is a separate issue that is typically unrelated to checks not being scheduled. The rest of the output looks OK: the checks are being passed to the Nagios process.
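For reference, the CORE CMD lines in that log use the standard Nagios Core external command format, "[timestamp] COMMAND;arguments", which is written to the command pipe. Here is a minimal sketch of building the same line by hand. The function name format_forced_check is hypothetical, and the pipe path in the example comment is the usual default, so please confirm it against command_file in nagios.cfg before using it.

```shell
#!/bin/bash
# Hypothetical helper: build a SCHEDULE_FORCED_SVC_CHECK external command line.
# Arguments: host name, service description, check time (epoch seconds).
format_forced_check() {
    local host="$1" service="$2" when="$3"
    printf '[%s] SCHEDULE_FORCED_SVC_CHECK;%s;%s;%s\n' \
        "$when" "$host" "$service" "$when"
}

# Example (assumed default pipe path; run as a user allowed to write to it):
# format_forced_check bigfixinv "Service - MSSQL Agent" "$(date +%s)" \
#     > /usr/local/nagios/var/rw/nagios.cmd
```

If a command submitted this way is accepted but the next check still shows as not scheduled, that would point away from the XI command subsystem and toward the Core scheduling configuration.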
Looking over the configs, I did notice this issue with check_periods. The check_period for the following service is set to us-holidays.
Code:
define service {
host_name blrsan001-cimc
service_description CIMC StorageControllerProps
check_period us-holidays
check_command Cisco IMC monitor!admin!!storageControllerProps!date_of_manufacture!!!!
So, for this service, active checks will only be scheduled during the following times:
Code:
define timeperiod {
timeperiod_name us-holidays
alias U.S. Holidays
name us-holidays
january 1 00:00-00:00
monday 1 september 00:00-00:00
july 4 00:00-00:00
thursday -1 november 00:00-00:00
december 25 00:00-00:00
}
I would recommend using the Bulk Mods Tool to reset any host or service you're having this issue with to a new timeperiod. Let me know if that resolves the issue.
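Before changing anything, it may help to list which objects explicitly use a given check_period. The awk sketch below is one way to do that. The function name objects_with_check_period is hypothetical, and the config directory in the example comment is an assumption about where XI writes its object files. It only sees check_period values set directly on an object, not ones inherited from a template through "use".

```shell
#!/bin/bash
# Hypothetical helper: print "host;service" for every host or service
# definition whose check_period equals the given timeperiod name.
# Usage: objects_with_check_period <timeperiod_name> <config file>...
objects_with_check_period() {
    local period="$1"
    shift
    awk -v period="$period" '
        # Start of a host or service definition: reset collected values.
        /^[ \t]*define[ \t]+(host|service)[ \t]*[{]/ {
            in_def = 1; host = ""; svc = ""; cp = ""; next
        }
        in_def && $1 == "host_name"    { host = $2; next }
        in_def && $1 == "check_period" { cp = $2; next }
        in_def && $1 == "service_description" {
            $1 = ""; sub(/^[ \t]+/, ""); svc = $0; next
        }
        # End of the definition: report it if the period matches.
        in_def && /^[ \t]*[}]/ {
            if (cp == period) print host ";" svc
            in_def = 0
        }
    ' "$@"
}

# Example (directory is an assumption):
# objects_with_check_period us-holidays /usr/local/nagios/etc/services/*.cfg
```

Host definitions are printed with an empty service part, for example "somehost;".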
Also, here are a few commands to verify that the PHP and system time settings are correct.
Code:
# checking time / date on systems
grep "date.timezone" /etc/php.ini
ls -l /etc/localtime
date
Benjamin
Re: Service check for most services are showing "Not schedul
Posted: Tue May 19, 2020 5:36 am
by IT-OPS-SYS
There are hosts that are on the right check_period, as shown below, but still have the same issue:
define host {
host_name corpemergesql
use common-host-windows,vmware-ffx1vc
address corpemergesql.ellucian.com
max_check_attempts 3
check_interval 5
retry_interval 1
check_period xi_timeperiod_24x7
contact_groups ITOPS-P2
notification_interval 0
notification_period xi_timeperiod_24x7
first_notification_delay 5
notification_options d,u,r,f,s,
notifications_enabled 1
_VCENTER_OBJECT corpemergesql
register 1
}
define host {
host_name corpcrmsql
use common-host-windows,vmware-ffx1vc
address corpcrmsql.ellucian.com
max_check_attempts 3
check_interval 5
retry_interval 1
check_period xi_timeperiod_24x7
contact_groups IT-Microsoft-SharePoint-Administrators,ITOPS-P2
notification_interval 0
notification_period xi_timeperiod_24x7
first_notification_delay 5
notification_options d,u,r,f,s,
notifications_enabled 1
_VCENTER_OBJECT corpcrmsql
register 1
}
###############################################################################
When I checked the services as well, they also show the correct check_period:
define service {
host_name comptoolsql
service_description Service - MSSQL Agent
check_command check_nrpe!check_service!-a 'service=SQLAgent$COMPTOOL_SQL' show-all 'perf-config=*(ignored:true)'!!!!!!
max_check_attempts 5
check_interval 10
retry_interval 1
check_period xi_timeperiod_24x7
notification_interval 0
first_notification_delay 0
notification_period xi_timeperiod_24x7
register 1
}
define service {
host_name comptoolsql
service_description Service - MSSQL Server
check_command check_nrpe!check_service!-a 'service=MSSQL$COMPTOOL_SQL' show-all 'perf-config=*(ignored:true)'!!!!!!
max_check_attempts 5
check_interval 10
retry_interval 1
check_period xi_timeperiod_24x7
notification_interval 0
first_notification_delay 0
notification_period xi_timeperiod_24x7
register 1
}
Re: Service check for most services are showing "Not schedul
Posted: Tue May 19, 2020 2:38 pm
by benjaminsmith
Hi,
I just re-checked the configurations on those hosts and they do look correct. I'd like to review the data for those hosts in the status.dat file on the server.
First, let's check the permissions on that file, post the output of the command below, and then upload or PM the file for us to review. Thanks.
Code:
ls -la /usr/local/nagios/var/status.dat
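If you want to look at the data yourself first, the fields of interest are next_check and should_be_scheduled in each servicestatus block of status.dat. Here is a rough sketch for pulling them out for one host. The function name service_schedule is hypothetical, and the file path is passed as an argument because status.dat may live somewhere else on your system.

```shell
#!/bin/bash
# Hypothetical helper: show scheduling fields for every service of one host.
# Usage: service_schedule <host_name> <path to status.dat>
service_schedule() {
    awk -v host="$1" '
        # Start of a service status block: reset collected values.
        /^[ \t]*servicestatus[ \t]*[{]/ {
            in_svc = 1; h = ""; s = ""; nc = ""; sched = ""; next
        }
        # End of the block: print it if it belongs to the requested host.
        in_svc && /^[ \t]*[}]/ {
            if (h == host)
                print s ": next_check=" nc " should_be_scheduled=" sched
            in_svc = 0; next
        }
        # Inside a block: split each "key=value" line at the first "=".
        in_svc {
            line = $0
            sub(/^[ \t]+/, "", line)
            eq = index(line, "=")
            if (eq == 0) next
            key = substr(line, 1, eq - 1)
            val = substr(line, eq + 1)
            if (key == "host_name") h = val
            else if (key == "service_description") s = val
            else if (key == "next_check") nc = val
            else if (key == "should_be_scheduled") sched = val
        }
    ' "$2"
}

# Example (path is an assumption):
# service_schedule bigfixinv /usr/local/nagios/var/status.dat
```

A next_check of 0 or should_be_scheduled=0 for the affected services would be worth noting when you post the results.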
Re: Service check for most services are showing "Not schedul
Posted: Fri May 22, 2020 2:12 am
by IT-OPS-SYS
I don't have this file on the server.
Inside the var directory, I see the files below:
[root@cvrmnagiosxi001 var]# ll
total 165508
drwxrwxr-x 2 nagios nagios 53248 May 22 00:00 archives
-rw-r--r-- 1 root root 26971 Oct 21 2019 et nu
-rw-r--r-- 1 nagios nagios 0 Oct 13 2016 host-perfdata
-rw-r--r-- 1 nagios nagios 21348 May 21 09:54 nagios.configtest
-rw-r--r-- 1 nagios nagios 1664754 May 22 03:11 nagios.log
-rw-rw-r-- 1 nagios nagios 29283439 Jul 12 2019 nagios.tmp50MZEy
-rw-rw-r-- 1 nagios nagios 16319979 Oct 11 2018 nagios.tmpcVNIdw
-rw-rw-r-- 1 nagios nagios 19833946 Sep 6 2019 nagios.tmphcmeqI
-rw-rw-r-- 1 nagios nagios 15760783 Sep 5 2017 nagios.tmpiDhXdg
-rw-rw-r-- 1 nagios nagios 16878715 Jul 28 2018 nagios.tmpmr7j2g
-rw-rw-r-- 1 nagios nagios 19853745 Sep 6 2019 nagios.tmpo3vQM4
-rw-rw-r-- 1 nagios nagios 3591143 Jun 2 2017 nagios.tmpQ9bN9M
-rw-r--r-- 1 nagios nagios 317805 May 22 03:11 ndo2db.debug
-rw-r--r-- 1 nagios nagios 1000202 May 22 03:11 ndo2db.debug.old
-rw-r--r-- 1 nagios nagios 5 May 15 21:19 ndo2db.lock
-rw-r--r-- 1 nagios nagios 0 May 21 09:54 ndomod.tmp
srwxr-xr-x 1 nagios nagios 0 May 15 21:19 ndo.sock
-rw-r--r-- 1 nagios nagios 556048 May 21 07:17 npcd.log
-rw-r--r-- 1 nagios nagios 10485821 Apr 3 07:31 npcd.log.old
-rw-r--r-- 1 nagios nagios 5 May 15 21:19 nrpe.pid
-rw-r--r-- 1 nagios nagios 32419 Oct 13 2016 objects.cache
-rw-r--r-- 1 nagios nagios 11885005 May 21 09:54 objects.precache
-rw-rw-r-- 1 nagios nagios 2693593 May 21 07:17 perfdata.log
-rw------- 1 nagios nagios 18988888 May 22 02:54 retention.dat
drwxrwsr-x 2 nagios nagcmd 39 May 21 09:54 rw
-rw-r--r-- 1 nagios nagios 0 Oct 13 2016 service-perfdata
-rw-r--r-- 1 root root 8192 Oct 22 2019 set nu
drwxr-xr-x 5 nagios nagios 52 Oct 13 2016 spool
drwxr-xr-x 2 nagios nagios 46 May 22 03:11 stats
Re: Service check for most services are showing "Not schedul
Posted: Fri May 22, 2020 11:58 am
by ssax
You have a RAMDisk set up, see here:
Please include the output of these commands as well:
Code:
ls -ld /usr/local/nagios/var/rw
ls -l /usr/local/nagios/var/rw
chage -l nagios
grep nag /etc/group
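With a RAM disk in place, status.dat will not be under /usr/local/nagios/var. Its actual location is whatever status_file is set to in the main Nagios config. A small sketch for reading that value is below. The function name status_file_path is hypothetical, and the nagios.cfg path in the example comment is the usual default, so treat it as an assumption.

```shell
#!/bin/bash
# Hypothetical helper: print the status_file path configured in a nagios.cfg.
# Commented-out lines are ignored because they do not start with the key.
status_file_path() {
    sed -n 's/^[[:space:]]*status_file[[:space:]]*=[[:space:]]*//p' "$1"
}

# Example (config path is an assumption):
# ls -la "$(status_file_path /usr/local/nagios/etc/nagios.cfg)"
```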