NSCA checks

Posted: Thu Feb 25, 2016 7:56 am
by tejanagios
Hi,
Every time I hit Apply Configuration, the entire service details page reloads and the services change their state to pending. I don't know why this happens or where to look for the cause (screenshot attached).

Every time the services load, they trigger new notifications about failed or down services.

This started happening after I configured NSCA in Nagios for one of the hosts for testing purposes.

This is the definition of one of the services:

define service {
host_name qa-wts1.jlr.ktsecureqa.co.uk
service_description cpu
use xiwizard_passive_service
max_check_attempts 1
check_interval 1
retry_interval 1
check_period xi_timeperiod_24x7
notification_interval 60
notification_period xi_timeperiod_24x7
contacts teja
stalking_options n
_xiwizard passiveobject
register 1
}

Could you please tell me where to look or what to check to correct this?

Also, can I check the status of external commands with passive checks over NSCA, just as we do with NRPE?

Re: NSCA checks

Posted: Thu Feb 25, 2016 6:07 pm
by bwallace
Try checking in:

/usr/local/nagiosxi/var/cmdsubsys.log
- This file logs the process of any commands passed to the Nagios XI backend/subsystem through the cmdsubsys cron. This includes “Apply Configuration” or other Nagios XI specific commands. These commands are read by the cron from the nagiosxi postgres database table "xi_commands" -

/usr/local/nagios/var/nagios.log
- The Nagios Core log, includes checks, notifications, external commands, and events. This file is rotated daily into the /usr/local/nagios/var/archives folder (default setting in nagios.cfg) by rsyslog -

/var/log/httpd/error_log
- This is the Apache error log. Problems/bugs in the PHP will log here, as well as authentication problems or issues with broken URLs. As Nagios XI is a LAMP application, many issues will log here. It is always a good place to start when troubleshooting Nagios XI -


If in any doubt, feel free to post the logs here and we'd be happy to take a look. If you prefer to PM the logs to us, then you might as well generate a profile instead (reproduce the issue first):
In the XI web interface, go to Admin -> System Profile and click the blue "Download Profile" button.
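
Before posting, the three logs listed above can be skimmed for recent errors with a few lines of Python. This is only a rough sketch: the helper name `last_matching_lines` and the search word "error" are assumptions made for illustration, and the paths are simply the ones from this post, so adjust them if your install differs.

```python
# Log paths as listed above; unreadable or missing files are skipped quietly.
LOG_PATHS = [
    "/usr/local/nagiosxi/var/cmdsubsys.log",
    "/usr/local/nagios/var/nagios.log",
    "/var/log/httpd/error_log",
]


def last_matching_lines(path, needle, limit=5):
    """Return up to `limit` of the latest lines in `path` containing `needle` (case-insensitive)."""
    try:
        with open(path, errors="replace") as handle:
            hits = [line.rstrip("\n") for line in handle if needle.lower() in line.lower()]
    except OSError:
        # Missing file or no read permission: report nothing for this log.
        return []
    return hits[-limit:]


for path in LOG_PATHS:
    for line in last_matching_lines(path, "error"):
        print(f"{path}: {line}")
```

Run it right after reproducing the issue so the most recent matches relate to the Apply Configuration you just triggered.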

Re: NSCA checks

Posted: Thu Feb 25, 2016 6:10 pm
by lmiltchev
Is this:

stalking_options n
a typo?

It doesn't seem like a valid stalking option. See the usage below:
Service - stalking options

This directive determines which service states "stalking" is enabled for. Valid options are a combination of one or more of the following:
o = stalk on OK states,
w = stalk on WARNING states,
u = stalk on UNKNOWN states, and
c = stalk on CRITICAL states.
Have you modified the "xiwizard_passive_service" template in any way?
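
For reference, a value can be checked against the list quoted above with a few lines of Python. This is a minimal sketch: the function name `invalid_stalking_flags` is made up for illustration, and it only knows the four service flags listed in this post, not anything else a particular Nagios version may accept.

```python
# Flags taken from the list quoted above (o, w, u, c); nothing else is assumed valid here.
VALID_SERVICE_STALKING_FLAGS = {"o", "w", "u", "c"}


def invalid_stalking_flags(value):
    """Return the flags in a comma-separated stalking_options value that are not in the quoted list."""
    flags = [flag.strip() for flag in value.split(",") if flag.strip()]
    return [flag for flag in flags if flag not in VALID_SERVICE_STALKING_FLAGS]


print(invalid_stalking_flags("o,w,u,c"))  # all four listed flags
print(invalid_stalking_flags("n"))        # the value from the service definition in question
```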

Re: NSCA checks

Posted: Fri Feb 26, 2016 6:18 am
by tejanagios
Hi,

I haven't. These are the object definitions as they stand right now.

define service {
host_name qa-wts1.jlr.ktsecureqa.co.uk
service_description cpu
use xiwizard_passive_service
max_check_attempts 1
check_interval 1
retry_interval 1
check_period xi_timeperiod_24x7
notification_interval 60
notification_period xi_timeperiod_24x7
contacts teja
stalking_options n
_xiwizard passiveobject
register 1
}

## template

define service {
name xiwizard_passive_service
service_description Passive Service
use xiwizard_generic_service
check_command check_dummy!0!"No data received yet."
is_volatile 0
initial_state o
max_check_attempts 1
active_checks_enabled 0
passive_checks_enabled 1
flap_detection_enabled 0
stalking_options o,w,u,c
register 0
}

1. The stalking options are not being propagated correctly to the services. I don't know where the 'n' is coming from.
2. Why does the service have a variable definition here?

3. I tried to implement real-time event log monitoring using NSCA, but the blog doesn't show how to add the service once everything is configured. Once the settings are configured accordingly, does the service appear under the Unconfigured Objects tab in Nagios XI, so that I can add it from there?

Re: NSCA checks

Posted: Fri Feb 26, 2016 3:09 pm
by rkennedy
Can you also post the xiwizard_generic_service template? To answer 1 and 2, we need to find out where the n is coming from.
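
One way to trace where a directive comes from is to walk the `use` chain by hand or with a small script. The Python below is only a sketch under simplifying assumptions: the helpers `parse_services` and `find_source` are hypothetical names, the sample config is a trimmed stand-in for the definitions posted in this thread, and it ignores additive (`+`) values and other inheritance details.

```python
import re

# Matches "define service { ... }" blocks; directives are "key value" pairs, one per line.
BLOCK_RE = re.compile(r"define\s+service\s*\{(.*?)\}", re.DOTALL)


def parse_services(text):
    """Return one dict of directives per 'define service' block."""
    services = []
    for body in BLOCK_RE.findall(text):
        directives = {}
        for raw in body.splitlines():
            line = raw.split(";", 1)[0].strip()  # drop ';' comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            directives[parts[0]] = parts[1] if len(parts) > 1 else ""
        services.append(directives)
    return services


def find_source(templates, obj, directive, seen=None):
    """Return (origin, value) for the first object in the 'use' chain that sets the directive."""
    seen = set() if seen is None else seen
    if directive in obj:
        origin = obj.get("name") or obj.get("service_description", "<unnamed>")
        return origin, obj[directive]
    for parent in obj.get("use", "").split(","):
        parent = parent.strip()
        if parent in templates and parent not in seen:
            seen.add(parent)
            found = find_source(templates, templates[parent], directive, seen)
            if found is not None:
                return found
    return None


# Trimmed sample based on the definitions in this thread (host name is a placeholder).
SAMPLE = """
define service {
    host_name            example-host
    service_description  cpu
    use                  xiwizard_passive_service
    stalking_options     n
}
define service {
    name                 xiwizard_passive_service
    use                  xiwizard_generic_service
    stalking_options     o,w,u,c
    initial_state        o
    register             0
}
"""

services = parse_services(SAMPLE)
templates = {s["name"]: s for s in services if "name" in s}
cpu = services[0]
print(find_source(templates, cpu, "stalking_options"))
print(find_source(templates, cpu, "initial_state"))
```

In the sample the service sets stalking_options itself, so the lookup stops at the service and the template's o,w,u,c is never consulted; initial_state, which the service does not set, is found on the template.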

Re: NSCA checks

Posted: Mon Feb 29, 2016 8:08 am
by tejanagios
Here is the Template:

define service {
name generic-service
service_description checks windows updates
is_volatile 0
max_check_attempts 5
check_interval 5
retry_interval 1
active_checks_enabled 1
passive_checks_enabled 1
check_period 24x7
parallelize_check 1
obsess_over_service 0
check_freshness 0
event_handler_enabled 1
flap_detection_enabled 0
process_perf_data 1
retain_status_information 0
retain_nonstatus_information 0
notification_interval 60
first_notification_delay 0
notification_period 24x7
notification_options c,u,r,
notifications_enabled 1
contacts nagiosadmin
register 0

}

Re: NSCA checks

Posted: Mon Feb 29, 2016 2:30 pm
by rkennedy
Doesn't look like it's there, either. Can you post your objects.cache file? It's located at /usr/local/nagios/var/objects.cache.

Re: NSCA checks

Posted: Thu Mar 03, 2016 5:22 am
by tejanagios
Yes, it looks like the objects.cache file has this. The 'n' option is added to all the services, including those on localhost.


define service {
host_name localhost
service_description Total Processes
check_period 24x7
check_command check_local_procs!400!500!RSZDT
contacts nagiosadmin
notification_period 24x7
initial_state o
importance 0
check_interval 5.000000
retry_interval 1.000000
max_check_attempts 4
is_volatile 0
parallelize_check 1
active_checks_enabled 1
passive_checks_enabled 1
obsess 0
event_handler_enabled 1
low_flap_threshold 0.000000
high_flap_threshold 0.000000
flap_detection_enabled 0
flap_detection_options a
freshness_threshold 0
check_freshness 0
notification_options r,u,c
notifications_enabled 1
notification_interval 60.000000
first_notification_delay 0.000000
stalking_options n
process_perf_data 1
retain_status_information 0
retain_nonstatus_information 0
}
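
To confirm how widespread the value is, the stalking_options values in the cache file can be tallied. This is a rough sketch, not an official tool: `tally_directive` is a name invented here, and the inline sample stands in for the real contents of /usr/local/nagios/var/objects.cache.

```python
from collections import Counter


def tally_directive(cache_text, directive, object_type="service"):
    """Count the values of one directive across all 'define <object_type>' blocks."""
    counts = Counter()
    inside = False
    for raw in cache_text.splitlines():
        line = raw.strip()
        if line.startswith("define "):
            # Handles both "define service {" and "define service{"
            inside = line.split()[1].rstrip("{") == object_type
        elif line == "}":
            inside = False
        elif inside:
            parts = line.split(None, 1)
            if len(parts) == 2 and parts[0] == directive:
                counts[parts[1]] += 1
    return counts


# Stand-in for the file contents; read the real file with open(...).read() instead.
SAMPLE = """
define service {
    host_name            localhost
    service_description  Total Processes
    stalking_options     n
    }
define host {
    host_name            localhost
    }
"""

print(tally_directive(SAMPLE, "stalking_options"))
```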

Re: NSCA checks

Posted: Thu Mar 03, 2016 12:23 pm
by rkennedy
Can you PM your profile over? (Admin -> System Profile -> Download Profile)

I'd like to take a deeper look and see what is populating this.

EDIT: profile received.

Re: NSCA checks

Posted: Fri Mar 04, 2016 3:18 pm
by rkennedy
It is strange - I am seeing stalking_options n on all of your services.

Looking at your SQL log, I did notice -

160302 12:29:04 [ERROR] mysqld: Table './nagiosql/tbl_variabledefinition' is marked as crashed and should be repaired
160302 12:29:04 [Warning] Checking table:   './nagiosql/tbl_variabledefinition'
Can you try running this to fix it, and let us know the output? -

/usr/local/nagiosxi/scripts/repairmysql.sh nagios
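
If the SQL log is long, the crashed tables can be pulled out of it with a short script, which also makes it easy to see which database each one belongs to. This is a minimal sketch that assumes the log lines look like the excerpt above; `crashed_tables` is a hypothetical helper name, not part of Nagios XI.

```python
import re

# Matches the mysqld error format shown above: Table './<database>/<table>' is marked as crashed
CRASHED_RE = re.compile(r"Table '\./([^/']+)/([^']+)' is marked as crashed")


def crashed_tables(log_text):
    """Return a sorted list of unique (database, table) pairs reported as crashed."""
    return sorted(set(CRASHED_RE.findall(log_text)))


# The two log lines quoted in this post.
SAMPLE_LOG = """\
160302 12:29:04 [ERROR] mysqld: Table './nagiosql/tbl_variabledefinition' is marked as crashed and should be repaired
160302 12:29:04 [Warning] Checking table:   './nagiosql/tbl_variabledefinition'
"""

for database, table in crashed_tables(SAMPLE_LOG):
    print(f"database={database} table={table}")
```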