Re: Service check works in Core, but not XI
Posted: Tue Apr 29, 2014 5:41 pm
by abrist
Goofy behavior like this can sometimes be blamed on fork/orphan errors. Check the system messages for such issues:
tail -500 /var/log/messages | grep "fork\|orphan\|segfault"
Additionally, check the db for crashed tables:
Code:
tail -100 /var/log/mysqld.log | grep "crashed"
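If it helps, the two log checks above can be wrapped in one small helper that reports a count instead of raw lines, so "no results" shows up as an explicit 0. This is only a sketch: the function name count_matches is made up for illustration, and the log paths in the usage comments are the ones from the commands above and may differ on your system.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios XI): count how many of the last
# N lines of a log file match an extended regular expression.
# Usage: count_matches FILE PATTERN [LINES]
count_matches() {
    file=$1
    pattern=$2
    lines=${3:-500}   # default to the last 500 lines

    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 2
    fi

    # grep -c prints the number of matching lines (0 when nothing matches);
    # -E lets the pattern use | for alternation without backslashes.
    tail -n "$lines" "$file" | grep -c -E "$pattern"
}

# Example calls, mirroring the commands above:
#   count_matches /var/log/messages 'fork|orphan|segfault' 500
#   count_matches /var/log/mysqld.log 'crashed' 100
```

A count of 0 from both calls would match what you see when the original grep commands print nothing.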
Re: Service check works in Core, but not XI
Posted: Tue Apr 29, 2014 7:50 pm
by rseiwert
It looks like XI is truncating the string. It's too long. Remove some performance data and I bet it works.
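One way to test the "too long" theory is to measure how much perfdata the plugin actually returns. A rough sketch, assuming the usual plugin output layout of status text, then a "|", then perfdata; the function name perfdata_len is hypothetical, and it only looks at the first "|" in whatever string you pass it.

```shell
#!/bin/sh
# Hypothetical helper: print the byte length of the perfdata portion of a
# plugin's output, i.e. everything after the first "|".
# Usage: perfdata_len "PLUGIN OUTPUT"
perfdata_len() {
    output=$1
    case $output in
        *"|"*) perf=${output#*"|"} ;;   # strip everything up to the first |
        *)     perf= ;;                 # no | means no perfdata at all
    esac
    # wc -c counts bytes; tr removes the padding some wc versions add.
    printf '%s' "$perf" | wc -c | tr -d ' '
}

# Example: capture the output of the check when run by hand, then measure it.
#   out=$(your_check_command_here)
#   perfdata_len "$out"
```

Running it against the output from the working host and the failing host should show whether the failing one is significantly longer.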
Re: Service check works in Core, but not XI
Posted: Wed Apr 30, 2014 9:34 am
by lmiltchev
snapon_admin, can you give us an update on your issue? Did you check the system messages for fork/orphan errors as suggested by abrist?
Re: Service check works in Core, but not XI
Posted: Wed Apr 30, 2014 11:37 am
by snapon_admin
Neither of those commands returned any results. I also tried deleting and re-adding the service from scratch, even though this check works on every other host, just to be sure. I even set the process perf data option to off. There definitely would be more perf data on this host than on the others, but that doesn't explain why the 2 host checks with similar issues aren't working either.
Re: Service check works in Core, but not XI
Posted: Wed Apr 30, 2014 1:44 pm
by lmiltchev
"I also tried deleting and re-adding the service from scratch, even though this check works on every other host, just to be sure."
Can you show us the definition of the service that is not working and one that is working? Hide sensitive info.
Re: Service check works in Core, but not XI
Posted: Wed Apr 30, 2014 2:15 pm
by snapon_admin
Working:
Code:
define service {
host_name USSNAPMKEWI-Core
service_description Hardware health
check_command check_nwc_hardware_health!SNMPCOMMUNITYSTRING!!!!!!!
max_check_attempts 5
check_interval 10
retry_interval 2
active_checks_enabled 1
check_period xi_timeperiod_24x7
notification_period xi_timeperiod_24x7
notifications_enabled 0
contacts nagiosadmin
_xiwizard switch
register 1
}
Not working:
Code:
define service {
host_name USSNAPLSAIL-Core
service_description Hardware health
check_command check_nwc_hardware_health!SNMPCOMMUNITYSTRING!!!!!!!
initial_state o
max_check_attempts 5
check_interval 10
retry_interval 2
active_checks_enabled 1
passive_checks_enabled 1
check_period xi_timeperiod_24x7
event_handler xi_service_notification_handler
event_handler_enabled 1
low_flap_threshold 10
high_flap_threshold 40
flap_detection_enabled 1
process_perf_data 0
notification_interval 60
notification_period xi_timeperiod_24x7
notification_options w,c,r,
notifications_enabled 0
contacts nagiosadmin
_xiwizard switch
register 1
}
Re: Service check works in Core, but not XI
Posted: Wed Apr 30, 2014 4:41 pm
by abrist
There are a number of things disabled on this object: performance data, notifications, etc. When you re-created this check, did you do so through the copy mechanism, or did you re-create it by hand?
Re: Service check works in Core, but not XI
Posted: Thu May 01, 2014 10:41 am
by snapon_admin
I copied it from another service check using the check_nwc plugin, but I removed all templates and such and reconfigured pretty much every part of it. The reason that perfdata is disabled is that this particular check does have a ton of perfdata and I wanted to see if disabling it would correct the issue, which it didn't. I also disabled notifications in case something went wrong with adding the check; I didn't want to confuse my ops team with a false alert.
Re: Service check works in Core, but not XI
Posted: Thu May 01, 2014 12:57 pm
by scottwilkerson
Actually, if the check has a TON of perfdata, that could keep the result from making it into the DB, depending on your MySQL version (whether perfdata processing is disabled or not).
Here is a workaround: increase the perfdata column size in the nagios DB.
Code:
echo "ALTER TABLE nagios_servicestatus MODIFY perfdata VARCHAR(65536);"|mysql -pnagiosxi nagios
echo "ALTER TABLE nagios_servicechecks MODIFY perfdata VARCHAR(65536);"|mysql -pnagiosxi nagios
echo "ALTER TABLE nagios_hoststatus MODIFY perfdata VARCHAR(65536);"|mysql -pnagiosxi nagios
echo "ALTER TABLE nagios_hostchecks MODIFY perfdata VARCHAR(65536);"|mysql -pnagiosxi nagios
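To confirm what the perfdata columns look like before and after running the ALTER commands above, something like the following could be used. It is a sketch only: the function name perfdata_column_queries is made up, it just prints SHOW COLUMNS statements for the same four tables, and the mysql invocation in the comment assumes the same credentials and database name as the commands above.

```shell
#!/bin/sh
# Hypothetical helper: print one SHOW COLUMNS statement for each table
# touched by the workaround, so the column type can be inspected.
perfdata_column_queries() {
    for table in nagios_servicestatus nagios_servicechecks \
                 nagios_hoststatus nagios_hostchecks; do
        printf "SHOW COLUMNS FROM %s LIKE 'perfdata';\n" "$table"
    done
}

# Example (assumes the same password and DB name as the ALTER commands):
#   perfdata_column_queries | mysql -pnagiosxi nagios
```

The Type column in the result shows what the server actually stored for each table, which may not be exactly what was requested, depending on the MySQL version and SQL mode.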
Re: Service check works in Core, but not XI
Posted: Thu May 01, 2014 1:37 pm
by snapon_admin
Alright, I'll try that. Once I've made that change, should removing and re-adding the service check be all I need to do, or are there any other steps I need to take?