nagios restart affects nagvis
Posted: Mon Apr 18, 2016 11:54 am
by doneil326
Is there a way to restart/reload the Nagios configs without impacting NagVis?
Re: nagios restart affects nagvis
Posted: Mon Apr 18, 2016 12:37 pm
by tmcdonald
Can you please describe in more detail the impact you are seeing?
Re: nagios restart affects nagvis
Posted: Mon Apr 18, 2016 1:52 pm
by doneil326
Whenever you make changes to Nagios and have to "Apply Configuration", all of the NagVis dashboards go to the error "unable to retrieve data" while Nagios is reloading its configs. This takes quite some time in our environment: up to 60-90 seconds to recover. These dashboards are highly visible, so we would like to do whatever we can to minimize their downtime.
Re: nagios restart affects nagvis
Posted: Mon Apr 18, 2016 5:02 pm
by tmcdonald
This is going to be somewhat unavoidable for the most part. NagVis isn't our code so I'm not 100% well-versed in how it operates behind the scenes, but it relies on the nagios process running, and if that process is down, even for a restart, it is going to cause some intermittent issues like you have described. Unfortunately this would need to be an edit to NagVis (possibly adding some caching), which would need to be done by the authors.
60-90 seconds does seem a bit high though - how many hosts+services do you have on this machine?
Re: nagios restart affects nagvis
Posted: Mon Apr 18, 2016 5:14 pm
by doneil326
1250 hosts and approx 10000 services
Re: nagios restart affects nagvis
Posted: Mon Apr 18, 2016 10:08 pm
by Box293
Re: nagios restart affects nagvis
Posted: Thu May 05, 2016 12:50 pm
by doneil326
I wrote this basic script that seems to work with NagVis to avoid massive "perfdata missing" errors. My question is: is there a way to still return a result to a check that restarts Nagios, so that we know it completed successfully? It never gets the return 0 for success when it completes, since that comes after it runs the reconfigure script. (I have set this up as a check on the Nagios server so that other users can run it via the GUI.)
Code:
#!/usr/bin/env ruby
# Puts the NagVis maps into maintenance mode, runs reconfigure_nagios.sh,
# then takes the maps out of maintenance mode again.
require 'optparse'

# Parse the command-line options into a hash.
def self.parse(args)
  options = {}
  opt_parser = OptionParser.new do |opts|
    opts.banner = 'Usage: createImportcfg.rb [options] -m -r --help'
    # Mandatory arguments
    opts.separator ' '
    opts.separator 'Mandatory arguments:'
    opts.on('-m', '--map-path', OptionParser::REQUIRED_ARGUMENT, "Path to NagVis maps '/usr/local/nagvis/etc/maps'") do |m|
      # Expand to an absolute path so later Dir.chdir calls cannot break it
      options[:map_path] = File.expand_path(m.to_s)
    end
    opts.on('-r', '--reconfigure-nagios', OptionParser::REQUIRED_ARGUMENT, 'Path to the directory containing the reconfigure_nagios.sh script') do |r|
      options[:reconfigure_script] = File.expand_path(r.to_s)
    end
    opts.on('--help') do
      puts opts
      exit
    end
    if args.empty?
      puts opts
      exit
    end
  end
  opt_parser.parse!(args)
  options
end

# Replace every occurrence of a with b in each regular file of the current directory.
def self.updateFiles(a, b)
  Dir.foreach(Dir.pwd) do |file|
    next if File.directory? file
    f_current = File.read(file)
    f_updated = f_current.gsub(a, b)
    File.open(file, 'w') do |f|
      f.write(f_updated)
    end
  end
end

# Run the maintenance/reconfigure cycle and return a Nagios plugin status code.
def self.maintenanceMode
  options = parse(ARGV)
  # Check that the map location and script directory exist (to_s guards against a missing option)
  if File.directory?(options[:map_path].to_s) && File.directory?(options[:reconfigure_script].to_s)
    # Put the maps in maintenance mode
    Dir.chdir(options[:map_path])
    updateFiles("in_maintenance=0", "in_maintenance=1")
    # Sleep for 15 seconds for maintenance popup to load
    sleep(15)
    begin
      Dir.chdir(options[:reconfigure_script])
      `#{options[:reconfigure_script]}/reconfigure_nagios.sh`
      # Backticks do not raise on a non-zero exit status, so check it explicitly
      raise 'reconfigure_nagios.sh exited with a non-zero status' unless $?.success?
    rescue
      Dir.chdir(options[:map_path])
      updateFiles("in_maintenance=1", "in_maintenance=0")
      puts "Warning: Unable to execute #{options[:reconfigure_script]}/reconfigure_nagios.sh, rolling back maintenance."
      return 1
    end
    # Sleep for 60 seconds for nagios to finish reconfiguration
    sleep(60)
    # Remove map from maintenanceMode
    Dir.chdir(options[:map_path])
    updateFiles("in_maintenance=1", "in_maintenance=0")
    # Report success status
    puts "SUCCESS: Reconfigured nagios and entered/exited maintenance mode successfully."
    return 0
  else
    puts 'UNKNOWN: The paths passed as parameters do not exist, cannot reload nagios in maintenance mode.'
    # 3 is the Nagios plugin status for UNKNOWN
    return 3
  end
end

# Pass the returned status to exit; a bare method call would always exit with 0
exit maintenanceMode
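The fixed sleep(60) in the script above assumes the reconfigure always finishes within a minute. One possible alternative, offered only as a sketch, is to poll a readiness condition with a timeout. Both `wait_until` and `status_file_fresh?` are hypothetical helpers that do not appear in the original script, and which file (if any) reliably signals that Nagios is back up is an assumption that would need checking on the actual install.

```ruby
# Poll a readiness check instead of sleeping for a fixed time.
# Returns true as soon as the block returns a truthy value,
# or false once `timeout` seconds have elapsed.
def wait_until(timeout:, interval: 1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep(interval)
  end
end

# Hypothetical readiness check (an assumption, not from the thread):
# the given file exists and was modified at or after `started_at`.
def status_file_fresh?(path, started_at)
  File.exist?(path) && File.mtime(path) >= started_at
end
```

With helpers like these, the script could record `started_at = Time.now` before reconfiguring and then call `wait_until(timeout: 120, interval: 5) { status_file_fresh?(path, started_at) }`, leaving maintenance mode as soon as the check passes rather than after a fixed delay.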
Re: nagios restart affects nagvis
Posted: Thu May 05, 2016 4:57 pm
by tmcdonald
We're not really a Ruby shop so debugging that script isn't something we can really assist with. When you say you want to "return a result to a check that restarts nagios" what do you mean exactly? Plugins should never be restarting the nagios process.
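For context on how a plugin reports a result: Nagios takes it from the process exit status (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) together with the text the plugin prints. In Ruby, that status is set with `exit`; a value returned from a method does not become the exit status by itself. The sketch below illustrates this under those conventions. `STATUS`, `plugin_result`, and `finish` are hypothetical names chosen for illustration.

```ruby
# Standard Nagios plugin exit statuses
STATUS = { ok: 0, warning: 1, critical: 2, unknown: 3 }.freeze

# Build the plugin's one-line output and the matching exit status.
def plugin_result(state, message)
  ["#{state.to_s.upcase}: #{message}", STATUS.fetch(state)]
end

# Print the output line and terminate the process with the matching status.
def finish(state, message)
  line, code = plugin_result(state, message)
  puts line
  exit code
end
```

A script would end with a call such as `finish(:ok, 'Reconfigured nagios')`, so the status reaches whatever launched it.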
Re: nagios restart affects nagvis
Posted: Thu May 05, 2016 5:07 pm
by doneil326
It is a workaround to replace the "Apply Configuration" button in the application, as there is no native support for NagVis maintenance mode that I have found.
Re: nagios restart affects nagvis
Posted: Fri May 06, 2016 9:22 am
by tmcdonald
I'm wondering what you mean by "return a result to a check that restarts nagios" though. Are you trying to detect when the reconfigure_nagios.sh script finishes? Or are you looking to have a service you can run that will kick off your script? Something else?
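If the goal is the first of these, detecting when a script such as reconfigure_nagios.sh finishes and whether it succeeded, one way to do that from Ruby is to wait on the child process and inspect its exit status. The sketch below uses `Open3.capture2e` from the standard library. `run_and_report` is a hypothetical helper, and it only helps if the calling process itself survives the Nagios restart, which is an assumption.

```ruby
require 'open3'

# Run a command and wait for it to finish.
# Returns [combined stdout/stderr, true or false for success].
# A command that cannot be started is reported as a failure.
def run_and_report(*cmd)
  output, status = Open3.capture2e(*cmd)
  [output, status.success?]
rescue SystemCallError => e
  [e.message, false]
end
```

The script path would be passed as the argument, and the boolean could then decide which status line and exit code to report.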