nagios restart affects nagvis
Is there a way to restart/reload the Nagios configuration without impacting NagVis?
Re: nagios restart affects nagvis
Can you please describe in more detail the impact you are seeing?
Former Nagios employee
Re: nagios restart affects nagvis
Whenever you make changes to Nagios and have to "Apply Configuration", all of the NagVis dashboards show the error "unable to retrieve data" while Nagios is reloading its configs. This takes quite some time in our environment: up to 60-90 seconds to recover. These dashboards are highly visible, so we would like to do whatever we can to minimize their downtime.
Re: nagios restart affects nagvis
This is going to be largely unavoidable. NagVis isn't our code, so I'm not 100% well-versed in how it operates behind the scenes, but it relies on the nagios process running, and if that process is down, even for a restart, it will cause intermittent issues like the ones you have described. Unfortunately, fixing this would require a change to NagVis itself (possibly adding some caching), which would need to be done by its authors.
60-90 seconds does seem a bit high though - how many hosts+services do you have on this machine?
Re: nagios restart affects nagvis
1,250 hosts and approximately 10,000 services.
- Box293
Re: nagios restart affects nagvis
Do you have a ramdisk enabled as per:
https://support.nagios.com/kb/article.php?id=288
Re: nagios restart affects nagvis
I wrote this basic script, which seems to work with NagVis and avoids the flood of missing-perfdata errors. My question is: is there a way to still return a result to a check that restarts Nagios, so that we know it completed successfully? The check never receives the return code 0 when it completes, because that happens after the reconfigure script has run. (I have set this up as a check on the Nagios server so that other users can run it via the GUI.)
Code: Select all
#!/usr/bin/env ruby
require 'optparse'
require 'fileutils'

# Parse the command-line options into a hash.
def self.parse(args)
  options = {}
  options[:port] = 5693
  opt_parser = OptionParser.new do |opts|
    opts.banner = 'Usage: createImportcfg.rb [options] -m -r --help'
    # Mandatory arguments
    opts.separator ' '
    opts.separator 'Mandatory arguments:'
    opts.on('-m', '--map-path', OptionParser::REQUIRED_ARGUMENT, "Path to NagVis maps '/usr/local/nagvis/etc/maps'") do |m|
      # Store absolute paths so the later Dir.chdir calls cannot break them
      options[:map_path] = File.expand_path(m.to_s)
    end
    opts.on('-r', '--reconfigure-nagios', OptionParser::REQUIRED_ARGUMENT, 'Path to reconfigure_nagios.sh script') do |r|
      options[:reconfigure_script] = File.expand_path(r.to_s)
    end
    opts.on('--help') do
      puts opts
      exit
    end
    if ARGV.empty?
      puts opts
      exit
    end
  end
  opt_parser.parse!(args)
  options
end

# Replace every occurrence of a with b in each regular file of the current directory.
def self.updateFiles(a, b)
  Dir.open(Dir.pwd).each do |file|
    next if File.directory? file
    f_current = File.read(file)
    f_updated = f_current.gsub(a, b)
    File.open(file, 'w') do |f|
      f.write(f_updated)
    end
  end
end

# Returns a Nagios plugin exit code: 0 = OK, 1 = WARNING, 3 = UNKNOWN.
def self.maintenanceMode
  options = parse(ARGV)
  # Check if map location exists and put it in maintenanceMode
  # (to_s guards against a missing -m or -r option, which would leave nil here)
  if File.directory?(options[:map_path].to_s) && File.directory?(options[:reconfigure_script].to_s)
    Dir.chdir(options[:map_path])
    updateFiles("in_maintenance=0", "in_maintenance=1")
    # Sleep for 15 seconds for maintenance popup to load
    sleep(15)
    Dir.chdir(options[:reconfigure_script])
    # system returns true only if the script ran and exited with status 0;
    # backticks would not raise on a non-zero exit, so the failure went unnoticed
    unless system("#{options[:reconfigure_script]}/reconfigure_nagios.sh")
      Dir.chdir(options[:map_path])
      updateFiles("in_maintenance=1", "in_maintenance=0")
      puts "Warning: Unable to execute #{options[:reconfigure_script]}/reconfigure_nagios.sh, rolling back maintenance."
      return 1
    end
    # Sleep for 60 seconds for nagios to finish reconfiguration
    sleep(60)
    # Remove map from maintenanceMode
    Dir.chdir(options[:map_path])
    updateFiles("in_maintenance=1", "in_maintenance=0")
    # Report success status
    puts "SUCCESS: Reconfigured nagios and entered/exited maintenance mode successfully."
    return 0
  else
    puts 'UNKNOWN: The paths passed as parameters do not exist, cannot reload nagios in maintenance mode.'
    return 3
  end
end

# Pass the result on as the process exit status; a bare call always exits 0.
exit maintenanceMode
Last edited by tmcdonald on Fri May 06, 2016 9:18 am, edited 1 time in total.
Reason: Please use [code][/code] tags around code output
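The fixed `sleep(60)` in the script above is a guess at how long the reload takes. One alternative, sketched here under stated assumptions, is to poll until Nagios looks healthy again and only then leave maintenance mode. The helper names `wait_until` and `status_file_fresh?` are hypothetical and not part of the original script, and using the status file's modification time as a readiness signal is an assumption, not something confirmed in this thread.

```ruby
# Hypothetical helper: poll a condition until it holds or a timeout expires.
# Returns true if the block returned a truthy value in time, false otherwise.
def wait_until(timeout:, interval: 1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep(interval)
  end
end

# Assumed readiness test: the status file exists and was rewritten after the
# reload started. Which file to watch is an assumption; adjust for your install.
def status_file_fresh?(path, since)
  File.exist?(path) && File.mtime(path) > since
end
```

In the script, this could replace the fixed sleep with something like `wait_until(timeout: 120, interval: 5) { status_file_fresh?(status_file, started) }`, where `status_file` is the path to your Nagios status file and `started` is a `Time.now` taken just before the reload. Both variable names are placeholders.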
Re: nagios restart affects nagvis
We're not really a Ruby shop so debugging that script isn't something we can really assist with. When you say you want to "return a result to a check that restarts nagios" what do you mean exactly? Plugins should never be restarting the nagios process.
Re: nagios restart affects nagvis
It is a workaround to replace the "Apply Configuration" button in the application, as there is no native support for NagVis maintenance mode that I have found.
Re: nagios restart affects nagvis
I'm wondering what you mean by "return a result to a check that restarts nagios" though. Are you trying to detect when the reconfigure_nagios.sh script finishes? Or are you looking to have a service you can run that will kick off your script? Something else?
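If the goal is the first of these (detecting when the script finishes and reporting how it went), a minimal Ruby sketch might look like the following. It assumes the usual Nagios plugin exit-code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN); the helper name `run_and_report` and the `EXIT_CODES` hash are hypothetical and do not come from the script posted earlier.

```ruby
# Nagios plugin exit codes (standard plugin convention).
EXIT_CODES = { ok: 0, warning: 1, critical: 2, unknown: 3 }.freeze

# Hypothetical helper: run a command, wait for it to finish, and translate the
# outcome into an [exit_code, message] pair suitable for a Nagios-style check.
def run_and_report(*command)
  # Kernel#system blocks until the command ends and returns
  # true (exit status 0), false (non-zero status) or nil (could not be run).
  case system(*command)
  when true
    [EXIT_CODES[:ok], "OK: #{command.first} finished successfully"]
  when false
    [EXIT_CODES[:critical], "CRITICAL: #{command.first} exited with status #{$?.exitstatus}"]
  else
    [EXIT_CODES[:unknown], "UNKNOWN: could not execute #{command.first}"]
  end
end
```

A caller would do something like `code, message = run_and_report(script_path)`, then `puts message` and `exit code`, where `script_path` is a placeholder for the full path to reconfigure_nagios.sh. Note that if the check is launched by the same nagios process that gets restarted, the result may still be lost; that depends on the setup and is not something this sketch addresses.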