Our Nagios Core installation seems to have problems retaining custom variable values across daemon restarts.
The SNMP interface indices on our firewalls can change on active<->standby state transitions. We have therefore created a script that periodically fetches the indices from the firewall and updates the corresponding custom variables in Nagios, using the CHANGE_CUSTOM_HOST_VAR external command. The custom variable name is _wan_interface_index. Please see the relevant configuration snippets below.
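For readers who want to reproduce the update step, here is a minimal sketch of how such a command line could be built and submitted. It is not our actual script. The function names and the command file path are hypothetical, and the exact spelling of the variable name that Nagios expects (with or without the leading underscore) is an assumption that should be checked against the external command documentation.

```python
import time


def format_change_custom_host_var(host_name, var_name, value, timestamp=None):
    """Build one external command line of the form
    [timestamp] CHANGE_CUSTOM_HOST_VAR;<host_name>;<varname>;<varvalue>
    """
    if timestamp is None:
        timestamp = int(time.time())
    return "[{}] CHANGE_CUSTOM_HOST_VAR;{};{};{}\n".format(
        timestamp, host_name, var_name, value
    )


def submit_command(line, command_file):
    """Write one command line to the Nagios external command file.

    command_file is the path configured as command_file in the main
    Nagios configuration; the path used in the example below is an
    assumption and not taken from our setup.
    """
    with open(command_file, "w") as pipe:
        pipe.write(line)


# Example call (hypothetical path and variable spelling):
# submit_command(
#     format_change_custom_host_var("asa-blava-fw1", "WAN_INTERFACE_INDEX", 3),
#     "/var/lib/nagios/rw/nagios.cmd",
# )
```

Keeping the formatting separate from the write makes it easy to log or inspect the exact line before it reaches the command pipe.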
Nagios configuration:
retain_state_information=1
state_retention_file=/var/lib/nagios/retention.dat
retention_update_interval=15
use_retained_program_state=1
retained_host_attribute_mask=0
retained_service_attribute_mask=0
retained_process_host_attribute_mask=0
retained_process_service_attribute_mask=0
object_cache_file=/var/lib/nagios/objects.cache
status_file=/var/lib/nagios/ramdisk/status.dat
status_update_interval=10
Object configuration:
#
# host templates
#
define host {
name firewall
register 0
_wan_interface_index TBD
}
define host {
name cisco-asa-5500-x
register 0
use firewall
[custom variable definitions omitted]
}
define host {
name asa-5515-x
register 0
use cisco-asa-5500-x
[custom variable definitions omitted]
}
#
# host definition
#
define host {
host_name asa-blava-fw1
use asa-5515-x
alias Cisco ASA 5515-X, ...
address <snip>
parents mpls-ce-blava
hostgroups asa,infrastructure,snmp-enabled,Blava
_connection_curr_thresholds -w 4000 -c 8000
_temp_value_chassis_thresholds -w50,50,50 -c60,60,60
_wan_interface_max_speed 100
_wan_interface_name outside
}
We have checked the contents of the retention.dat file right after stopping Nagios. Everything seems to be correct:
host {
host_name=asa-blava-fw1
modified_attributes=32768
...
_WAN_INTERFACE_INDEX=1;3
}
The modified_attributes value matches MODATTR_CUSTOM_VARIABLE in the Nagios source:
# grep 32768 nagios-4.4.5/include/common.h
#define MODATTR_CUSTOM_VARIABLE 32768
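As a quick sanity check, the bit test can be written out as below. This is only a sketch: the constant is the one from common.h shown above, and the helper name is made up for illustration.

```python
# Value taken from include/common.h in nagios-4.4.5, as shown in the grep output.
MODATTR_CUSTOM_VARIABLE = 32768


def custom_vars_modified(modified_attributes):
    """Return True if the custom-variable bit is set in modified_attributes."""
    return bool(modified_attributes & MODATTR_CUSTOM_VARIABLE)


# The retention.dat entry for asa-blava-fw1 has modified_attributes=32768,
# so only the custom-variable bit is set there.
print(custom_vars_modified(32768))
```

Since 32768 is a single bit (1 << 15), no other attribute is flagged as modified for this host.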
If I run Nagios under strace(1), I can see that the retention.dat file contents get mmap(2)ed correctly:
open("/var/lib/nagios/retention.dat", O_RDONLY) = 12
fstat(12, {st_mode=S_IFREG|0600, st_size=2524802, ...}) = 0
mmap(NULL, 2524802, PROT_READ, MAP_PRIVATE, 12, 0) = 0x7fb573d7a000
munmap(0x7fb573d7a000, 2524802) = 0
close(12) = 0
One more thing to note: in the object cache file (objects.cache in our case), the values of the custom variables never change from the initial "TBD", not even after being updated by the external script:
$ grep -isr _wan_interface_index *
objects.cache: _WAN_INTERFACE_INDEX TBD
objects.cache: _WAN_INTERFACE_INDEX TBD
objects.cache: _WAN_INTERFACE_INDEX TBD
objects.cache: _WAN_INTERFACE_INDEX TBD
objects.cache: _WAN_INTERFACE_INDEX TBD
objects.cache: _WAN_INTERFACE_INDEX TBD
objects.cache: _WAN_INTERFACE_INDEX TBD
ramdisk/status.dat: _WAN_INTERFACE_INDEX=1;3
ramdisk/status.dat: _WAN_INTERFACE_INDEX=1;3
ramdisk/status.dat: _WAN_INTERFACE_INDEX=1;2
ramdisk/status.dat: _WAN_INTERFACE_INDEX=1;17
ramdisk/status.dat: _WAN_INTERFACE_INDEX=0;TBD
ramdisk/status.dat: _WAN_INTERFACE_INDEX=1;10
ramdisk/status.dat: _WAN_INTERFACE_INDEX=0;TBD
retention.dat:_WAN_INTERFACE_INDEX=1;3
retention.dat:_WAN_INTERFACE_INDEX=1;3
retention.dat:_WAN_INTERFACE_INDEX=1;2
retention.dat:_WAN_INTERFACE_INDEX=1;17
retention.dat:_WAN_INTERFACE_INDEX=0;TBD
retention.dat:_WAN_INTERFACE_INDEX=1;10
retention.dat:_WAN_INTERFACE_INDEX=0;TBD
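To compare the files per host instead of by grep, something like the following sketch could be used. It assumes that host entries appear as "host { ... }" blocks in retention.dat and "hoststatus { ... }" blocks in status.dat, and that the number before the semicolon is a "has been modified" flag; both are my reading of the files, and the function name is hypothetical.

```python
import re

# Matches a block opener such as "host {" or "hoststatus {".
BLOCK_START = re.compile(r"^(\w+)\s*\{$")


def parse_custom_vars(text, block_types=("host", "hoststatus")):
    """Return {host_name: {VAR_NAME: (modified_flag, value)}} for host blocks."""
    result = {}
    current = None  # state for the block being read, or None outside a block
    for raw in text.splitlines():
        line = raw.strip()
        match = BLOCK_START.match(line)
        if match:
            current = {"type": match.group(1), "host_name": None, "vars": {}}
            continue
        if line == "}":
            if current and current["type"] in block_types and current["host_name"]:
                result[current["host_name"]] = current["vars"]
            current = None
            continue
        if current is None or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key == "host_name":
            current["host_name"] = value
        elif key.startswith("_"):
            # Custom variables are stored as _NAME=<flag>;<value>
            flag, _, var_value = value.partition(";")
            current["vars"][key] = (flag == "1", var_value)
    return result


# Example use (path as in our configuration):
# with open("/var/lib/nagios/retention.dat") as fh:
#     print(parse_custom_vars(fh.read()).get("asa-blava-fw1"))
```

Running this against retention.dat before the restart and against status.dat after it would show exactly which hosts lose their updated value.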
Rostislav