Not saving state retention data on shutdown or restart
Posted: Tue Jan 28, 2020 4:21 pm
Nagios Core 4.3.4 on Raspbian GNU/Linux 10 (buster)
I have state retention enabled:
pi@tim-rpi3:/etc/nagios4 $ grep retention nagios.cfg | grep -v "^#"
state_retention_file=/var/lib/nagios4/retention.dat
retention_update_interval=60
And the state is saved every 60 minutes, as expected.
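Side note on the check above: that grep only matches directives whose name contains "retention". The switch that enables the feature, retain_state_information, and the use_retained_* options would not show up in it. A minimal sketch of a broader check follows (the function name show_retention_settings is only an illustration, not a Nagios tool):

```shell
#!/bin/sh
# List every uncommented directive in a Nagios main config whose name
# contains "retain" or "retention" (e.g. retain_state_information,
# use_retained_program_state, state_retention_file, retention_update_interval).
# Commented-out lines start with "#" and therefore never match.
show_retention_settings() {
    grep -E '^[[:space:]]*[a-z_]*(retain|retention)' "$1"
}

# Usage on this system (path taken from the grep above):
#   show_retention_settings /etc/nagios4/nagios.cfg
```

If retain_state_information is missing from the output or set to 0, that would be the first thing to rule out before looking at the shutdown path.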
According to docs, it should also save before shutting down:
"This is the file that Nagios will use for storing status, downtime, and comment information before it shuts down."
But on my system it is not saving on shutdown, so if a state has changed since the last auto-save, the state is incorrect after restart.
However, it does save when I reload the service.
Example: stop the exim4 service, restart Nagios and check status:
Code: Select all
pi@tim-rpi3:~ $ date; sudo systemctl stop exim4
Wed 29 Jan 07:12:03 AEDT 2020
[rescheduled exim4 service check from web UI]
[status is critical, last check 2020-01-29 07:12:25]
pi@tim-rpi3:~ $ date; sudo ls -l /var/lib/nagios4/retention.dat
Wed 29 Jan 07:16:06 AEDT 2020
-rw------- 1 nagios nagios 42093 Jan 29 07:07 /var/lib/nagios4/retention.dat
pi@tim-rpi3:~ $ date; sudo systemctl stop nagios4
Wed 29 Jan 07:16:55 AEDT 2020
pi@tim-rpi3:~ $ date; sudo ls -l /var/lib/nagios4/retention.dat
Wed 29 Jan 07:18:14 AEDT 2020
-rw------- 1 nagios nagios 42093 Jan 29 07:07 /var/lib/nagios4/retention.dat
[has not been updated]
pi@tim-rpi3:~ $ date; sudo systemctl start nagios4
Wed 29 Jan 07:18:31 AEDT 2020
[service status reverted back to OK, last check 2020-01-29 06:29:13]
pi@tim-rpi3:~ $ date; sudo systemctl reload nagios4
Wed 29 Jan 07:21:07 AEDT 2020
pi@tim-rpi3:~ $ date; sudo ls -l /var/lib/nagios4/retention.dat
Wed 29 Jan 07:21:28 AEDT 2020
-rw------- 1 nagios nagios 42093 Jan 29 07:21 /var/lib/nagios4/retention.dat
[ has now been updated ]
Corresponding log file (timestamps converted to "nice" format):
Code: Select all
-> retention data is being saved every 60 minutes, as per config
[Wed Jan 29 00:07:17 2020] Auto-save of retention data completed successfully.
[Wed Jan 29 01:07:17 2020] Auto-save of retention data completed successfully.
[Wed Jan 29 02:07:17 2020] Auto-save of retention data completed successfully.
[Wed Jan 29 03:07:17 2020] Auto-save of retention data completed successfully.
[Wed Jan 29 04:07:17 2020] Auto-save of retention data completed successfully.
[Wed Jan 29 05:07:17 2020] Auto-save of retention data completed successfully.
[Wed Jan 29 06:07:17 2020] Auto-save of retention data completed successfully.
[Wed Jan 29 07:07:17 2020] Auto-save of retention data completed successfully.
-> stop exim4 process and reschedule check
[Wed Jan 29 07:12:25 2020] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;exim4 MTA;1580242342
[Wed Jan 29 07:12:25 2020] SERVICE ALERT: localhost;exim4 MTA;CRITICAL;SOFT;1;Active: inactive (dead) since Wed 2020-01-29 07:11:16 AEDT: 1min 9s ago
[Wed Jan 29 07:14:25 2020] SERVICE ALERT: localhost;exim4 MTA;CRITICAL;SOFT;2;Active: inactive (dead) since Wed 2020-01-29 07:11:16 AEDT: 3min 9s ago
[Wed Jan 29 07:16:25 2020] SERVICE ALERT: localhost;exim4 MTA;CRITICAL;HARD;3;Active: inactive (dead) since Wed 2020-01-29 07:11:16 AEDT: 5min ago
[Wed Jan 29 07:16:25 2020] SERVICE NOTIFICATION: emailtim;localhost;exim4 MTA;CRITICAL;mynotify-service-by-email;Active: inactive (dead) since Wed 2020-01-29 07:11:16 AEDT: 5min ago
-> stop, then start nagios4 service
[Wed Jan 29 07:17:11 2020] wproc: Socket to worker Core Worker 5861 broken, removing
[Wed Jan 29 07:18:31 2020] Nagios 4.3.4 starting... (PID=1624)
[Wed Jan 29 07:18:31 2020] Local time is Wed Jan 29 07:18:31 AEDT 2020
[Wed Jan 29 07:18:31 2020] LOG VERSION: 2.0
[Wed Jan 29 07:18:31 2020] qh: Socket '/var/lib/nagios4/rw/nagios.qh' successfully initialized
[Wed Jan 29 07:18:31 2020] qh: core query handler registered
[Wed Jan 29 07:18:31 2020] nerd: Channel hostchecks registered successfully
[Wed Jan 29 07:18:31 2020] nerd: Channel servicechecks registered successfully
[Wed Jan 29 07:18:31 2020] nerd: Channel opathchecks registered successfully
[Wed Jan 29 07:18:31 2020] nerd: Fully initialized and ready to rock!
[Wed Jan 29 07:18:31 2020] wproc: Successfully registered manager as @wproc with query handler
[Wed Jan 29 07:18:31 2020] wproc: Registry request: name=Core Worker 1625;pid=1625
[Wed Jan 29 07:18:31 2020] wproc: Registry request: name=Core Worker 1626;pid=1626
[Wed Jan 29 07:18:31 2020] wproc: Registry request: name=Core Worker 1629;pid=1629
[Wed Jan 29 07:18:31 2020] wproc: Registry request: name=Core Worker 1627;pid=1627
[Wed Jan 29 07:18:31 2020] wproc: Registry request: name=Core Worker 1628;pid=1628
[Wed Jan 29 07:18:31 2020] wproc: Registry request: name=Core Worker 1630;pid=1630
[Wed Jan 29 07:18:31 2020] Successfully launched command file worker with pid 1634
-> reload nagios4 service
[Wed Jan 29 07:21:07 2020] Caught SIGHUP, restarting...
[Wed Jan 29 07:21:07 2020] Event broker module 'NERD' deinitialized successfully.
[Wed Jan 29 07:21:07 2020] Nagios 4.3.4 starting... (PID=1800)
[Wed Jan 29 07:21:07 2020] Local time is Wed Jan 29 07:21:07 AEDT 2020
[Wed Jan 29 07:21:07 2020] LOG VERSION: 2.0
[Wed Jan 29 07:21:07 2020] qh: Socket '/var/lib/nagios4/rw/nagios.qh' successfully initialized
[Wed Jan 29 07:21:07 2020] qh: core query handler registered
[Wed Jan 29 07:21:07 2020] nerd: Channel hostchecks registered successfully
[Wed Jan 29 07:21:07 2020] nerd: Channel servicechecks registered successfully
[Wed Jan 29 07:21:07 2020] nerd: Channel opathchecks registered successfully
[Wed Jan 29 07:21:07 2020] nerd: Fully initialized and ready to rock!
[Wed Jan 29 07:21:07 2020] wproc: Successfully registered manager as @wproc with query handler
[Wed Jan 29 07:21:07 2020] wproc: Registry request: name=Core Worker 1864;pid=1864
[Wed Jan 29 07:21:07 2020] wproc: Registry request: name=Core Worker 1867;pid=1867
[Wed Jan 29 07:21:07 2020] wproc: Registry request: name=Core Worker 1865;pid=1865
[Wed Jan 29 07:21:07 2020] wproc: Registry request: name=Core Worker 1868;pid=1868
[Wed Jan 29 07:21:07 2020] wproc: Registry request: name=Core Worker 1866;pid=1866
[Wed Jan 29 07:21:07 2020] wproc: Registry request: name=Core Worker 1869;pid=1869
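
For anyone comparing against their own log: the raw Nagios log starts each line with an epoch timestamp in square brackets, so the readable timestamps above need a conversion step. A rough sketch of one way to do it follows, assuming GNU date (for the -d @epoch form). The function name nice_timestamps and the log path in the usage comment are assumptions, not taken from the Nagios docs.

```shell
#!/bin/sh
# Rewrite the leading "[epoch]" of each log line as a readable local time.
# Lines that do not start with "[digits]" are passed through unchanged.
nice_timestamps() {
    while IFS= read -r nt_line; do
        case $nt_line in
            \[[0-9]*\]*)
                nt_epoch=${nt_line#\[}        # drop the leading "["
                nt_epoch=${nt_epoch%%\]*}     # keep the text before the first "]"
                nt_rest=${nt_line#*\]}        # everything after the first "]"
                case $nt_epoch in
                    *[!0-9]*)                 # bracket content is not purely numeric
                        printf '%s\n' "$nt_line" ;;
                    *)
                        printf '[%s]%s\n' \
                            "$(date -d "@$nt_epoch" '+%a %b %e %T %Y')" "$nt_rest" ;;
                esac ;;
            *)
                printf '%s\n' "$nt_line" ;;
        esac
    done
}

# Usage (log path is an assumption for the Debian/Raspbian nagios4 package):
#   sudo cat /var/log/nagios4/nagios.log | nice_timestamps
```

The output uses the local time zone of the machine running it, which is why the entries above are in AEDT.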