Not saving state retention data on shutdown or restart

thebream
Posts: 4
Joined: Tue Jan 28, 2020 3:50 pm
Location: Newcastle, NSW, Australia

Post by thebream »

Nagios Core 4.3.4 on Raspbian GNU/Linux 10 (buster)

I have state retention enabled:
pi@tim-rpi3:/etc/nagios4 $ grep retention nagios.cfg | grep -v "^#"
state_retention_file=/var/lib/nagios4/retention.dat
retention_update_interval=60


And the state is saved every 60 minutes, as expected.
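
Note that `grep retention` only matches directive names containing that exact word, so it would not show `retain_state_information` or the `use_retained_*` options even if they are set. The following is a minimal sketch of a broader check, not an official tool. The function name `show_retention_settings` is made up for illustration, and the default path is the one from the post above.

```shell
#!/bin/sh
# Hypothetical helper: print every active (uncommented) retention-related
# directive from a Nagios main config file.
# $1: path to nagios.cfg (defaults to the path used in this thread).
show_retention_settings() {
    cfg="${1:-/etc/nagios4/nagios.cfg}"
    # Match directives such as retain_state_information,
    # use_retained_program_state, state_retention_file and
    # retention_update_interval; commented-out lines are skipped
    # because the pattern is anchored at the start of the line.
    grep -E '^[[:space:]]*(retain|use_retained|state_retention|retention_update)' "$cfg"
}
```

Calling `show_retention_settings` with no argument reads `/etc/nagios4/nagios.cfg`; pass another path if the config lives elsewhere.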

According to the docs, it should also save before shutting down:
"This is the file that Nagios will use for storing status, downtime, and comment information before it shuts down."
But on my system it is not saving on shutdown, so if a state has changed since the last auto-save, the state is incorrect after a restart.

However, it does save when I reload the service.

Example: stop the exim4 service, restart Nagios, and check the status:

pi@tim-rpi3:~ $ date; sudo systemctl stop exim4
Wed 29 Jan 07:12:03 AEDT 2020

[rescheduled exim4 service check from web UI]
[status is critical, last check 2020-01-29 07:12:25]

pi@tim-rpi3:~ $ date; sudo ls -l /var/lib/nagios4/retention.dat
Wed 29 Jan 07:16:06 AEDT 2020
-rw------- 1 nagios nagios 42093 Jan 29 07:07 /var/lib/nagios4/retention.dat

pi@tim-rpi3:~ $ date; sudo systemctl stop nagios4
Wed 29 Jan 07:16:55 AEDT 2020

pi@tim-rpi3:~ $ date; sudo ls -l /var/lib/nagios4/retention.dat
Wed 29 Jan 07:18:14 AEDT 2020
-rw------- 1 nagios nagios 42093 Jan 29 07:07 /var/lib/nagios4/retention.dat
[has not been updated]

pi@tim-rpi3:~ $ date; sudo systemctl start nagios4
Wed 29 Jan 07:18:31 AEDT 2020

[service status reverted back to OK, last check 2020-01-29 06:29:13]

pi@tim-rpi3:~ $ date; sudo systemctl reload nagios4
Wed 29 Jan 07:21:07 AEDT 2020

pi@tim-rpi3:~ $ date; sudo ls -l /var/lib/nagios4/retention.dat
Wed 29 Jan 07:21:28 AEDT 2020
-rw------- 1 nagios nagios 42093 Jan 29 07:21 /var/lib/nagios4/retention.dat
[ has now been updated ]
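
The manual `date` / `ls -l` comparison above can be sketched as a small check that compares the file's modification time against a recorded timestamp. This is only an illustration: `modified_since` is a hypothetical helper name, and it assumes GNU `stat` (the `-c %Y` form), as found on Raspbian.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if FILE was last modified at or
# after the epoch second SINCE, fail (exit 1) if it is older.
# Returns 2 if the file cannot be read.
# Assumes GNU stat; BSD stat uses different flags.
modified_since() {
    file="$1"
    since="$2"
    mtime=$(stat -c %Y "$file") || return 2
    [ "$mtime" -ge "$since" ]
}

# Example use, from a shell that can read /var/lib/nagios4 (e.g. as root):
#   before=$(date +%s)
#   systemctl stop nagios4
#   modified_since /var/lib/nagios4/retention.dat "$before" \
#       && echo "retention.dat was saved on shutdown" \
#       || echo "retention.dat was NOT updated"
```

Unlike the minute-resolution timestamp shown by `ls -l`, this compares whole seconds, so a save that happens within the same minute as the previous one is still detected.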
Corresponding log file (timestamps converted to "nice" format):

-> retention data is being saved every 60 minutes, as per config
[Wed Jan 29 00:07:17 2020] Auto-save of retention data completed successfully.
[Wed Jan 29 01:07:17 2020] Auto-save of retention data completed successfully.
[Wed Jan 29 02:07:17 2020] Auto-save of retention data completed successfully.
[Wed Jan 29 03:07:17 2020] Auto-save of retention data completed successfully.
[Wed Jan 29 04:07:17 2020] Auto-save of retention data completed successfully.
[Wed Jan 29 05:07:17 2020] Auto-save of retention data completed successfully.
[Wed Jan 29 06:07:17 2020] Auto-save of retention data completed successfully.
[Wed Jan 29 07:07:17 2020] Auto-save of retention data completed successfully.

-> stop exim4 process and reschedule check
[Wed Jan 29 07:12:25 2020] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;localhost;exim4 MTA;1580242342
[Wed Jan 29 07:12:25 2020] SERVICE ALERT: localhost;exim4 MTA;CRITICAL;SOFT;1;Active: inactive (dead) since Wed 2020-01-29 07:11:16 AEDT: 1min 9s ago
[Wed Jan 29 07:14:25 2020] SERVICE ALERT: localhost;exim4 MTA;CRITICAL;SOFT;2;Active: inactive (dead) since Wed 2020-01-29 07:11:16 AEDT: 3min 9s ago
[Wed Jan 29 07:16:25 2020] SERVICE ALERT: localhost;exim4 MTA;CRITICAL;HARD;3;Active: inactive (dead) since Wed 2020-01-29 07:11:16 AEDT: 5min ago
[Wed Jan 29 07:16:25 2020] SERVICE NOTIFICATION: emailtim;localhost;exim4 MTA;CRITICAL;mynotify-service-by-email;Active: inactive (dead) since Wed 2020-01-29 07:11:16 AEDT: 5min ago

-> stop, then start nagios4 service
[Wed Jan 29 07:17:11 2020] wproc: Socket to worker Core Worker 5861 broken, removing
[Wed Jan 29 07:18:31 2020] Nagios 4.3.4 starting... (PID=1624)
[Wed Jan 29 07:18:31 2020] Local time is Wed Jan 29 07:18:31 AEDT 2020
[Wed Jan 29 07:18:31 2020] LOG VERSION: 2.0
[Wed Jan 29 07:18:31 2020] qh: Socket '/var/lib/nagios4/rw/nagios.qh' successfully initialized
[Wed Jan 29 07:18:31 2020] qh: core query handler registered
[Wed Jan 29 07:18:31 2020] nerd: Channel hostchecks registered successfully
[Wed Jan 29 07:18:31 2020] nerd: Channel servicechecks registered successfully
[Wed Jan 29 07:18:31 2020] nerd: Channel opathchecks registered successfully
[Wed Jan 29 07:18:31 2020] nerd: Fully initialized and ready to rock!
[Wed Jan 29 07:18:31 2020] wproc: Successfully registered manager as @wproc with query handler
[Wed Jan 29 07:18:31 2020] wproc: Registry request: name=Core Worker 1625;pid=1625
[Wed Jan 29 07:18:31 2020] wproc: Registry request: name=Core Worker 1626;pid=1626
[Wed Jan 29 07:18:31 2020] wproc: Registry request: name=Core Worker 1629;pid=1629
[Wed Jan 29 07:18:31 2020] wproc: Registry request: name=Core Worker 1627;pid=1627
[Wed Jan 29 07:18:31 2020] wproc: Registry request: name=Core Worker 1628;pid=1628
[Wed Jan 29 07:18:31 2020] wproc: Registry request: name=Core Worker 1630;pid=1630
[Wed Jan 29 07:18:31 2020] Successfully launched command file worker with pid 1634

-> reload nagios4 service
[Wed Jan 29 07:21:07 2020] Caught SIGHUP, restarting...
[Wed Jan 29 07:21:07 2020] Event broker module 'NERD' deinitialized successfully.
[Wed Jan 29 07:21:07 2020] Nagios 4.3.4 starting... (PID=1800)
[Wed Jan 29 07:21:07 2020] Local time is Wed Jan 29 07:21:07 AEDT 2020
[Wed Jan 29 07:21:07 2020] LOG VERSION: 2.0
[Wed Jan 29 07:21:07 2020] qh: Socket '/var/lib/nagios4/rw/nagios.qh' successfully initialized
[Wed Jan 29 07:21:07 2020] qh: core query handler registered
[Wed Jan 29 07:21:07 2020] nerd: Channel hostchecks registered successfully
[Wed Jan 29 07:21:07 2020] nerd: Channel servicechecks registered successfully
[Wed Jan 29 07:21:07 2020] nerd: Channel opathchecks registered successfully
[Wed Jan 29 07:21:07 2020] nerd: Fully initialized and ready to rock!
[Wed Jan 29 07:21:07 2020] wproc: Successfully registered manager as @wproc with query handler
[Wed Jan 29 07:21:07 2020] wproc: Registry request: name=Core Worker 1864;pid=1864
[Wed Jan 29 07:21:07 2020] wproc: Registry request: name=Core Worker 1867;pid=1867
[Wed Jan 29 07:21:07 2020] wproc: Registry request: name=Core Worker 1865;pid=1865
[Wed Jan 29 07:21:07 2020] wproc: Registry request: name=Core Worker 1868;pid=1868
[Wed Jan 29 07:21:07 2020] wproc: Registry request: name=Core Worker 1866;pid=1866
[Wed Jan 29 07:21:07 2020] wproc: Registry request: name=Core Worker 1869;pid=1869
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Not saving state retention data on shutdown or restart

Post by Box293 »

We are going to need you to upgrade to the latest version of Nagios Core, as 4.3.4 is over 2 years old. Once you've done this, can you check whether it fixes your issue?

FYI this may be of use:
https://support.nagios.com/kb/article/n ... s-796.html

And so may this:
https://support.nagios.com/kb/article.php?id=797
thebream
Posts: 4
Joined: Tue Jan 28, 2020 3:50 pm
Location: Newcastle, NSW, Australia

Re: Not saving state retention data on shutdown or restart

Post by thebream »

Thanks for the suggestion.

I can confirm my problem was fixed by building Nagios Core 4.4.5 from source and installing.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Not saving state retention data on shutdown or restart

Post by scottwilkerson »

thebream wrote:Thanks for the suggestion.

I can confirm my problem was fixed by building Nagios Core 4.4.5 from source and installing.
Great!

Locking thread