Nagios core 4.4.10 status.dat and retention.dat missing
Posted: Mon Aug 14, 2023 5:48 am
by Butters
Hello guys, I'm looking for help: the status.dat and retention.dat files are not being re-created after a Nagios service restart.
- Running the pre-flight check (-v with nagios.cfg) reports all OK, with no errors or warnings.
- I turned on debug mode, and there are no warnings or errors in the nagios.debug log file.
- Nagios Core was running fine without any issues; the problem appeared after a restart.
- Permissions are OK on the path and on the files: I can touch the files as the nagios user.
- I tried re-installing Nagios Core, both the same version and a newer one (cleaning up the Nagios path first); still no luck.
- Restarting the nagios service returns no error; the service restarts, but status.dat and retention.dat are not re-created.
- The service restart itself takes noticeably longer than it did before.
- The paths for status.dat and retention.dat are configured correctly in nagios.cfg.
To sum up: there are no permission issues, and the pre-flight check returns no errors or warnings. Of course, the GUI returns "Error: Could not read host and service status information!" when these two .dat files, which the CGI/PHP front end reads, are missing.
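For anyone who wants to repeat the path and permission checks from the list above, here is a minimal sketch. It assumes the paths are set through the status_file and state_retention_file directives in nagios.cfg; the function names are made up for illustration, and the script should be run as the nagios user so that the writability check reflects what the daemon sees.

```python
import os

def parse_nagios_cfg(text):
    """Return a dict of directive -> value from nagios.cfg-style text.

    Comment lines (starting with '#') and blank lines are skipped.
    If a directive repeats, the last value wins.
    """
    directives = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        directives[key.strip()] = value.strip()
    return directives

def check_data_files(directives, names=("status_file", "state_retention_file")):
    """Report, per directive, whether the file exists and its directory is writable."""
    report = {}
    for name in names:
        path = directives.get(name)
        if path is None:
            report[name] = "directive missing"
            continue
        parent = os.path.dirname(path) or "."
        report[name] = {
            "path": path,
            "exists": os.path.exists(path),
            # os.access checks the *current* user, so run this as nagios.
            "dir_writable": os.access(parent, os.W_OK),
        }
    return report
```

To use it, read your own nagios.cfg into a string, pass it to parse_nagios_cfg, and print the result of check_data_files. A missing file next to a writable directory matches the symptom described in this post.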
Thank you for any input.
Re: Nagios core 4.4.10 status.dat and retention.dat missing
Posted: Mon Aug 14, 2023 10:08 am
by swolf
Hi @Butters, thanks for reaching out. Let's start by just looking at the retention.dat issue.
Since you have debugging turned on, can you post the part of the log starting with xrddefault_save_state_information()?
Also, is there anything in nagios.log referencing "retention"?
Lastly, I would make sure there's a temp_file directive in your nagios.cfg, and I'd verify whether there's a file with a name like nagios.tmpXXXXXX in that directory, and what the permissions are on that file. I could maybe see a situation where that temporary file gets incorrect permissions and then we fail to write to that file, stopping the retention/status data from being written altogether.
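A small sketch of the temp-file check described above. It assumes the temporary files share the temp_file value as a prefix (which matches the nagios.tmpXXXXXX naming mentioned here); describe_temp_files is a hypothetical helper name, and the owner lookup relies on the Unix-only pwd module.

```python
import glob
import os
import pwd
import stat

def describe_temp_files(temp_file_path):
    """List files whose names start with temp_file_path, with mode and owner.

    temp_file_path is the value of the temp_file directive from nagios.cfg.
    Returns a list of (path, mode_string, owner) tuples, sorted by path.
    """
    results = []
    pattern = glob.escape(temp_file_path) + "*"
    for path in sorted(glob.glob(pattern)):
        info = os.stat(path)
        try:
            owner = pwd.getpwuid(info.st_uid).pw_name
        except KeyError:
            # UID has no passwd entry; fall back to the numeric ID.
            owner = str(info.st_uid)
        results.append((path, stat.filemode(info.st_mode), owner))
    return results
```

An empty list means no leftover temporary file was found. An entry not owned by the nagios user, or lacking write permission for it, would be worth reporting back here.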
Let me know if that gets you anywhere interesting, and feel free to post any other comments/questions/concerns.
-Sebastian Wolf
Re: Nagios core 4.4.10 status.dat and retention.dat missing
Posted: Mon Aug 14, 2023 12:01 pm
by Butters
I identified the issue: it appears to be related to bug 861
https://github.com/NagiosEnterprises/na ... issues/861
I don't want to say it, but it seems this bug fix may have been missed in the 4.4.10 release?
strace shows
kill(15009, 0) = -1 ESRCH (No such process)
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f29b75d2a10) = 15318
close(3) = 0
close(3) = -1 EBADF (Bad file descriptor)
When an extra line,
check_for_updates=0
is added to nagios.cfg, the nagios process is able to generate the status.dat and retention.dat files. Once again, I'm running 4.4.10, which should include the previous bug fixes.
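If you would rather script the workaround than edit the file by hand, the following is a rough sketch that sets a directive in nagios.cfg-style text. set_directive is a hypothetical helper, not part of Nagios; it only transforms a string, so reading and writing the real file (and keeping a backup) is left to you.

```python
def set_directive(cfg_text, name, value):
    """Return cfg_text with `name=value` set exactly once.

    The first active (non-comment) occurrence is replaced in place,
    later duplicates are dropped, and the line is appended if absent.
    """
    target = f"{name}={value}"
    out = []
    found = False
    for line in cfg_text.splitlines():
        stripped = line.strip()
        is_comment = stripped.startswith("#")
        key = stripped.split("=", 1)[0].strip() if "=" in stripped else None
        if key == name and not is_comment:
            if not found:
                out.append(target)
                found = True
            continue  # skip duplicates of the same directive
        out.append(line)
    if not found:
        out.append(target)
    return "\n".join(out) + "\n"
```

Calling set_directive(text, "check_for_updates", "0") on the contents of nagios.cfg gives the configuration described in this post; Nagios still needs a restart afterwards to pick it up.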
Re: Nagios core 4.4.10 status.dat and retention.dat missing
Posted: Tue Aug 15, 2023 11:11 am
by swolf
That's interesting. The original bug there is that Core actually SIGSEGV's when trying to check for updates. That issue is definitely fixed.
If check_for_updates=0 fixes your problem, then there's certainly some follow-up work I'll need to do. Is there any chance I could have you DM me the full strace output?
Re: Nagios core 4.4.10 status.dat and retention.dat missing
Posted: Wed Aug 16, 2023 12:32 am
by Butters
Hello Sebastian, shared the full strace over PM.
Re: Nagios core 4.4.10 status.dat and retention.dat missing
Posted: Wed Aug 16, 2023 10:36 am
by swolf
Hi @Butters, acknowledging receipt of your strace log - thanks! I've also created an issue at
https://github.com/NagiosEnterprises/na ... issues/927 to help track this.
Re: Nagios core 4.4.10 status.dat and retention.dat missing
Posted: Thu Apr 18, 2024 8:46 pm
by florencepugh
I recommend using tools like `strace` or `lsof` to trace system calls or check for open files respectively during Nagios startup.
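As a rough illustration of how a captured strace log could be skimmed, the sketch below pulls out system calls that returned -1 with an errno. The regular expression is an assumption about the common strace line layout (optionally prefixed by a PID, as with -f), not a full parser, and failed_syscalls is a made-up name.

```python
import re

# Matches e.g. "close(3) = -1 EBADF (Bad file descriptor)", optionally
# prefixed by "[pid 1234] " or "1234 " as strace prints when following forks.
FAILED_CALL = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?"
    r"(?P<call>\w+)\((?P<args>.*)\)\s+=\s+-1\s+"
    r"(?P<errno>E[A-Z0-9]+)\s+\((?P<msg>[^)]*)\)"
)

def failed_syscalls(strace_text):
    """Return (call, args, errno) for every line reporting a failed syscall."""
    failures = []
    for line in strace_text.splitlines():
        match = FAILED_CALL.match(line.strip())
        if match:
            failures.append(
                (match.group("call"), match.group("args"), match.group("errno"))
            )
    return failures
```

Run against the four strace lines posted earlier in this thread, it would report the kill call (ESRCH) and the second close call (EBADF) and skip the two calls that succeeded.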
Re: Nagios core 4.4.10 status.dat and retention.dat missing
Posted: Tue Jun 04, 2024 8:47 pm
by otisjame
Butters wrote: ↑Mon Aug 14, 2023 12:01 pm
I identified the issue: it appears to be related to bug 861
https://github.com/NagiosEnterprises/na ... issues/861
I don't want to say it, but it seems this bug fix may have been missed in the 4.4.10 release?
strace shows
kill(15009, 0) = -1 ESRCH (No such process)
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f29b75d2a10) = 15318
close(3) = 0
close(3) = -1 EBADF (Bad file descriptor)
When an extra line,
check_for_updates=0
is added to nagios.cfg, the nagios process is able to generate the status.dat and retention.dat files. Once again, I'm running 4.4.10, which should include the previous bug fixes.
I'd really like to know whether you have resolved this issue yet.
Re: Nagios core 4.4.10 status.dat and retention.dat missing
Posted: Wed Jun 05, 2024 1:51 pm
by bbahn
Hello @otisjame,
It seems the underlying issue is not yet solved. As noted in the issue, though, you can set check_for_updates=0 in your Nagios Core configuration file as a workaround, and the files will then be created properly when Nagios Core restarts.