Issues with Nagios Core Post-Power Outage (Durations '???', Host Status Incorrect, Suspected Corrupted Runtime Files)

osegie.ly
Posts: 1
Joined: Thu Jun 19, 2025 1:17 pm

Post by osegie.ly »

Hi everyone,
I’ve been having serious issues with our Nagios Core setup and would really appreciate any help.
About a month ago, there was a power outage at the location hosting our core devices. After the power was restored, Nagios started behaving abnormally:
- The duration column began showing "???".
- Host statuses were incorrect: some hosts would respond to a ping test, but Nagios would still show them as DOWN (and vice versa).
- Availability reports were also incorrect. Oddly, service checks seemed mostly fine.
We tried forcing rescheduled checks for the affected hosts. In some cases that fixed the issue, but many durations and statuses remained wrong. So we started deleting the configuration files for each problematic link and re-importing them via our NagiosQL admin interface. That seemed to help: the duration and host status were correct for the hosts we updated.
However, we hadn’t finished this process for all hosts before another power outage occurred last week. After that second outage:
- The same issues returned: wrong durations, incorrect host statuses, and very strange values (e.g., 4020d...) in duration fields.
- Eventually, the Nagios web UI failed to load, so we had to reboot the core device.
- After rebooting, the UI came back up, but the problems persisted.
- Some durations now show numbers, but many are still "???" or otherwise incorrect. Forcing rescheduled checks doesn't update anything anymore.
Right now, Nagios has become almost unusable for us.
I was able to access the Linux root console on the server, but I don’t have advanced Linux or MySQL knowledge, so I’m not sure how to proceed. I suspect something may be wrong with runtime/status files or retained state.
Also, I think our backup files are corrupted too.

Thank you for your help
Attachments
Screenshot of the service status details on the web UI
Screenshot of the host status details on the web UI
kg2857
Posts: 491
Joined: Wed Apr 12, 2023 5:48 pm

Re: Issues with Nagios Core Post-Power Outage (Durations '???', Host Status Incorrect, Suspected Corrupted Runtime Files

Post by kg2857 »

I'd look at the Nagios logs, the server syslog, other system logs, and the filesystem usage, and then restart the server.
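
As a rough starting point for those checks, here is a minimal Python sketch. It reports how full a filesystem is and scans the retained-state file for host or service entries whose last_state_change is missing, zero, or later than the server's current clock. The path /usr/local/nagios/var/retention.dat is an assumption (a common default for source installs); the real location is whatever state_retention_file is set to in nagios.cfg. The function names are made up for this example. A zero value is not necessarily an error, and a flagged entry is only a hint to look closer (including at the system clock), not proof of corruption.

```python
import re
import shutil
import time
from pathlib import Path

# ASSUMPTION: default path for a source install; check state_retention_file
# in nagios.cfg for the location actually used on this server.
RETENTION_FILE = Path("/usr/local/nagios/var/retention.dat")

# Matches "host { ... }" and "service { ... }" blocks in retention-style text.
BLOCK_RE = re.compile(r"^(host|service) \{\n(.*?)^\s*\}", re.S | re.M)


def percent_used(path):
    """Return how full the filesystem containing `path` is, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def suspect_state_changes(text, now=None):
    """List (kind, name, raw_value) for entries whose last_state_change
    is missing, non-numeric, zero, or in the future relative to `now`."""
    now = time.time() if now is None else now
    suspects = []
    for kind, body in BLOCK_RE.findall(text):
        # Each line inside a block is key=value; split on the first "=".
        fields = dict(
            line.strip().split("=", 1)
            for line in body.splitlines()
            if "=" in line
        )
        name = fields.get("host_name", "?")
        if kind == "service":
            name += "/" + fields.get("service_description", "?")
        raw = fields.get("last_state_change", "")
        stamp = int(raw) if raw.isdigit() else None
        if stamp is None or stamp == 0 or stamp > now:
            suspects.append((kind, name, raw))
    return suspects


if __name__ == "__main__":
    print(f"/ is {percent_used('/'):.1f}% full")
    if RETENTION_FILE.exists():
        text = RETENTION_FILE.read_text(errors="replace")
        for kind, name, raw in suspect_state_changes(text):
            print(f"{kind} {name}: last_state_change={raw!r}")
    else:
        print(f"{RETENTION_FILE} not found; adjust the path")
```

The script only reads files, so it should be safe to run as root on the Nagios server. Making a copy of the retention file before changing anything is still a sensible precaution.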