Nagios stops working
Posted: Sun Sep 16, 2018 5:32 pm
Our Nagios XI instance stopped updating service and host status at 11:30 pm.
After a lot of troubleshooting we powered on a VM backup taken a few hours earlier, at 8 pm. It showed the same problem: no checks were executed. This was unexpected, because at 8:30 pm Nagios had been working correctly. The monitoring engine was up but the event queue was empty. We then set the system date and time back to 8:30 pm, and Nagios started executing checks and updating service status. When we set the date and time forward to the current time again, Nagios stopped working once more.
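Since the behaviour depended on the system clock, one thing that could be inspected is how the retained scheduling timestamps compare with the current time. The sketch below is only an illustration and does not establish the cause. It assumes retention.dat uses the usual Nagios Core layout of `type {` ... `}` blocks containing `next_check=<epoch seconds>` lines. The function names and the sample text are hypothetical.

```python
import time


def scheduled_checks(retention_text):
    """Yield (block_type, next_check_epoch) for every block with a next_check line."""
    block = None
    for raw in retention_text.splitlines():
        line = raw.strip()
        if line.endswith("{"):
            block = line[:-1].strip()      # e.g. "host" or "service"
        elif line == "}":
            block = None
        elif block and line.startswith("next_check="):
            value = line.split("=", 1)[1]
            if value.isdigit():
                yield block, int(value)


def summarize(retention_text, now_epoch):
    """Count next_check values in the past, in the future, or unset (0)."""
    counts = {"past": 0, "future": 0, "unscheduled": 0}
    for _, ts in scheduled_checks(retention_text):
        if ts == 0:
            counts["unscheduled"] += 1
        elif ts <= now_epoch:
            counts["past"] += 1
        else:
            counts["future"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample; on a real system, read the text of retention.dat instead.
    sample = """
    host {
    host_name=web01
    next_check=1000
    }
    service {
    host_name=web01
    service_description=PING
    next_check=5000
    }
    """
    print(summarize(sample, now_epoch=int(time.time())))
```

Running `summarize` once with the real current time and once with the time at which checks still worked would show whether the retained schedule looks different from the two points of view.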
Finally we found a similar case here https://support.nagios.com/forum/viewto ... =7&t=37028
We removed the retention.dat, objects.cache and objects.precache files and started Nagios. Everything has worked correctly since then.
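For anyone repeating this step, it may be safer to move the files aside rather than delete them, so they remain available for later analysis. The following is a minimal sketch, not an official procedure. The file names follow the usual Nagios Core defaults, the directories are placeholders to adapt to the local installation, and `set_aside` is a hypothetical helper. Nagios should be stopped before the files are moved.

```python
import shutil
import tempfile
from pathlib import Path

# Assumed default names of the state files; adjust if nagios.cfg says otherwise.
STATE_FILES = ("retention.dat", "objects.cache", "objects.precache")


def set_aside(var_dir, backup_dir, names=STATE_FILES):
    """Move the listed files from var_dir to backup_dir; return the names moved."""
    var_dir = Path(var_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for name in names:
        src = var_dir / name
        if src.is_file():                  # skip files that do not exist
            shutil.move(str(src), str(backup_dir / name))
            moved.append(name)
    return moved


if __name__ == "__main__":
    # Demonstration on a throwaway directory, not on a live Nagios installation.
    with tempfile.TemporaryDirectory() as tmp:
        var_dir = Path(tmp) / "var"
        var_dir.mkdir()
        (var_dir / "retention.dat").write_text("dummy")
        print(set_aside(var_dir, Path(tmp) / "backup"))
```

Keeping the old retention.dat also makes it possible to compare it with the file Nagios regenerates after the restart.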
So the problem is probably solved, but we would like some advice about what may have happened.