Problem (Backend: nagios): NDO claims that Nagios did not update for more than 180 seconds
Posted: Fri Dec 05, 2025 1:39 pm
Hello there,
we are experiencing a random, intermittent issue with the NDO backend in NagVis.
Software versions:
Nagios Core 4.5.3
Nagios XI 2024R1.2.2
NagVis 1.9.40b
All components (Nagios Core, Nagios XI, NagVis) are running in a PCS cluster with Pacemaker, Corosync and DRBD.
At random intervals, typically every few minutes, NagVis reports the following error on multiple objects:
Problem (Backend: nagios): NDO claims that Nagios did not update for more than 180 seconds.
Visual behavior:
The NagVis map shows Summary State = ERROR (all objects turn blue)
The summary output reports “Contains ERROR objects”
Many objects simultaneously switch to ERROR
The output column shows the same NDO 180-second timeout message
After a few minutes, the map returns automatically to OK
No Pacemaker failover occurs and all cluster resources remain Started.
We suspect a temporary interruption in the Nagios → NDO → DB data flow.
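To test that suspicion, the age of the last status update that NDO wrote could be logged continuously and compared with the 180-second limit from the error message. The following is only a minimal sketch of that comparison, not the actual NagVis check. The function names are made up for illustration, and it assumes the timestamp is read from the NDO database with something like `SELECT UNIX_TIMESTAMP(status_update_time) FROM nagios_programstatus`; the table and column names are an assumption about the schema and should be verified against the installed version.

```python
import time

# Threshold quoted in the NagVis error message (180 seconds).
MAX_AGE_SECONDS = 180


def seconds_since_update(last_update_epoch, now_epoch=None):
    """Return how many seconds have passed since the last status update.

    last_update_epoch is assumed to be a Unix timestamp read from the NDO
    database (hypothetical query; schema names are not confirmed here).
    """
    if now_epoch is None:
        now_epoch = time.time()
    # Clamp to zero so small clock skew between hosts cannot yield a negative age.
    return max(0.0, now_epoch - last_update_epoch)


def is_stale(last_update_epoch, now_epoch=None, max_age=MAX_AGE_SECONDS):
    """Return True if the last update is older than max_age seconds."""
    return seconds_since_update(last_update_epoch, now_epoch) > max_age
```

Running such a check every few seconds from cron or a small loop, and writing the measured age to a log, would show whether the timestamp really stops advancing for more than 180 seconds or whether the gap only appears from the NagVis side.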
Any pointers to common causes, or to recommended checks and tuning for this scenario in HA clustered environments, would be appreciated.
Thank you.