
Nagios XI - Last Check Time Not Updating

Summary

This issue has many possible causes. Most often it is due to a connection problem with the backend historical database, crashed database tables, core scheduling or check execution problems, or a lack of resources (causing orphaned checks).

 

More Details

The typical workflow can be explained as follows:

Nagios Core schedules a check. Once the check runs, its output is returned and the ndomod NEB module pushes the check result to the ndo2db daemon by placing it in the kernel message queue. The ndo2db daemon then connects to the MySQL "nagios" (ndoutils) historical database and inserts the check result. The Nagios XI PHP scripts then query the "nagios" database for the status information to display on the frontend (in contrast to the Nagios Core CGIs, which read the status.dat file directly).

Thus, there are a number of things that can interfere with updating the "Last Check" time on the XI UI.

  1. The check is failing to be scheduled or executed.
  2. ndo2db is failing to insert the check result into the "nagios" MySQL database, or the Nagios XI frontend database query is failing.
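
One place where the hand-off between ndomod and ndo2db can be observed is the kernel message queue itself. The sketch below sums the queued messages reported by ipcs -q for one owner. It assumes the Linux util-linux column layout (key, msqid, owner, perms, used-bytes, messages) and that the queue is owned by the "nagios" user; queued_messages is a hypothetical helper name. A count that keeps growing between runs may suggest that ndo2db is not draining the queue.

```shell
# Sum the "messages" column of `ipcs -q` output for queues owned by one user.
# Reads the ipcs output on standard input.
# Usage (assumed owner name): ipcs -q | queued_messages nagios
queued_messages() {
    awk -v owner="$1" '
        # Data rows have the owner in column 3 and a numeric count in column 6.
        $3 == owner && $6 ~ /^[0-9]+$/ { total += $6 }
        END { print total + 0 }
    '
}
```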

 

Troubleshooting

The first troubleshooting step is to verify whether the checks are actually being scheduled and executed. If they are not, it is usually an issue with the Nagios Core engine. If they are, it is most likely a database issue.

The easiest way to verify this is to check the Nagios Core web frontend to see if the "Last Check" time is updating.  Browse to:

http://<server_ip_or_hostname>/nagios/

Check any of the details for an object that is currently experiencing issues with "Last Check" times.  If the Core interface displays accurate "Last Check" times, proceed to Step 2 below.  If the Core interface is experiencing the same issues as the XI interface, follow Step 1 below.
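
The same comparison can be made from the command line by reading the value Nagios Core itself recorded in status.dat. This is a minimal sketch: it assumes the file uses the usual servicestatus { ... } block layout with host_name, service_description and last_check keys, and that it lives at /usr/local/nagios/var/status.dat (check the status_file directive in nagios.cfg if unsure). core_last_check is a hypothetical helper name, not part of Nagios.

```shell
# Print the last_check epoch timestamp recorded for one service.
# Usage: core_last_check STATUS_FILE HOST SERVICE
# Example (assumed path and names):
#   core_last_check /usr/local/nagios/var/status.dat myhost PING
#   date +%s    # current epoch time, for comparison
core_last_check() {
    awk -v host="$2" -v svc="$3" '
        # Start of a service block: reset the collected fields.
        /^servicestatus[ \t]*[{]/ { in_block = 1; h = ""; s = ""; lc = ""; next }
        # End of a block: print the timestamp if this was the wanted service.
        in_block && /^[ \t]*[}]/ {
            if (h == host && s == svc) { print lc; exit }
            in_block = 0
            next
        }
        # Inside a block: split each line into key and value at the first "=".
        in_block {
            line = $0
            sub(/^[ \t]+/, "", line)
            eq = index(line, "=")
            if (eq == 0) next
            key = substr(line, 1, eq - 1)
            val = substr(line, eq + 1)
            if (key == "host_name") h = val
            else if (key == "service_description") s = val
            else if (key == "last_check") lc = val
        }
    ' "$1"
}
```

If the printed timestamp keeps advancing while the XI interface stays stale, that points toward the database side (Step 2) rather than scheduling (Step 1).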

 

1. The check is failing to be scheduled or executed

Issues with the Nagios Core auto-rescheduler directives:

There were a few bugs with the auto_rescheduling feature as introduced in Nagios Core 4.0.8 (released 08/12/2014), which is used in Nagios XI 2014R1.4 (released 08/14/2014). Affected systems will show a nagios.log file filled with errors pertaining to rescheduled checks. Originally, the new directives added to nagios.cfg could cause rescheduled checks to never execute and instead be rescheduled continuously. The original /usr/local/nagios/etc/nagios.cfg directives were:

auto_reschedule_checks=1
auto_rescheduling_interval=30
auto_rescheduling_window=180

Reducing auto_rescheduling_window to 45 should resolve this issue:

auto_reschedule_checks=1
auto_rescheduling_interval=30
auto_rescheduling_window=45
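
For those who prefer to script the edit, the following is a rough sketch rather than an official procedure. It assumes the directive already exists in the file as a single "name=value" line and that sed supports -E and -i with a suffix (GNU and BSD sed both do). show_rescheduling and set_rescheduling_window are hypothetical helper names.

```shell
# Show the auto-rescheduling directives currently set in a nagios.cfg file.
# Usage: show_rescheduling /usr/local/nagios/etc/nagios.cfg
show_rescheduling() {
    grep -E '^[[:space:]]*auto_resched' "$1"
}

# Set auto_rescheduling_window to a new value, keeping a .bak copy of the file.
# Usage: set_rescheduling_window /usr/local/nagios/etc/nagios.cfg 45
set_rescheduling_window() {
    sed -i.bak -E "s/^([[:space:]]*auto_rescheduling_window[[:space:]]*=).*/\1$2/" "$1"
}
```

After editing, the configuration can be verified with /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg before Nagios Core is restarted.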

Once the above changes are made to nagios.cfg, restart Nagios Core:

service nagios restart

 

Resource Issues forcing the rescheduling of checks:

If the system ulimit settings are too restrictive, checks may be orphaned and forced to reschedule.  Usually, this behavior is identified by checking the nagios.log file for lines similar to:

[1331905537] Warning: The check of service 'SERVICE' on host 'NAMESERVER' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[1331755699] Warning: The check of service 'SWAP' on host 'nameserver' could not be performed due to a fork() error: 'Resource temporarily unavailable'. The check will be rescheduled.
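
To get a quick sense of how often these warnings occur, they can be counted rather than read by eye. The sketch below assumes the log is at /usr/local/nagios/var/nagios.log (adjust to the log_file directive in nagios.cfg) and matches on the two phrases shown above; count_resource_warnings is a hypothetical helper name.

```shell
# Count orphaned-check and fork-error warnings in a Nagios log file.
# Usage: count_resource_warnings /usr/local/nagios/var/nagios.log
count_resource_warnings() (
    # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
    # keeps that from aborting scripts that run with "set -e".
    orphaned=$(grep -ci 'looks like it was orphaned' "$1" || true)
    fork_errors=$(grep -ci 'fork() error' "$1" || true)
    echo "orphaned=$orphaned fork_errors=$fork_errors"
)
```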

If many of those lines exist in nagios.log, perform the following tasks to raise the system ulimits:

Edit /etc/security/limits.conf:

#locked memory 
* hard memlock 128
* soft memlock 128

#open files
* soft nofile 4096
* hard nofile 4096

#max user processes
* hard nproc 4096
* soft nproc 4096

#stack size
* hard stack 20480
* soft stack 20480

Then restart the server. Run

ulimit -a

after restarting to verify that the new settings are in place.
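
Individual values can also be compared against the numbers set in limits.conf. This is only a sketch: check_limit is a hypothetical helper, and it treats "unlimited" as satisfying any minimum. Since limits apply per login session, it is most meaningful when run in a shell belonging to the user that runs Nagios.

```shell
# Report whether a current limit meets an expected minimum.
# Usage: check_limit NAME CURRENT EXPECTED
# Examples, using the values from limits.conf above:
#   check_limit nofile "$(ulimit -n)" 4096
#   check_limit stack  "$(ulimit -s)" 20480
check_limit() {
    if [ "$2" = "unlimited" ] || [ "$2" -ge "$3" ]; then
        echo "$1 ok ($2)"
    else
        echo "$1 too low ($2 < $3)"
        return 1
    fi
}
```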

 

2. ndo2db is failing to insert the check result into the "nagios" MySQL database

There are crashed tables in the Nagios database:

Crashed tables can be identified by checking the MySQL/MariaDB logs. For MySQL, the log is located at:

/var/log/mysqld.log

or for MariaDB:

/var/log/mariadb/

The relevant errors should resemble:

141127 10:40:24 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_logentries' is marked as crashed and last (automatic?) repair failed
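
On a busy log it can help to reduce these errors to a list of affected tables. The following sketch assumes the error text has the form shown above ("Table '...' is marked as crashed") and that grep supports -o; crashed_tables is a hypothetical helper name.

```shell
# List the unique table paths that a database log reports as crashed.
# Usage: crashed_tables /var/log/mysqld.log
crashed_tables() {
    grep -o "Table '[^']*' is marked as crashed" "$1" \
        | sed -E "s/^Table '([^']*)'.*/\1/" \
        | sort -u
}
```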

Repair the tables with:

cd /usr/local/nagiosxi/scripts/
./repair_databases.sh

 

Check For Multiple Nagios Processes

After following the steps above, make sure that multiple nagios processes are not running.

Execute this command to check:

ps -ef | grep nagios.cfg | grep -v grep

 

The following output is healthy:

nagios    5713     1  0 08:40 ?        00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios    5723  5713  0 08:40 ?        00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg

The first line has a PID of 5713; this is the parent process.

The second line has a PID of 5723 but lists 5713 as its parent PID, so it is a child of the parent process, which is normal behavior. On heavily loaded systems you may see multiple child processes; this is also normal.
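
The parent/child reading described above can be automated as a rough check. The sketch below assumes ps -ef style input (UID, PID, PPID in the first three columns) and counts a process as a parent when its PPID is not the PID of another listed process; count_nagios_parents is a hypothetical helper name.

```shell
# Count listed processes whose parent is not itself in the list.
# Reads `ps -ef` style lines on standard input.
# Usage: ps -ef | grep nagios.cfg | grep -v grep | count_nagios_parents
count_nagios_parents() {
    awk '
        # Remember every PID seen and the PPID that belongs to it.
        { pid[$2] = 1; ppid[$2] = $3 }
        END {
            n = 0
            for (p in ppid) if (!(ppid[p] in pid)) n++
            print n
        }
    '
}
```

A result of 1 matches the healthy output shown above; a result greater than 1 indicates more than one parent process.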

If your output has more than one parent process, execute the following commands:

service nagios stop
killall -9 nagios
service nagios start

 

Final Thoughts

For any support related questions please visit the Nagios Support Forums at:

http://support.nagios.com/forum/
