
Nagios XI - Last Check Time Not Updating

Summary

This symptom has many possible causes. Most often it is due to a connection issue with the backend historical database, crashed database tables, core scheduling or check execution issues, or a lack of resources (causing orphaned checks).

 

More Details

The typical workflow can be explained as follows:

Nagios Core schedules a check. Once the check has run, its output is returned and the ndomod NEB module pushes the check result to the ndo2db daemon by placing it in the kernel message queue. The ndo2db daemon then connects to the MySQL "nagios" (ndoutils) historical database and inserts the check result. The Nagios XI PHP scripts then query the "nagios" database for the status information to display on the frontend (in contrast to the Nagios Core CGIs, which read the status.dat file directly).

Thus, there are a number of things that can interfere with updating the "Last Check" time on the XI UI.

  1. The check is failing to be scheduled or executed.
  2. ndo2db is failing to insert the check result into the "nagios" mysql database or the Nagios XI frontend database query is failing.

 

Troubleshooting

The first troubleshooting step is to verify whether the checks are actually being scheduled and executed. If they are not, it is usually an issue with the Nagios Core engine. If they are, it is most likely a database issue.

The easiest way to verify this is to check the Nagios Core web frontend to see if the "Last Check" time is updating.  Browse to:

http://<server_ip_or_hostname>/nagios/

Check any of the details for an object that is currently experiencing issues with "Last Check" times.  If the Core interface displays accurate "Last Check" times, proceed to Step 2 below.  If the Core interface is experiencing the same issues as the XI interface, follow Step 1 below.
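The same comparison can be made from the command line by reading the "Last Check" value straight from the status.dat file that the Core CGIs use. The following is a minimal sketch, not an official Nagios tool: the function name last_check_of is hypothetical, and the path /usr/local/nagios/var/status.dat in the usage note is an assumption about a default install.

```shell
# Print the last_check epoch for one service from a Nagios Core status file.
# Usage: last_check_of STATUS_FILE HOST SERVICE
# (last_check_of is a hypothetical helper name, not part of Nagios.)
last_check_of() {
    awk -v host="$2" -v svc="$3" '
        # Start of a servicestatus block: reset the collected fields.
        /^[ \t]*servicestatus[ \t]*[{]/ { in_block = 1; h = ""; s = ""; t = ""; next }
        # End of the block: print the timestamp if host and service match.
        in_block && /^[ \t]*[}]/ {
            if (h == host && s == svc) print t
            in_block = 0; next
        }
        # Inside the block: split each "key=value" line at the first "=".
        in_block {
            line = $0
            sub(/^[ \t]+/, "", line)
            eq = index(line, "=")
            if (eq == 0) next
            key = substr(line, 1, eq - 1); val = substr(line, eq + 1)
            if (key == "host_name") h = val
            else if (key == "service_description") s = val
            else if (key == "last_check") t = val
        }
    ' "$1"
}
```

For example, last_check_of /usr/local/nagios/var/status.dat myhost PING prints a Unix timestamp; on systems with GNU date it can be made readable with date -d @TIMESTAMP. If this value keeps advancing while the XI interface does not, the problem is more likely on the database side.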

 

1. The check is failing to be scheduled or executed

Issues with the Nagios Core auto-rescheduler directives:

There were a few bugs with the introduction of the auto_rescheduling feature in Nagios Core 4.0.8 (released 08/12/2014) which is used in Nagios XI 2014R1.4 (released 08/14/2014).  Those affected by this bug will notice the nagios.log file filled with errors pertaining to rescheduled checks. Originally, the new directives added to nagios.cfg could cause rescheduled checks to never execute, and instead be continuously rescheduled.  The original /usr/local/nagios/etc/nagios.cfg directives were:

auto_reschedule_checks=1
auto_rescheduling_interval=30
auto_rescheduling_window=180

Reducing auto_rescheduling_window to "45" should resolve this issue:

auto_reschedule_checks=1
auto_rescheduling_interval=30
auto_rescheduling_window=45

Once the above changes are made to nagios.cfg, restart Nagios Core:

service nagios restart
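To confirm which auto-rescheduling values are actually in effect, the active (uncommented) directives can be listed from nagios.cfg. This is a small sketch; the function name show_auto_resched is hypothetical.

```shell
# Print the auto-rescheduling directives that are active (not commented out).
# Usage: show_auto_resched CONFIG_FILE
# (show_auto_resched is a hypothetical helper name.)
show_auto_resched() {
    grep -E '^[[:space:]]*auto_resched(ule_checks|uling_interval|uling_window)=' "$1"
}
```

For example, show_auto_resched /usr/local/nagios/etc/nagios.cfg should list the three directives with the new window value of 45.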

 

Resource Issues forcing the rescheduling of checks:

If the system ulimit settings are too restrictive, checks may be orphaned and forced to reschedule.  Usually, this behavior is identified by checking the nagios.log file for lines similar to:

[1331905537] Warning: The check of service 'SERVICE' on host 'NAMESERVER' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[1331755699] Warning: The check of service 'SWAP' on host 'nameserver' could not be performed due to a fork() error: 'Resource temporarily unavailable'. The check will be rescheduled.
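To gauge how often these warnings occur, the two message types can be counted in the log. This is a rough sketch: the function name count_check_warnings is hypothetical, and the log path in the usage note is an assumption about a default install.

```shell
# Count orphaned-check and fork-error warnings in a Nagios log file.
# Usage: count_check_warnings LOG_FILE
# (count_check_warnings is a hypothetical helper name.)
count_check_warnings() {
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    orphaned=$(grep -ci 'looks like it was orphaned' "$1" || true)
    fork_err=$(grep -ciF 'fork() error' "$1" || true)
    echo "orphaned=$orphaned fork_errors=$fork_err"
}
```

For example: count_check_warnings /usr/local/nagios/var/nagios.log (path assumed). A handful of matches over a long period is less of a concern than a steady stream of them.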

If many of those lines exist in nagios.log, perform the following tasks to increase the system ulimits:

Edit /etc/security/limits.conf:

#locked memory 
* hard memlock 128
* soft memlock 128

#open files
* soft nofile 4096
* hard nofile 4096

#max user processes
* hard nproc 4096
* soft nproc 4096

#stack size
* hard stack 20480
* soft stack 20480

And restart the server. Run

ulimit -a

after restarting to verify that the new settings are in place.
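Rather than reading the ulimit -a output by eye, a single value can be compared against the limit configured above. This is a minimal sketch; the function name limit_ok is hypothetical, and it only inspects the limits of the shell that runs it, which may differ from those of the running nagios process.

```shell
# Succeed if a reported limit meets a wanted minimum.
# A value of "unlimited" always passes.
# Usage: limit_ok CURRENT WANTED
# (limit_ok is a hypothetical helper name.)
limit_ok() {
    [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ]
}
```

For example, limit_ok "$(ulimit -n)" 4096 && echo "open files OK" checks the open files limit against the value set in limits.conf.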

 

2. ndo2db is failing to insert the check result into the "nagios" mysql database.

There are crashed tables in the Nagios database:

Crashed tables can be identified by checking the MySQL/MariaDB logs, located at:

/var/log/mysqld.log

or for mariadb:

/var/log/mariadb/

The relevant errors should resemble:

141127 10:40:24 [ERROR] /usr/libexec/mysqld: Table './nagios/nagios_logentries' is marked as crashed and last (automatic?) repair failed
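To see which tables are affected, the table names can be pulled out of messages of this form. This is a sketch that assumes the log lines match the example above; the function name crashed_tables is hypothetical.

```shell
# List the unique tables reported as crashed in a MySQL/MariaDB error log.
# Usage: crashed_tables LOG_FILE
# (crashed_tables is a hypothetical helper name.)
crashed_tables() {
    sed -n "s/.*Table '\([^']*\)' is marked as crashed.*/\1/p" "$1" | sort -u
}
```

For example, crashed_tables /var/log/mysqld.log prints one table path per line, such as ./nagios/nagios_logentries.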

Repair the tables with:

cd /usr/local/nagiosxi/scripts/
./repair_databases.sh

 

Check For Multiple Nagios Processes

After following the steps above, make sure that multiple nagios processes are not running.

Execute this command to check:

ps -ef | grep nagios.cfg | grep -v grep

 

The following output is healthy:

nagios    5713     1  0 08:40 ?        00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios    5723  5713  0 08:40 ?        00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg

The first line has a PID of 5713; this is the parent process.

The second line has a PID of 5723, but its parent PID (third column) is 5713. It is a child of the parent process, which is normal behavior. On heavily loaded systems you may see multiple child processes; this is also normal.
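The parent/child reasoning above can be sketched as a small filter that counts parent processes in the ps output. It treats a nagios process as a parent when its parent PID is not itself one of the listed nagios PIDs. The function name count_nagios_parents is hypothetical.

```shell
# Count nagios parent processes in "ps -ef" style input read from stdin.
# Columns assumed: UID PID PPID ... CMD (as printed by ps -ef).
# (count_nagios_parents is a hypothetical helper name.)
count_nagios_parents() {
    awk '/nagios\.cfg/ { pid[$2] = 1; ppid[$2] = $3 }
         END {
             n = 0
             # A process is a parent if its PPID is not another nagios PID.
             for (p in ppid) if (!(ppid[p] in pid)) n++
             print n
         }'
}
```

For example: ps -ef | grep nagios.cfg | grep -v grep | count_nagios_parents. A result of 1 matches the healthy output shown above.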

If your output has more than one parent process, execute the following commands:

service nagios stop
killall -9 nagios
service nagios start

 

Final Thoughts

For any support-related questions, please visit the Nagios Support Forums at:

http://support.nagios.com/forum/
