Check Freshness running early

hbouma
Posts: 483
Joined: Tue Feb 27, 2018 9:31 am

Check Freshness running early

Post by hbouma »

I have several Nagios XI 5.6.1 servers running on RHEL 7 64-bit VMs. We have several passive checks, all set up with a freshness threshold of 7200 seconds (2 hours). The check command is "check_dummy" with ARG1 set to 0 "Resetting check after 2 hours".

Here are some examples from our nagios.log file (service and host names redacted). As you can see, Services 1 and 4 were reset far more often than the 2-hour freshness threshold allows. The passive results come from a custom log scraper we wrote, which has no ability to send the resets itself.

[1569277851] SERVICE ALERT: HOST1;SERVICE_1;OK;HARD;1;OK: Resetting check after 2 hours
[1569278149] SERVICE ALERT: HOST1;SERVICE_1;OK;HARD;1;OK: Resetting check after 2 hours
[1569278448] SERVICE ALERT: HOST1;SERVICE_1;OK;HARD;1;OK: Resetting check after 2 hours
[1569278747] SERVICE ALERT: HOST1;SERVICE_1;OK;HARD;1;OK: Resetting check after 2 hours
[1569279045] SERVICE ALERT: HOST1;SERVICE_1;OK;HARD;1;OK: Resetting check after 2 hours
[1569279344] SERVICE ALERT: HOST1;SERVICE_1;OK;HARD;1;OK: Resetting check after 2 hours
[1569279642] SERVICE ALERT: HOST1;SERVICE_1;OK;HARD;1;OK: Resetting check after 2 hours
[1569279941] SERVICE ALERT: HOST1;SERVICE_1;OK;HARD;1;OK: Resetting check after 2 hours
[1569280240] SERVICE ALERT: HOST1;SERVICE_1;OK;HARD;1;OK: Resetting check after 2 hours
[1569281134] SERVICE ALERT: HOST1;SERVICE_1;OK;HARD;1;OK: Resetting check after 2 hours
[1569281433] SERVICE ALERT: HOST1;SERVICE_1;OK;HARD;1;OK: Resetting check after 2 hours
[1569281732] SERVICE ALERT: HOST1;SERVICE_1;OK;HARD;1;OK: Resetting check after 2 hours
[1569282030] SERVICE ALERT: HOST1;SERVICE_1;OK;HARD;1;OK: Resetting check after 2 hours
[1569282329] SERVICE ALERT: HOST1;SERVICE_1;OK;HARD;1;OK: Resetting check after 2 hours

[1569284459] SERVICE ALERT: HOST1;SERVICE_2;OK;HARD;1;OK: Resetting check after 2 hours
[1569286497] SERVICE ALERT: HOST1;SERVICE_3;OK;HARD;1;OK: Resetting check after 2 hours
[1569287213] SERVICE ALERT: HOST1;SERVICE_4;OK;HARD;1;OK: Resetting check after 2 hours
[1569289004] SERVICE ALERT: HOST1;SERVICE_4;OK;HARD;1;OK: Resetting check after 2 hours
[1569291693] SERVICE ALERT: HOST1;SERVICE_4;OK;HARD;1;OK: Resetting check after 2 hours
[1569293185] SERVICE ALERT: HOST1;SERVICE_4;OK;HARD;1;OK: Resetting check after 2 hours

[1569308553] SERVICE ALERT: HOST1;SERVICE_5;OK;HARD;1;OK: Resetting check after 2 hours
[1569323646] SERVICE ALERT: HOST1;SERVICE_4;OK;HARD;1;OK: Resetting check after 2 hours
[1569324123] SERVICE ALERT: HOST1;SERVICE_3;OK;HARD;1;OK: Resetting check after 2 hours
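
To quantify how early the resets arrive, the gaps between consecutive alerts can be computed per service. Below is a minimal Python sketch, assuming the nagios.log line format shown above. `reset_gaps` and `early_resets` are hypothetical helper names, not part of Nagios, and the sample lines are copied from the excerpt above.

```python
import re
from collections import defaultdict

# Matches lines such as:
# [1569277851] SERVICE ALERT: HOST1;SERVICE_1;OK;HARD;1;OK: Resetting ...
ALERT_RE = re.compile(r"^\[(\d+)\] SERVICE ALERT: ([^;]+);([^;]+);")


def reset_gaps(lines):
    """Map (host, service) to the seconds between consecutive alerts."""
    stamps = defaultdict(list)
    for line in lines:
        match = ALERT_RE.match(line.strip())
        if match:
            epoch, host, service = match.groups()
            stamps[(host, service)].append(int(epoch))
    gaps = {}
    for key, values in stamps.items():
        ordered = sorted(values)
        gaps[key] = [b - a for a, b in zip(ordered, ordered[1:])]
    return gaps


def early_resets(gaps, threshold=7200):
    """Keep only the gaps shorter than the freshness threshold (seconds)."""
    early = {}
    for key, values in gaps.items():
        short = [gap for gap in values if gap < threshold]
        if short:
            early[key] = short
    return early


sample = [
    "[1569277851] SERVICE ALERT: HOST1;SERVICE_1;OK;HARD;1;OK: Resetting check after 2 hours",
    "[1569278149] SERVICE ALERT: HOST1;SERVICE_1;OK;HARD;1;OK: Resetting check after 2 hours",
    "[1569278448] SERVICE ALERT: HOST1;SERVICE_1;OK;HARD;1;OK: Resetting check after 2 hours",
    "[1569287213] SERVICE ALERT: HOST1;SERVICE_4;OK;HARD;1;OK: Resetting check after 2 hours",
    "[1569289004] SERVICE ALERT: HOST1;SERVICE_4;OK;HARD;1;OK: Resetting check after 2 hours",
]

gaps = reset_gaps(sample)
for (host, service), values in early_resets(gaps).items():
    print(host, service, values)
```

For these sample lines the SERVICE_1 gaps come out at 298 and 299 seconds, roughly five minutes, which looks more like a regular check interval than a 7200-second freshness timeout.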

Here is the config for Service 4 (host and service names redacted):
Service 4.jpg
Any assistance figuring out why we are resetting the check so often would be appreciated.
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Check Freshness running early

Post by lmiltchev »

According to our documentation, under the "Check Settings" tab, you need to disable active checks, and enable passive checks for the service.

https://assets.nagios.com/downloads/nag ... ios-XI.pdf

From your screenshot, it is not really clear whether the service is configured properly, as you selected the "Skip" option... and we haven't seen the template that is in use.
example01.PNG
Can you verify that your active checks are disabled and passive checks enabled for your service? Also, make sure that your PHP and system time are not out of sync:

Admin > System Config > System Profile > View System Info > Date/Time

and that you don't have multiple nagios processes running on your server:

Code:

ps -ef | grep nagios.cfg | grep -v grep
Be sure to check out our Knowledgebase for helpful articles and solutions!
hbouma
Posts: 483
Joined: Tue Feb 27, 2018 9:31 am

Re: Check Freshness running early

Post by hbouma »

The service was created using the passive check wizard in XI. Here is the configuration from the config file:

Code:

define service {
    host_name               XXXXXXXXXXXXXXXXXX
    service_description      XXXXXXXXXXXXXXXXXX
    use                      xiwizard_passive_service
    servicegroups            XXXXXXXXXXXXXXXX
    check_command            check_dummy!0 "Resetting check after 2 hours"!!!!!!!
    max_check_attempts       1
    check_period             xi_timeperiod_24x7
    check_freshness          1
    freshness_threshold      7200
    event_handler            XXXXXXXXXXXXXXXXXXXXXXX
    notification_interval    120
    notification_period      xi_timeperiod_24x7
    notifications_enabled    1
    contact_groups           XXXXXXXXXXXXXXXXXXXXXXXXX
    _xiwizard                passivecheck
    register                 1
}
I have checked the date and time:
PHP Time: Tue, 24 Sep 2019 15:39:06 -0400
System Time: Tue, 24 Sep 2019 15:39:06 -0400

It does appear that 2 copies of Nagios are running. When I run a systemctl restart nagios, it always starts a second copy. I have killed the PIDs and then run the start, and it still always runs 2 copies. This appears to be the case for all 9 of my Nagios XI servers:

Code:

$ ps -ef | grep nagios.cfg | grep -v grep
nagios   29289     1  0 09:22 ?        00:00:12 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios   29313 29289  0 09:22 ?        00:00:01 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
Active checks do show as enabled, but they have no timeframe for when to run:
2019-09-24 15_45_42-Nagios XI.png
hbouma
Posts: 483
Joined: Tue Feb 27, 2018 9:31 am

Re: Check Freshness running early

Post by hbouma »

The odd thing is that it doesn't always reset after the same number of minutes. Here is the service history for one of the checks.
Report.png
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Check Freshness running early

Post by lmiltchev »

"It does appear that 2 copies of Nagios are running."
Actually, this is normal... you don't have multiple Nagios instances running. One is a child process of the other - look at the PID and parent PID columns (29313 has 29289 as its parent):
example01.PNG
Having said that, it would be easier to troubleshoot the issue if we had some more information. Can you PM me (or any other member of the Nagios Support team) your latest profile (Admin > System Profile > Download Profile), the name of the service in question, and the host it is attached to? Thank you!
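
On the two-process question, the parent/child relationship can also be read straight from the ps output posted earlier. Here is a rough Python sketch, assuming the standard `ps -ef` column order (UID, PID, PPID, ...). `independent_instances` is a hypothetical helper name, not a Nagios tool. It returns the PIDs whose parent is not itself in the list, so a single entry suggests one Nagios instance plus its child.

```python
def independent_instances(ps_lines):
    """Return PIDs whose parent PID is not among the listed processes."""
    procs = []
    for line in ps_lines:
        # UID PID PPID C STIME TTY TIME CMD (CMD may contain spaces)
        fields = line.split(None, 7)
        if len(fields) < 8 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue  # skip headers or malformed lines
        procs.append((int(fields[1]), int(fields[2])))
    listed = {pid for pid, _ in procs}
    return [pid for pid, ppid in procs if ppid not in listed]


ps_output = [
    "nagios   29289     1  0 09:22 ?        00:00:12 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg",
    "nagios   29313 29289  0 09:22 ?        00:00:01 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg",
]

roots = independent_instances(ps_output)
print(roots)
```

If this list ever held more than one PID, that would be a hint that two separate Nagios daemons were started.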
hbouma
Posts: 483
Joined: Tue Feb 27, 2018 9:31 am

Re: Check Freshness running early

Post by hbouma »

Please provide the command line to build the profile. I get the following from the GUI:


PROFILE BUILD FAILED
Array
(
)
CODE: 1
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Check Freshness running early

Post by lmiltchev »

hbouma
Posts: 483
Joined: Tue Feb 27, 2018 9:31 am

Re: Check Freshness running early

Post by hbouma »

PM Sent.
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Check Freshness running early

Post by lmiltchev »

Replied via PM.
hbouma
Posts: 483
Joined: Tue Feb 27, 2018 9:31 am

Re: Check Freshness running early

Post by hbouma »

In case anyone else runs into the same issue, we found the problem and the fix.

As part of our maintenance procedure, we had been using the Nagios external commands DISABLE_HOST_SVC_CHECKS and ENABLE_HOST_SVC_CHECKS to disable all checks on the server and then enable them again when we were done. The enable step turned active checks back on for the passive services, even though the configuration files had "active_checks_enabled 0".

This can be checked by going to the advanced section of each check and seeing whether active checks are enabled. When we turned off active checks, everything worked properly again.
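
For a maintenance script, one way to avoid this is to follow the enable command with an explicit DISABLE_SVC_CHECK for each passive-only service. Below is a minimal Python sketch, assuming the command file lives at /usr/local/nagios/var/rw/nagios.cmd (verify against command_file in your nagios.cfg). `maintenance_end_commands` and `submit` are hypothetical helper names, and the host and service names are placeholders.

```python
import time

# Assumed default location; check command_file in nagios.cfg.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def maintenance_end_commands(host, passive_services, now=None):
    """Build external command lines to run at the end of maintenance.

    Re-enables active checks for every service on the host, then turns
    them off again for the services that should stay passive-only.
    """
    stamp = int(time.time()) if now is None else int(now)
    commands = ["[%d] ENABLE_HOST_SVC_CHECKS;%s" % (stamp, host)]
    for service in passive_services:
        commands.append("[%d] DISABLE_SVC_CHECK;%s;%s" % (stamp, host, service))
    return commands


def submit(commands, command_file=COMMAND_FILE):
    """Write each command to the Nagios external command file."""
    with open(command_file, "w") as pipe:
        for command in commands:
            pipe.write(command + "\n")


# Build (but do not send) the commands for one host.
commands = maintenance_end_commands("HOST1", ["SERVICE_4"], now=1569300000)
for command in commands:
    print(command)
```

Calling `submit(commands)` on the Nagios server would send them. The list of passive-only services has to be maintained by hand in this sketch, since the external command itself does not consult the configuration files.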