Getting check results for service are stale by

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Nabi
Posts: 18
Joined: Thu Apr 12, 2018 7:13 am

Post by Nabi »

Hello,

We have two Nagios servers (with the same configuration parameters) that monitor many servers. However, on the primary Nagios server we see "Check results for service ... are stale by xxx" for just some of the monitored servers, while on the secondary there is no issue at all.

Example of the debug log...

[Tue Apr 3 09:59:59 2018.820447] [016.1] [pid=15267] Check results for service 'Check_cpu_host' on host 'xxx' are stale by 0d 0h 0m 58s (threshold=0d 0h 12m 0s). Forcing an immediate check of the service...


Could you please advise ...
:)

Thanks
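
For anyone triaging the same message: the debug log line quoted in the post above carries the service, the host, how stale the result is, and the freshness threshold. Below is a minimal shell sketch, assuming every entry follows the format of that one example line, that pulls those fields out so the affected hosts can be counted. The function name `stale_summary` and the `|` separator are illustrative choices, not part of Nagios.

```shell
#!/bin/sh
# Print host|service|stale-by|threshold for each "are stale by" line.
# Assumption: lines look like the single example quoted in this thread.
# Reads the named files, or standard input when no file is given.
stale_summary() {
    sed -n "s/.*Check results for service '\([^']*\)' on host '\([^']*\)' are stale by \(.*\) (threshold=\([^)]*\)).*/\2|\1|\3|\4/p" "$@"
}
```

A possible use, with `$DEBUG_LOG` standing in for wherever your debug file lives: `stale_summary "$DEBUG_LOG" | cut -d'|' -f1,2 | sort | uniq -c | sort -rn` lists the host and service pairs that went stale most often first.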
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Getting check results for service are stale by

Post by npolovenko »

Hello, @Nabi.
I would like to take a look at your system profile to tell what's going on.
To send us your system profile, log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" Menu
Click the "Download Profile" button
Save the profile.zip file, upload it to a cloud storage of your choice and share a download link with me via private message.
After that please post something in this thread to bring it back up in the support queue.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Nabi
Posts: 18
Joined: Thu Apr 12, 2018 7:13 am

Re: Getting check results for service are stale by

Post by Nabi »

Hello,

I am not able to send you a private message, as it seems I am new to this forum. Is there another way to share the files you asked for?

Thanks,

Nabi
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Getting check results for service are stale by

Post by npolovenko »

@Nabi, I think you can send it right now since 2 posts are the requirement. You can also upload the file to the thread but keep in mind that other users will be able to see it as well.

A profile was received and shared with the support team.
Nabi
Posts: 18
Joined: Thu Apr 12, 2018 7:13 am

Re: Getting check results for service are stale by

Post by Nabi »

Thanks,
I sent you the files...

Thanks,

Nabi
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Getting check results for service are stale by

Post by npolovenko »

@Nabi, Thank you. Please run the following commands:

service nagios stop
killall -9 nagios
service nagios start
And then:

/usr/local/nagiosxi/scripts/repair_databases.sh
Also, please upload the nagios.log file:

/usr/local/nagios/var/nagios.log
Nabi
Posts: 18
Joined: Thu Apr 12, 2018 7:13 am

Re: Getting check results for service are stale by

Post by Nabi »

I would just like to mention again:

The "passive check was not received" issue happens only on the primary Nagios server and only for a few devices; the other devices are OK.
The secondary Nagios server does not show any issue for any device.


The issue happens on the primary Nagios server at random times. Moreover, it clears after a few minutes.

So I am not sure whether what you asked for would be helpful here, or whether you want me to wait until the issue happens again and send you a debug log file.

Example:

Primary_NAGIOS: zgrep -i USC /usr/local/nagiosxi/cpe_logs/uebNagios.log.20180418_235903.gz
2018-04-18 18:43:21 Nagios alarm for USC Check_passive_nagios CRITICAL
2018-04-18 18:45:22 Nagios alarm for USC Check_passive_nagios OK


Secondary NAGIOS: zgrep -i USC /usr/local/nagiosxi/cpe_logs/uebNagios.log.20180418*



=====================

Thanks,

Nabi
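
To see whether the same few devices keep tripping the passive-check alarm, the alarm log quoted in the post above can be tallied per host and service. This is only a sketch: it assumes every line has the same whitespace-separated layout as the two example lines (date, time, "Nagios alarm for", host, service, state) and that host and service names contain no spaces. `count_critical` is a made-up helper name.

```shell
#!/bin/sh
# Count CRITICAL alarm lines per "host service" pair, busiest first.
# Assumed layout: DATE TIME Nagios alarm for HOST SERVICE STATE
count_critical() {
    awk '$NF == "CRITICAL" { n[$6 " " $7]++ }
         END { for (k in n) print n[k], k }' "$@" | sort -rn
}
```

For a compressed log, something like `zcat /usr/local/nagiosxi/cpe_logs/uebNagios.log.20180418_235903.gz | count_critical` would feed it; running the same command on both servers gives a side-by-side comparison.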
kyang

Re: Getting check results for service are stale by

Post by kyang »

There were some database errors in your XI, which is why those commands were suggested before.

If you could run the commands that were provided and post the output, that would be great.

Please also include the nagios.log file as an attachment. You can send it to either me or @npolovenko.

Also, what are you using to send passive checks? NCPA? NRDP? NSCA?
Nabi
Posts: 18
Joined: Thu Apr 12, 2018 7:13 am

Re: Getting check results for service are stale by

Post by Nabi »

Hello,

We are using NRDP to send the passive checks.

I am not sure whether there is a way other than restarting Nagios on this server; it is a production Nagios server and we cannot disrupt it much.

However, you say there are some database errors. Can these errors affect just the same few devices that send passive checks and leave the others unaffected?

Also, do you see the same DB errors on the secondary Nagios server? On the secondary there is no issue with passive checks at all.


However, please let me know if you have a way other than restarting Nagios to debug this issue. If there is not, then I will restart it.


Thanks,

Nabi
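
Since the checks arrive over NRDP, one non-disruptive test is to submit a single passive result by hand to each server and watch whether the primary accepts it. The sketch below is an assumption-laden illustration rather than the official client: the XML layout follows the format used by the stock send_nrdp scripts as I understand it, `NRDP_URL` and `NRDP_TOKEN` are placeholders you must set yourself, and the values are not XML-escaped, so keep the output text plain.

```shell
#!/bin/sh
# Build the XML body for one passive service result.
# Arguments: host, service, state (0=OK 1=WARNING 2=CRITICAL 3=UNKNOWN), output text.
nrdp_payload() {
    printf "<?xml version='1.0'?>\n<checkresults>\n<checkresult type='service' checktype='1'>\n<hostname>%s</hostname>\n<servicename>%s</servicename>\n<state>%s</state>\n<output>%s</output>\n</checkresult>\n</checkresults>\n" "$1" "$2" "$3" "$4"
}

# Post the payload to the NRDP endpoint.
# NRDP_URL and NRDP_TOKEN are placeholders, not values from this thread.
nrdp_submit() {
    curl -sS --data-urlencode "token=$NRDP_TOKEN" \
         --data "cmd=submitcheck" \
         --data-urlencode "XMLDATA=$(nrdp_payload "$@")" \
         "$NRDP_URL"
}
```

It is safest to point this at a dedicated test service rather than a live one, for example `nrdp_submit testhost test_passive 0 "manual test"`, and then check on each server whether the result shows up.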
kyang

Re: Getting check results for service are stale by

Post by kyang »

Actually, there was no database log file in profile_DR.zip.

Could you send us the database log from the working server?

Also, please send /usr/local/nagios/var/nagios.log from both servers.

You are using older versions of Nagios XI. Could you tell me what version of NRDP you are using?

grep product_version /usr/local/nrdp/server/config.inc.php
Locked