Passive check freshness check not working

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
snapon_admin
Posts: 952
Joined: Mon Jun 10, 2013 10:39 am
Location: Kenosha, WI

Re: Passive check freshness check not working

Post by snapon_admin »

K, just for some further visual reference this is what I see when the output is actually updated:
broken freshness2.png
And this is less than a minute later:
broken freshness3.png
The top 2 stay green because the configs on those are slightly different as I've been troubleshooting this. All the other checks have a check interval of 1 and check freshness is set to on.
avandemore
Posts: 1597
Joined: Tue Sep 27, 2016 4:57 pm

Post by avandemore »

I don't understand how this could have been working for you before.

Code:

0 data packet(s) sent to host successfully.
That means exactly what it says: no packets were sent. The format NSCA will accept is very specific:
The format for a service check packet using NSCA is
<hostname>[tab]<svc_description>[tab]<return_code>[tab]<plugin_output>[newline].
So create a plain text file named test containing the following line, with a tab character (not spaces) between the four fields:
localhost TestMessage 0 This is a test message
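Since editors often turn tabs into spaces, one way to write such a line is with printf. This is only a sketch: the helper name nsca_line is made up for illustration, and the send_nsca paths in the comment are the ones used elsewhere in this post, so adjust them for your system.

```shell
#!/bin/bash
# Hypothetical helper (the name is an assumption): print one NSCA service
# check line with the four fields separated by tab characters.
nsca_line() {
    # $1 host name, $2 service description, $3 return code, $4 plugin output
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Print the test line to stdout.
nsca_line localhost TestMessage 0 'This is a test message'

# To send it instead of printing it, pipe it to send_nsca, e.g.:
#   nsca_line localhost TestMessage 0 'This is a test message' |
#       /usr/local/nagios/libexec/send_nsca -H localhost -c /usr/local/nagios/etc/send_nsca.cfg
```

Redirecting the output to a file gives you the test file without typing the tabs by hand.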
A properly formatted send_nsca will look like this:

Code:

[root@avandemore-centos6 nagiosxi]# /usr/local/nagios/libexec/send_nsca -H localhost -c /usr/local/nagios/etc/send_nsca.cfg < /root/test
1 data packet(s) sent to host successfully.
Where:

Code:

[root@avandemore-centos6 nagiosxi]# cat /root/test
localhost       Passive Service 2       test message
 
Results in this:
Previous Nagios employee

Post by snapon_admin »

That was when my Unix admin just did a basic test. When we run the actual script that we're using, packets ARE sent:

Code:

+ /opt/csw/libexec/nagios-plugins/send_nsca -H 10.245.128.172 -p 5667 -d ';' -c /vendor/nagios/etc/opt/csw/send_nsca.cfg
+ echo 'lisdbms13p on lisprod04g;GoldenGate Processes;0;GoldenGate process OK on pseodb01'
1 data packet(s) sent to host successfully. 
The problem is that less than a minute later the status of the check changes again, either to the warning that is triggered by the freshness check or to the "no data received" default that new passive checks display. This was working fine for a year before the day it suddenly stopped.

Post by avandemore »

It works correctly on my end using these settings:
passive1.png
passive2.png
What are the contents of the file $USER1$/freshness.sh which is used by the service?

Make sure whatever is supposed to be sending the data is actually sending the data. Normally this would be in the cron log for a UNIX type system.

Also note that there is a short delay between the reception of a passive check and its display in XI or Core, usually a minute or two.
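On the Nagios side, one way to see whether results are arriving is to filter the main log for passive check entries. This is a sketch under assumptions: it relies on log_passive_checks being enabled in nagios.cfg, the log path in the comment is the usual source-install location and may differ on your system, and the function name passive_results is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: read Nagios log lines on stdin and keep only the
# passive result entries for one host/service pair.
passive_results() {
    # $1 host name, $2 service description
    # Nagios logs these as "PASSIVE SERVICE CHECK: host;service;code;output"
    grep -F "PASSIVE SERVICE CHECK: $1;$2;"
}

# Example (log path is an assumption; host and service as named in the sender's script):
#   passive_results 'lisdbms13p on lisprod04g' 'GoldenGate Processes' \
#       < /usr/local/nagios/var/nagios.log | tail
```

Each matching line starts with the epoch time in square brackets, so the gaps between results can be compared against the freshness threshold.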

Post by snapon_admin »

Those are the settings I originally had. The freshness.sh script is extremely basic: all it does is change the status to Warning and output a message telling Operations to follow their instructions. The script isn't causing the issue; this happens even when the script is removed from the equation entirely. And yes, data is being sent. That's why the status changes occasionally. I just don't know why it keeps changing back less than a minute later.

Post by avandemore »

What are the contents of the file $USER1$/freshness.sh which is used by the service?

Post by snapon_admin »

Code:

[root@lisl-ngos-01-pv etc]# cat /usr/local/nagios/libexec/freshness.sh
#!/bin/bash

echo "Warning - No Passive check results recieved in an hour. Please follow instructions in guide."

exit 1

Post by avandemore »

Because we cannot replicate this issue here with these parameters:
  • the freshness threshold is honored as detailed here: Host and Service Freshness Checks
  • your check is actively indicating no check was received
  • there is incomplete understanding of when checks are being sent remotely
  • there have been no substantial changes to the code base in question for years
I suggest altering the sending system to log when traps are sent. I can assist with that, but I need to know exactly how it's sent on the remote side.
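As a rough illustration of the comparison behind a freshness check: a result counts as stale once its age exceeds the threshold. The sketch below is simplified (Nagios itself accounts for more than this), the function name is_stale is made up, and the 3600-second threshold is an assumption taken from the "in an hour" wording of the freshness.sh message.

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit 0) when the last result is older than
# the threshold.
is_stale() {
    # $1 epoch of last result, $2 threshold in seconds, $3 current epoch
    [ $(( $3 - $1 )) -gt "$2" ]
}

threshold=3600   # assumed threshold, in seconds
interval=900     # results sent every 15 minutes

# A result that is one send interval old is still within the threshold.
if is_stale 0 "$threshold" "$interval"; then
    echo "stale"
else
    echo "fresh"
fi
```

With these numbers a result only goes stale if several sends in a row fail to arrive, which is why the timing of the sends matters.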

Post by snapon_admin »

avandemore wrote:your check is actively indicating no check was received
It's also actively indicating that checks ARE received, so I'm not sure what this is supposed to mean. The issue isn't that the checks aren't being received, it's that the status changes immediately after the check is received.
avandemore wrote:there is incomplete understanding of when checks are being sent remotely
Every 15 minutes, as mentioned earlier. There is a job that is run from a Control-M server that runs the script, and this is run every 15 minutes.
avandemore wrote:there have been no substantial changes to the code base in question for years
I understand that; that's why I posted here, because this is a very confusing issue. No changes have been made to Nagios and no changes have been made to the host in question. It simply...stopped working one day.

Post by avandemore »

snapon_admin wrote:It's also actively indicating that checks ARE received, so I'm not sure what this is supposed to mean.
It means I have yet to see any evidence Nagios is not acting properly. You are saying the checks are getting to Nagios. I'm asking you to demonstrate it. The only way I can replicate this behavior is to not have checks come in during the freshness threshold.
snapon_admin wrote:There is a job that is run from a Control-M server that runs the script, and this is run every 15 minutes.
Can you modify the pieces in question (on the remote system) to generate a /tmp/$TIMESTAMP file each time they run? Or perhaps you already have this information in the cron log. If so, please provide it.
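Something along these lines would do it. This is a sketch, not a drop-in replacement for your script: the function name log_send, the LOG_DIR variable, and the log and marker file names are all made up for illustration.

```shell
#!/bin/bash
# Hypothetical wrapper: send one check line through the given command and
# record when it ran, its exit status, and what it printed.
LOG_DIR="${LOG_DIR:-/tmp}"   # assumed location for the log and marker files

log_send() {
    # $1 is the line to send; the remaining arguments are the sender
    # command and its options.
    local line="$1"
    shift
    local stamp result rc
    stamp="$(date +%Y%m%d-%H%M%S)"
    result="$(printf '%s\n' "$line" | "$@" 2>&1)"
    rc=$?
    # One log line per run, plus an empty marker file named after the time.
    printf '%s rc=%s %s\n' "$stamp" "$rc" "$result" >> "$LOG_DIR/send_nsca.log"
    touch "$LOG_DIR/nsca-sent-$stamp"
    return "$rc"
}

# Example, using the command from the earlier post:
#   log_send 'lisdbms13p on lisprod04g;GoldenGate Processes;0;GoldenGate process OK on pseodb01' \
#       /opt/csw/libexec/nagios-plugins/send_nsca -H 10.245.128.172 -p 5667 -d ';' \
#       -c /vendor/nagios/etc/opt/csw/send_nsca.cfg
```

Comparing the timestamps in that log with what Nagios reports receiving would show whether every 15-minute run is actually sending a packet.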