avandemore wrote: It means I have yet to see any evidence Nagios is not acting properly. You are saying the checks are getting to Nagios. I'm asking you to demonstrate it. The only way I can replicate this behavior is to not have checks come in during the freshness threshold.
I posted screenshots of the check being updated in Nagios and then changing status again a minute later...
Passive check freshness check not working
- snapon_admin
- Posts: 952
- Joined: Mon Jun 10, 2013 10:39 am
- Location: Kenosha, WI
Re: Passive check freshness check not working
- avandemore
- Posts: 1597
- Joined: Tue Sep 27, 2016 4:57 pm
Re: Passive check freshness check not working
So then the next step would be to post logs from the sending system during that time period showing a result was successfully sent.
You can also provide nsca's log with debug on (example of a working check):
Code: Select all
Dec 15 12:20:13 avandemore-centos7 xinetd[32502]: START: nsca pid=3574 from=::ffff:127.0.0.1
Dec 15 12:20:13 avandemore-centos7 nsca[3574]: Handling the connection...
Dec 15 12:20:13 avandemore-centos7 nsca[3574]: Time difference in packet: 0 seconds for host localhost
Dec 15 12:20:13 avandemore-centos7 nsca[3574]: SERVICE CHECK -> Host Name: 'localhost', Service Description: 'Passive Service', Return Code: '1', Output: 'Warning'
Dec 15 12:20:13 avandemore-centos7 nsca[3574]: Attempting to write to nagios command pipe
Dec 15 12:20:13 avandemore-centos7 nsca[3574]: End of connection...
Dec 15 12:20:13 avandemore-centos7 nagios: SERVICE ALERT: localhost;Passive Service;WARNING;HARD;1;Warning
Dec 15 12:20:13 avandemore-centos7 nagios: SERVICE NOTIFICATION: nagiosadmin;localhost;Passive Service;WARNING;xi_service_notification_handler;Warning
Dec 15 12:20:13 avandemore-centos7 xinetd[32502]: EXIT: nsca status=0 pid=3574 duration=0(sec)
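To make a debug log like the one above easier to compare against the sending side, the "SERVICE CHECK ->" lines can be pulled out with a small script. This is only a sketch: the regular expression is written against the line format shown in this thread, and the function name `received_results` is made up for illustration.

```python
import re

# Matches the "SERVICE CHECK ->" lines nsca writes with debug on,
# assuming the exact layout shown in the log excerpt in this thread.
SERVICE_CHECK = re.compile(
    r"nsca\[(?P<pid>\d+)\]: SERVICE CHECK -> "
    r"Host Name: '(?P<host>[^']*)', "
    r"Service Description: '(?P<service>[^']*)', "
    r"Return Code: '(?P<code>\d+)', "
    r"Output: '(?P<output>.*)'$"
)


def received_results(log_lines):
    """Return (host, service, return_code, output) for each result nsca logged."""
    results = []
    for line in log_lines:
        match = SERVICE_CHECK.search(line.rstrip())
        if match:
            results.append(
                (match["host"], match["service"], int(match["code"]), match["output"])
            )
    return results


# Two lines copied from the working example above.
sample = [
    "Dec 15 12:20:13 avandemore-centos7 nsca[3574]: Handling the connection...",
    "Dec 15 12:20:13 avandemore-centos7 nsca[3574]: SERVICE CHECK -> "
    "Host Name: 'localhost', Service Description: 'Passive Service', "
    "Return Code: '1', Output: 'Warning'",
]
print(received_results(sample))
```

Running the same function over the server's syslog for the problem window would list which host and service pairs actually had results arrive, which can then be matched against what the sending system says it sent.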
Previous Nagios employee
- snapon_admin
- Posts: 952
- Joined: Mon Jun 10, 2013 10:39 am
- Location: Kenosha, WI
Re: Passive check freshness check not working
I will try to get that from my Unix admins. Not sure why it's needed since the logs from the Nagios server show that it's receiving passive checks but I'll see what they can do...
Re: Passive check freshness check not working
Could you post your status.dat file so we can view the settings?
It will have more details about when the checks are running and about the system status, which may help debug this issue.
Code: Select all
/usr/local/nagios/var/status.dat
Be sure to check out our Knowledgebase for helpful articles and solutions!
- snapon_admin
- Posts: 952
- Joined: Mon Jun 10, 2013 10:39 am
- Location: Kenosha, WI
Re: Passive check freshness check not working
I can't seem to post it here or in a PM. I'm not getting any errors but I think your upload limit is like 2MB? This file is 19MB.
Re: Passive check freshness check not working
If you ZIP it, will it be small enough?
- snapon_admin
- Posts: 952
- Joined: Mon Jun 10, 2013 10:39 am
- Location: Kenosha, WI
Re: Passive check freshness check not working
For some stupid reason I thought zipped files weren't allowed...derp. Attached.
- avandemore
- Posts: 1597
- Joined: Tue Sep 27, 2016 4:57 pm
Re: Passive check freshness check not working
snapon_admin wrote: Not sure why it's needed since the logs from the Nagios server show that it's receiving passive checks but I'll see what they can do...
Can you cite this information? Up to this point, all we've confirmed is that the active checks run when the passive indicates failure per the freshness_threshold.
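For anyone following along, the freshness idea being discussed can be pictured with a very simplified sketch: a passive result counts as stale once it is older than the freshness threshold, and that is what triggers the active check. This is not Nagios Core's actual scheduler logic, which is more involved; the function name, timestamps, and the 3600-second threshold below are all assumptions for illustration.

```python
def is_stale(last_result_time, now, freshness_threshold):
    """True when the newest result is older than the threshold (all in seconds)."""
    return (now - last_result_time) > freshness_threshold


# Hypothetical values: "now" and the last-result times are made up.
now = 10_000
threshold = 3600  # assumed one-hour freshness_threshold
last_results = {
    "Passive Service": now - 120,         # result arrived two minutes ago
    "GoldenGate Processes": now - 4000,   # no result for over an hour
}

stale = sorted(
    name for name, t in last_results.items() if is_stale(t, now, threshold)
)
print(stale)
```

The point of the sketch is only that a service going stale means no result was recorded inside the window, which is why proof of delivery for that specific service and time period matters.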
- snapon_admin
- Posts: 952
- Joined: Mon Jun 10, 2013 10:39 am
- Location: Kenosha, WI
Re: Passive check freshness check not working
snapon_admin wrote:
Not sure if any of this helps:
And when I run that other command it just gives the PID and says it's running.
Code: Select all
Dec 9 10:20:57 lisl-ngos-01-pv nagios: SERVICE ALERT: lisdbqy02p on lisprod04g;GoldenGate Processes;WARNING;HARD;1;Warning - No Passive check results recieved in an hour. Please follow instructions in guide.
Dec 9 10:20:58 lisl-ngos-01-pv nagios: HOST ALERT: SOCC-ROTR-MPLS;UP;HARD;1;OK - 10.93.255.1: rta 24.708ms, lost 0%
Dec 9 10:20:59 lisl-ngos-01-pv nagios: HOST ALERT: ARAN-SGDC-01-PV;UP;HARD;1;OK - ARAN-SGDC-01-PV.snapon.com: rta 126.079ms, lost 0%
Dec 9 10:20:59 lisl-ngos-01-pv nagios: HOST ALERT: KING-FRWL;UP;HARD;1;OK - 10.160.250.1: rta 104.487ms, lost 0%
Dec 9 10:21:03 lisl-ngos-01-pv xinetd[4608]: START: nsca pid=20350 from=::ffff:10.245.64.33
Dec 9 10:21:03 lisl-ngos-01-pv nsca[20350]: Handling the connection...
Dec 9 10:21:03 lisl-ngos-01-pv xinetd[4608]: START: nsca pid=20358 from=::ffff:10.245.64.45
Dec 9 10:21:03 lisl-ngos-01-pv nsca[20358]: Handling the connection...
Dec 9 10:21:03 lisl-ngos-01-pv xinetd[4608]: START: nsca pid=20360 from=::ffff:10.245.64.49
Dec 9 10:21:03 lisl-ngos-01-pv xinetd[4608]: START: nsca pid=20361 from=::ffff:10.245.64.3
Dec 9 10:21:03 lisl-ngos-01-pv nsca[20360]: Handling the connection...
Dec 9 10:21:03 lisl-ngos-01-pv nsca[20361]: Handling the connection...
Dec 9 10:21:03 lisl-ngos-01-pv xinetd[4608]: START: nsca pid=20362 from=::ffff:10.245.64.21
Dec 9 10:21:03 lisl-ngos-01-pv nsca[20362]: Handling the connection...
Dec 9 10:21:03 lisl-ngos-01-pv xinetd[4608]: START: nsca pid=20413 from=::ffff:10.245.64.16
Dec 9 10:21:03 lisl-ngos-01-pv nsca[20413]: Handling the connection...
Dec 9 10:21:03 lisl-ngos-01-pv xinetd[4608]: START: nsca pid=20512 from=::ffff:10.245.64.37
Dec 9 10:21:03 lisl-ngos-01-pv nsca[20512]: Handling the connection...
Dec 9 10:21:03 lisl-ngos-01-pv xinetd[4608]: START: nsca pid=20516 from=::ffff:10.245.64.38
Dec 9 10:21:03 lisl-ngos-01-pv nsca[20516]: Handling the connection...
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20350]: Time difference in packet: 0 seconds for host lishadb13p on lisprod02g
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20350]: SERVICE CHECK -> Host Name: 'lishadb13p on lisprod02g', Service Description: 'GoldenGate Processes', Return Code: '0', Output: 'GoldenGate process OK on pwmsdb13'
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20350]: Attempting to write to nagios command pipe
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20350]: End of connection...
Dec 9 10:21:04 lisl-ngos-01-pv xinetd[4608]: EXIT: nsca status=0 pid=20350 duration=1(sec)
Dec 9 10:21:04 lisl-ngos-01-pv nagios: SERVICE ALERT: lishadb13p on lisprod02g;GoldenGate Processes;OK;HARD;1;GoldenGate process OK on pwmsdb13
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20358]: Time difference in packet: 0 seconds for host lisdbms13p on lisprod04g
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20358]: SERVICE CHECK -> Host Name: 'lisdbms13p on lisprod04g', Service Description: 'GoldenGate Processes', Return Code: '0', Output: 'GoldenGate process OK on pseodb01'
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20358]: Attempting to write to nagios command pipe
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20358]: End of connection...
Dec 9 10:21:04 lisl-ngos-01-pv nagios: SERVICE ALERT: lisdbms13p on lisprod04g;GoldenGate Processes;OK;HARD;1;GoldenGate process OK on pseodb01
Dec 9 10:21:04 lisl-ngos-01-pv xinetd[4608]: EXIT: nsca status=0 pid=20358 duration=1(sec)
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20360]: Time difference in packet: 0 seconds for host lisdbqy13p on lisprod04g
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20360]: SERVICE CHECK -> Host Name: 'lisdbqy13p on lisprod04g', Service Description: 'GoldenGate Processes', Return Code: '0', Output: 'GoldenGate process OK on pwmsgg13'
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20360]: Attempting to write to nagios command pipe
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20360]: End of connection...
Dec 9 10:21:04 lisl-ngos-01-pv nagios: SERVICE ALERT: lisdbqy13p on lisprod04g;GoldenGate Processes;OK;HARD;1;GoldenGate process OK on pwmsgg13
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20361]: Time difference in packet: 0 seconds for host lisdbms14p on lisprod04g
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20361]: SERVICE CHECK -> Host Name: 'lisdbms14p on lisprod04g', Service Description: 'GoldenGate Processes', Return Code: '0', Output: 'GoldenGate process OK on pseogg01'
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20361]: Attempting to write to nagios command pipe
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20361]: End of connection...
Dec 9 10:21:04 lisl-ngos-01-pv xinetd[4608]: EXIT: nsca status=0 pid=20360 duration=1(sec)
Dec 9 10:21:04 lisl-ngos-01-pv nagios: SERVICE ALERT: lisdbms14p on lisprod04g;GoldenGate Processes;OK;HARD;1;GoldenGate process OK on pseogg01
Dec 9 10:21:04 lisl-ngos-01-pv xinetd[4608]: EXIT: nsca status=0 pid=20361 duration=1(sec)
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20362]: Time difference in packet: 0 seconds for host lisaerp01p on lisprod02g
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20362]: SERVICE CHECK -> Host Name: 'lisaerp01p on lisprod02g', Service Description: 'GoldenGate Processes', Return Code: '0', Output: 'GoldenGate process OK on perpdb01'
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20362]: Attempting to write to nagios command pipe
Maybe I'm mistaken, but isn't this log showing that Nagios is receiving the passive check results and updating properly?
Re: Passive check freshness check not working
I looked in the status.dat file at the host "lisdbms14p on lisprod04g" and its service check "GoldenGate Processes", taken from one of the examples in your previous screenshot. It shows that active checks are enabled, but your system profile says that active checks should be disabled. That mismatch is one of the causes of what you are seeing.
Can you log in to the XI interface, go to the Home > Service Details menu, find the above service, and open it up?
Then go to the Advanced tab and verify that Active Checks are disabled.
If they aren't, disable them and see if that makes the service check function the way you need.
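The same setting can also be read straight from status.dat. The sketch below assumes the usual layout of that file, with `servicestatus { ... }` blocks holding one `key=value` pair per line, and uses made-up helper names (`service_blocks`, `find_service`). The sample text is an illustrative excerpt, not the poster's actual file; only the active_checks_enabled=1 value reflects what was described in this post.

```python
def service_blocks(text):
    """Yield a dict of key=value pairs for each servicestatus block."""
    block = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("servicestatus") and line.endswith("{"):
            block = {}
        elif line == "}" and block is not None:
            yield block
            block = None
        elif block is not None and "=" in line:
            key, _, value = line.partition("=")
            block[key] = value


def find_service(text, host, service):
    """Return the matching servicestatus block as a dict, or None."""
    for block in service_blocks(text):
        if block.get("host_name") == host and block.get("service_description") == service:
            return block
    return None


# Illustrative excerpt only; passive_checks_enabled is an assumed value.
sample = """\
hoststatus {
\thost_name=lisdbms14p on lisprod04g
\t}

servicestatus {
\thost_name=lisdbms14p on lisprod04g
\tservice_description=GoldenGate Processes
\tactive_checks_enabled=1
\tpassive_checks_enabled=1
\t}
"""

svc = find_service(sample, "lisdbms14p on lisprod04g", "GoldenGate Processes")
print(svc["active_checks_enabled"], svc["passive_checks_enabled"])
```

Against the real file you would read /usr/local/nagios/var/status.dat into a string and pass it in place of `sample`; a value of 1 for active_checks_enabled would match what is described above.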