Re: Passive check freshness check not working
Posted: Thu Dec 15, 2016 2:21 am
by snapon_admin
avandemore wrote:
It means I have yet to see any evidence Nagios is not acting properly. You are saying the checks are getting to Nagios. I'm asking you to demonstrate it. The only way I can replicate this behavior is to not have checks come in during the freshness threshold.
I posted screen shots of the check being updated in Nagios and then changing status again a minute later...
Re: Passive check freshness check not working
Posted: Thu Dec 15, 2016 12:51 pm
by avandemore
So then the next step would be to post logs from the sending system during that time period showing a result was successfully sent.
You can also provide nsca's log with debug on (example of a working check):
Code:
Dec 15 12:20:13 avandemore-centos7 xinetd[32502]: START: nsca pid=3574 from=::ffff:127.0.0.1
Dec 15 12:20:13 avandemore-centos7 nsca[3574]: Handling the connection...
Dec 15 12:20:13 avandemore-centos7 nsca[3574]: Time difference in packet: 0 seconds for host localhost
Dec 15 12:20:13 avandemore-centos7 nsca[3574]: SERVICE CHECK -> Host Name: 'localhost', Service Description: 'Passive Service', Return Code: '1', Output: 'Warning'
Dec 15 12:20:13 avandemore-centos7 nsca[3574]: Attempting to write to nagios command pipe
Dec 15 12:20:13 avandemore-centos7 nsca[3574]: End of connection...
Dec 15 12:20:13 avandemore-centos7 nagios: SERVICE ALERT: localhost;Passive Service;WARNING;HARD;1;Warning
Dec 15 12:20:13 avandemore-centos7 nagios: SERVICE NOTIFICATION: nagiosadmin;localhost;Passive Service;WARNING;xi_service_notification_handler;Warning
Dec 15 12:20:13 avandemore-centos7 xinetd[32502]: EXIT: nsca status=0 pid=3574 duration=0(sec)
Re: Passive check freshness check not working
Posted: Thu Dec 15, 2016 2:43 pm
by snapon_admin
I will try to get that from my Unix admins. Not sure why it's needed since the logs from the Nagios server show that it's receiving passive checks but I'll see what they can do...
Re: Passive check freshness check not working
Posted: Thu Dec 15, 2016 4:33 pm
by tgriep
Could you post your status.dat file so we can view the settings?
It will have more details about when the checks are running and about the system status, which may help debug this issue.
Re: Passive check freshness check not working
Posted: Fri Dec 16, 2016 10:46 am
by snapon_admin
I can't seem to post it here or in a PM. I'm not getting any errors but I think your upload limit is like 2MB? This file is 19MB.
Re: Passive check freshness check not working
Posted: Fri Dec 16, 2016 11:12 am
by tgriep
If you ZIP it, will it be small enough?
Re: Passive check freshness check not working
Posted: Mon Dec 19, 2016 10:35 am
by snapon_admin
For some stupid reason I thought zipped files weren't allowed...derp. Attached.
Re: Passive check freshness check not working
Posted: Mon Dec 19, 2016 12:48 pm
by avandemore
snapon_admin wrote:
Not sure why it's needed since the logs from the Nagios server show that it's receiving passive checks but I'll see what they can do...
Can you cite this information? Up to this point, all we've confirmed is that the active checks run when the passive indicates failure per the freshness_threshold.
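To make the freshness rule concrete, here is a rough sketch in Python of the comparison being described. It is a simplification and not Nagios Core's actual code: the function name `is_stale` and the timestamps are made up for illustration, and the one-hour threshold is only assumed from the "in an hour" wording of the alert text quoted in this thread.

```python
from datetime import datetime, timedelta

def is_stale(last_result, now, freshness_threshold):
    """Return True when the newest result is older than the threshold (seconds).

    Simplified model: the real scheduler applies further adjustments
    that are not modeled here.
    """
    return now - last_result > timedelta(seconds=freshness_threshold)

last_result = datetime(2016, 12, 9, 9, 20, 0)  # hypothetical last passive result
threshold = 3600                               # assumed: one hour, in seconds

# 40 minutes after the last result: still fresh.
print(is_stale(last_result, datetime(2016, 12, 9, 10, 0, 0), threshold))
# Just over an hour after the last result: stale.
print(is_stale(last_result, datetime(2016, 12, 9, 10, 20, 57), threshold))
```

The first call prints False and the second prints True. If passive results really do arrive inside the threshold, a comparison like this should never come out stale, which is why logs from the sending side matter.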
Re: Passive check freshness check not working
Posted: Mon Dec 19, 2016 1:28 pm
by snapon_admin
snapon_admin wrote:
Not sure if any of this helps:
Code:
Dec 9 10:20:57 lisl-ngos-01-pv nagios: SERVICE ALERT: lisdbqy02p on lisprod04g;GoldenGate Processes;WARNING;HARD;1;Warning - No Passive check results recieved in an hour. Please follow instructions in guide.
Dec 9 10:20:58 lisl-ngos-01-pv nagios: HOST ALERT: SOCC-ROTR-MPLS;UP;HARD;1;OK - 10.93.255.1: rta 24.708ms, lost 0%
Dec 9 10:20:59 lisl-ngos-01-pv nagios: HOST ALERT: ARAN-SGDC-01-PV;UP;HARD;1;OK - ARAN-SGDC-01-PV.snapon.com: rta 126.079ms, lost 0%
Dec 9 10:20:59 lisl-ngos-01-pv nagios: HOST ALERT: KING-FRWL;UP;HARD;1;OK - 10.160.250.1: rta 104.487ms, lost 0%
Dec 9 10:21:03 lisl-ngos-01-pv xinetd[4608]: START: nsca pid=20350 from=::ffff:10.245.64.33
Dec 9 10:21:03 lisl-ngos-01-pv nsca[20350]: Handling the connection...
Dec 9 10:21:03 lisl-ngos-01-pv xinetd[4608]: START: nsca pid=20358 from=::ffff:10.245.64.45
Dec 9 10:21:03 lisl-ngos-01-pv nsca[20358]: Handling the connection...
Dec 9 10:21:03 lisl-ngos-01-pv xinetd[4608]: START: nsca pid=20360 from=::ffff:10.245.64.49
Dec 9 10:21:03 lisl-ngos-01-pv xinetd[4608]: START: nsca pid=20361 from=::ffff:10.245.64.3
Dec 9 10:21:03 lisl-ngos-01-pv nsca[20360]: Handling the connection...
Dec 9 10:21:03 lisl-ngos-01-pv nsca[20361]: Handling the connection...
Dec 9 10:21:03 lisl-ngos-01-pv xinetd[4608]: START: nsca pid=20362 from=::ffff:10.245.64.21
Dec 9 10:21:03 lisl-ngos-01-pv nsca[20362]: Handling the connection...
Dec 9 10:21:03 lisl-ngos-01-pv xinetd[4608]: START: nsca pid=20413 from=::ffff:10.245.64.16
Dec 9 10:21:03 lisl-ngos-01-pv nsca[20413]: Handling the connection...
Dec 9 10:21:03 lisl-ngos-01-pv xinetd[4608]: START: nsca pid=20512 from=::ffff:10.245.64.37
Dec 9 10:21:03 lisl-ngos-01-pv nsca[20512]: Handling the connection...
Dec 9 10:21:03 lisl-ngos-01-pv xinetd[4608]: START: nsca pid=20516 from=::ffff:10.245.64.38
Dec 9 10:21:03 lisl-ngos-01-pv nsca[20516]: Handling the connection...
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20350]: Time difference in packet: 0 seconds for host lishadb13p on lisprod02g
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20350]: SERVICE CHECK -> Host Name: 'lishadb13p on lisprod02g', Service Description: 'GoldenGate Processes', Return Code: '0', Output: 'GoldenGate process OK on pwmsdb13'
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20350]: Attempting to write to nagios command pipe
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20350]: End of connection...
Dec 9 10:21:04 lisl-ngos-01-pv xinetd[4608]: EXIT: nsca status=0 pid=20350 duration=1(sec)
Dec 9 10:21:04 lisl-ngos-01-pv nagios: SERVICE ALERT: lishadb13p on lisprod02g;GoldenGate Processes;OK;HARD;1;GoldenGate process OK on pwmsdb13
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20358]: Time difference in packet: 0 seconds for host lisdbms13p on lisprod04g
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20358]: SERVICE CHECK -> Host Name: 'lisdbms13p on lisprod04g', Service Description: 'GoldenGate Processes', Return Code: '0', Output: 'GoldenGate process OK on pseodb01'
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20358]: Attempting to write to nagios command pipe
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20358]: End of connection...
Dec 9 10:21:04 lisl-ngos-01-pv nagios: SERVICE ALERT: lisdbms13p on lisprod04g;GoldenGate Processes;OK;HARD;1;GoldenGate process OK on pseodb01
Dec 9 10:21:04 lisl-ngos-01-pv xinetd[4608]: EXIT: nsca status=0 pid=20358 duration=1(sec)
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20360]: Time difference in packet: 0 seconds for host lisdbqy13p on lisprod04g
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20360]: SERVICE CHECK -> Host Name: 'lisdbqy13p on lisprod04g', Service Description: 'GoldenGate Processes', Return Code: '0', Output: 'GoldenGate process OK on pwmsgg13'
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20360]: Attempting to write to nagios command pipe
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20360]: End of connection...
Dec 9 10:21:04 lisl-ngos-01-pv nagios: SERVICE ALERT: lisdbqy13p on lisprod04g;GoldenGate Processes;OK;HARD;1;GoldenGate process OK on pwmsgg13
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20361]: Time difference in packet: 0 seconds for host lisdbms14p on lisprod04g
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20361]: SERVICE CHECK -> Host Name: 'lisdbms14p on lisprod04g', Service Description: 'GoldenGate Processes', Return Code: '0', Output: 'GoldenGate process OK on pseogg01'
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20361]: Attempting to write to nagios command pipe
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20361]: End of connection...
Dec 9 10:21:04 lisl-ngos-01-pv xinetd[4608]: EXIT: nsca status=0 pid=20360 duration=1(sec)
Dec 9 10:21:04 lisl-ngos-01-pv nagios: SERVICE ALERT: lisdbms14p on lisprod04g;GoldenGate Processes;OK;HARD;1;GoldenGate process OK on pseogg01
Dec 9 10:21:04 lisl-ngos-01-pv xinetd[4608]: EXIT: nsca status=0 pid=20361 duration=1(sec)
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20362]: Time difference in packet: 0 seconds for host lisaerp01p on lisprod02g
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20362]: SERVICE CHECK -> Host Name: 'lisaerp01p on lisprod02g', Service Description: 'GoldenGate Processes', Return Code: '0', Output: 'GoldenGate process OK on perpdb01'
Dec 9 10:21:04 lisl-ngos-01-pv nsca[20362]: Attempting to write to nagios command pipe
And when I run that other command it just gives the PID and says it's running.
Maybe I'm mistaken but isn't this log showing that Nagios is receiving the passive check results and updating properly?
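One way to answer that question per service is to list which host/service pairs actually show up in the nsca debug lines. The sketch below is a minimal Python example, assuming the log lines follow the same "SERVICE CHECK ->" layout as the excerpts in this thread; the function name `received_results` is hypothetical and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Pattern modeled on the nsca debug lines quoted in this thread.
SERVICE_CHECK = re.compile(
    r"nsca\[\d+\]: SERVICE CHECK -> Host Name: '(?P<host>[^']*)', "
    r"Service Description: '(?P<service>[^']*)', "
    r"Return Code: '(?P<code>\d+)'"
)

def received_results(lines):
    """Count passive results per (host, service) seen in nsca debug output."""
    counts = Counter()
    for line in lines:
        match = SERVICE_CHECK.search(line)
        if match:
            counts[(match["host"], match["service"])] += 1
    return counts

sample = [
    "Dec 9 10:21:04 lisl-ngos-01-pv nsca[20350]: SERVICE CHECK -> Host Name: 'lishadb13p on lisprod02g', Service Description: 'GoldenGate Processes', Return Code: '0', Output: 'GoldenGate process OK on pwmsdb13'",
    "Dec 9 10:21:04 lisl-ngos-01-pv nsca[20350]: Attempting to write to nagios command pipe",
    "Dec 9 10:21:04 lisl-ngos-01-pv nsca[20358]: SERVICE CHECK -> Host Name: 'lisdbms13p on lisprod04g', Service Description: 'GoldenGate Processes', Return Code: '0', Output: 'GoldenGate process OK on pseodb01'",
]

for (host, service), n in received_results(sample).items():
    print(f"{host} / {service}: {n} result(s)")
```

Run against the full syslog for the freshness window, this would show whether the specific service that went stale appears in the list at all, rather than only showing that some passive results are arriving.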
Re: Passive check freshness check not working
Posted: Mon Dec 19, 2016 3:55 pm
by tgriep
I looked in the status.dat file at the host "lisdbms14p on lisprod04g" and its service check "GoldenGate Processes", one of the examples from your previous screenshot. It shows that active checks are enabled, but your system profile says that active checks should be disabled. That mismatch is one of the causes of what you are seeing.
Can you log in to the XI interface, go to the Home > Service Details menu, find the above service, and open it?
Then go to the Advanced tab and verify that Active Checks are disabled.
If they aren't, disable them and see whether the service check then behaves the way you need.
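For anyone who wants to check the same thing from the file instead of the UI, here is a small Python sketch that pulls one service's fields out of status.dat. It assumes the usual `servicestatus { key=value ... }` block layout; the function name `service_status` is hypothetical, and the sample text is illustrative only, not taken from the attached file.

```python
def service_status(lines, host, service):
    """Return the key=value fields of the matching servicestatus block, or None."""
    fields = None
    for raw in lines:
        line = raw.strip()
        if fields is None:
            # Look for the start of a servicestatus block.
            if line.startswith("servicestatus") and line.endswith("{"):
                fields = {}
            continue
        if line == "}":
            # End of block: return it if it is the service we want.
            if (fields.get("host_name"), fields.get("service_description")) == (host, service):
                return fields
            fields = None
            continue
        key, sep, value = line.partition("=")
        if sep:
            fields[key] = value
    return None

# Illustrative sample only; real blocks contain many more fields.
sample = """\
servicestatus {
\thost_name=lisdbms14p on lisprod04g
\tservice_description=GoldenGate Processes
\tactive_checks_enabled=1
\tpassive_checks_enabled=1
\t}
""".splitlines()

fields = service_status(sample, "lisdbms14p on lisprod04g", "GoldenGate Processes")
print("active_checks_enabled =", fields["active_checks_enabled"])
```

With a real file, pass the open file object as `lines`. A value of 1 for `active_checks_enabled` on a service that is meant to be passive-only would match what is described above.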