rkennedy wrote:For the record, which machine are we troubleshooting? It looks like you posted 3 separate nmap's.
I have a problem with all 3 machines.
rkennedy wrote:Which one have we been working through on this thread? We do like to keep topics organized, to keep things from getting off track. With that, we'll need to work through one at a time.
Which one is onlinecard_cdc1?
nmap 10.4.1.144 -p 12489
Starting Nmap 5.21 ( http://nmap.org ) at 2016-08-18 00:08 IRDT
mass_dns: warning: Unable to determine any DNS servers. Reverse DNS is disabled. Try using --system-dns or specify valid servers with --dns-servers
Nmap scan report for 10.4.1.144
Host is up (0.00085s latency).
PORT STATE SERVICE
12489/tcp filtered unknown
So, thanks.
bwallace wrote:... that last line - filtered means nmap's probes got no answer, which usually points to a firewall or packet filter blocking access (a port that is simply closed would normally be reported as closed). Since you've mentioned this occurs intermittently, could a FW be closing port 12489 periodically?
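To catch the moment the port stops answering, a small probe loop can log the port state with timestamps. This is only a rough sketch: the function names `probe_port` and `watch_port` are made up for illustration, it assumes bash and the coreutils `timeout` command are available on the Nagios server, and its three states only approximate what nmap reports.

```shell
#!/usr/bin/env bash
# Classify a TCP port roughly the way the nmap states read:
#   open     - the connection succeeded
#   closed   - the connection failed right away (usually "connection refused")
#   filtered - no answer before the timeout (often a firewall dropping packets)
# probe_port is a hypothetical helper name, not part of Nagios or nmap.
probe_port() {
    local host="$1" port="$2" wait_secs="${3:-3}"
    local status=0
    # bash's /dev/tcp pseudo-device attempts a plain TCP connect.
    timeout "$wait_secs" bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null || status=$?
    case "$status" in
        0)   echo "open" ;;
        124) echo "filtered" ;;   # timeout(1) exits with 124 when the time limit is hit
        *)   echo "closed" ;;     # refused, or another immediate error
    esac
}

# Print one timestamped result per interval so the log can be compared
# with the times of the failed checks. Stop it with Ctrl+C.
watch_port() {
    local host="$1" port="$2" interval="${3:-30}"
    while true; do
        echo "$(date '+%F %T') ${host}:${port} $(probe_port "$host" "$port")"
        sleep "$interval"
    done
}
```

For example, `watch_port 10.4.1.144 12489 30 >> port_12489.log` could be left running until the next failed check. If the log flips between open and filtered, that would support the firewall theory.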
I would run a tcpdump on the XI machine the next time this happens and time it so the tcpdump will capture one of these failed checks - this will confirm the theory above at least.
====================================================
TCPDUMP
*Have the Nagios XI UI up and ready*
SSH into your Nagios XI machine and start a tcpdump using this cmd:
tcpdump -s 0 -i any -w fileName.pcap
*If you get the error: "-bash: tcpdump: command not found" then install it with this cmd:
yum install tcpdump
+ once tcpdump is running go back to the Nagios UI and reproduce the issue
+ stop the tcpdump with Ctrl+C
The .pcap file will be written to whatever directory you issued the tcpdump command from. You can use something like WinSCP to retrieve the pcap file.
=====================================================
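A capture of every interface with no filter grows quickly, and a .pcap is a binary file, so it will not be readable in a text editor. One possible way to keep the file small and to read it back is sketched below. The host, port and file name are assumptions taken from the nmap example in this thread, and the function names are invented for illustration. Adjust them to your setup.

```shell
#!/usr/bin/env bash
# Assumed values for illustration: change them to the machine being checked.
TARGET_HOST="10.4.1.144"
TARGET_PORT="12489"
PCAP_FILE="check_nt_12489.pcap"

# Build a capture filter so only traffic to or from the monitored
# host on the NSClient++ port is written to the file.
capture_filter() {
    echo "host ${TARGET_HOST} and tcp port ${TARGET_PORT}"
}

# Same tcpdump options as above, plus the filter. Stop it with Ctrl+C.
start_capture() {
    tcpdump -s 0 -i any -w "${PCAP_FILE}" "$(capture_filter)"
}

# Print the saved packets as text; -nn skips name and port lookups.
read_capture() {
    tcpdump -nn -r "${PCAP_FILE}"
}
```

Run `start_capture`, reproduce the failed check, press Ctrl+C, then run `read_capture`. The .pcap file can also be opened in Wireshark after copying it off the server.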
Earlier in this thread another colleague of mine mentioned that something may be intermittently closing :12489, just as I mentioned in the previous post. I was pointing out the 'filtered' state in the nmap output for that one server since that is rather obvious. We could focus on that first and then move on to the other two; we can't troubleshoot all three at once. But the other two may be having port 12489 closed intermittently, or the servers themselves could be under high load at the time, preventing them from responding within the given timeout value, hence our suggestions to increase it.
I got the tcpdump file, but it is big (98 MB) and not readable when I open it. What do I have to do?
bwallace wrote:Sorry, I thought you were using XI. Yes, you can run a tcpdump on Core; the same instructions can be used.
So far, we have confirmed that the checks failing intermittently are check_nt, whereas check_nrpe is problem free. At this point it is worth noting that, according to the NSClient developer, check_nt is deprecated in favor of check_nrpe:
Check_nt is NOT a good protocol and is considered abandonware. NSClient++ supports it only for legacy reasons. There is generally no reason to use check_nt.