SNMP checks fails against Linux servers

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
vuservicedesk
Posts: 7
Joined: Tue Oct 11, 2016 5:41 am

Re: SNMP checks fails against Linux servers

Post by vuservicedesk »

You have multiple checks failing near the same times, and some aren't SNMP.
Can you be more specific?

netstat -s

Code:

Ip:
    64013402 total packets received
    0 forwarded
    0 incoming packets discarded
    56775356 incoming packets delivered
    61586948 requests sent out
    16 outgoing packets dropped
Icmp:
    5117098 ICMP messages received
    8029 input ICMP message failed.
    ICMP input histogram:
        destination unreachable: 60670
        echo requests: 52877
        echo replies: 5003551
    5279284 ICMP messages sent
    0 ICMP messages failed
    ICMP output histogram:
        destination unreachable: 136168
        echo request: 5090239
        echo replies: 52877
IcmpMsg:
        InType0: 5003551
        InType3: 60670
        InType8: 52877
        OutType0: 52877
        OutType3: 136168
        OutType8: 5090239
Tcp:
    6477172 active connections openings
    1596268 passive connection openings
    86465 failed connection attempts
    2262456 connection resets received
    9 connections established
    46911867 segments received
    53733379 segments send out
    177556 segments retransmited
    80 bad segments received.
    4014282 resets sent
Udp:
    5934752 packets received
    289 packets to unknown port received.
    0 packet receive errors
    5939729 packets sent
    0 receive buffer errors
    0 send buffer errors
UdpLite:
TcpExt:
    10 invalid SYN cookies received
    1951197 TCP sockets finished time wait in fast timer
    3 packets rejects in established connections because of timestamp
    102360 delayed acks sent
    912 delayed acks further delayed because of locked socket
    Quick ack mode was activated 2783 times
    1053361 packets directly queued to recvmsg prequeue.
    98486932 bytes directly in process context from backlog
    719282776 bytes directly received in process context from prequeue
    10181236 packet headers predicted
    449493 packets header predicted and directly queued to user
    12001005 acknowledgments not containing data payload received
    2302164 predicted acknowledgments
    18 times recovered from packet loss by selective acknowledgements
    1 congestion windows recovered without slow start by DSACK
    1645 congestion windows recovered without slow start after partial ack
    TCPLostRetransmit: 2
    19 timeouts after SACK recovery
    41 fast retransmits
    5 forward retransmits
    21 retransmits in slow start
    157163 other TCP timeouts
    TCPLossProbes: 158209
    TCPLossProbeRecovery: 91262
    1 SACK retransmits failed
    2784 DSACKs sent for old packets
    25 DSACKs sent for out of order packets
    94285 DSACKs received
    2 DSACKs for out of order packets received
    2283690 connections reset due to unexpected data
    2433 connections reset due to early user close
    27 connections aborted due to timeout
    TCPDSACKIgnoredNoUndo: 89305
    TCPSpuriousRTOs: 426
    TCPSackShifted: 32
    TCPSackMerged: 59
    TCPSackShiftFallback: 473
    TCPDeferAcceptDrop: 1584856
    IPReversePathFilter: 2784098
    TCPRetransFail: 3
    TCPRcvCoalesce: 4920534
    TCPOFOQueue: 2206
    TCPOFOMerge: 25
    TCPChallengeACK: 198
    TCPSYNChallenge: 80
    TCPSpuriousRtxHostQueues: 13045
    TCPAutoCorking: 2010
    TCPSynRetrans: 17159
    TCPOrigDataSent: 20569997
    TCPHystartTrainDetect: 21164
    TCPHystartTrainCwnd: 421618
    TCPHystartDelayDetect: 56
    TCPHystartDelayCwnd: 2141
    TCPACKSkippedSynRecv: 111
    TCPACKSkippedSeq: 27
    TCPACKSkippedChallenge: 2
IpExt:
    InBcastPkts: 4126689
    InOctets: 14942249656
    OutOctets: 10905292194
    InBcastOctets: 417458165
    InNoECTPkts: 67382982
    InECT0Pkts: 129227

avandemore
Posts: 1597
Joined: Tue Sep 27, 2016 4:57 pm

Re: SNMP checks fails against Linux servers

Post by avandemore »

In the log you posted, there are a number of network-related timeouts.
What is the output of the following?

Code:

netstat -i

Previous Nagios employee

Re: SNMP checks fails against Linux servers

Post by vuservicedesk »

netstat -i

Code:

Kernel Interface table
Iface      MTU    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eno1      1500 51186235      0  24578 0      46512569      0      0      0 BMRU
eno2      1500 26839899      0  20712 0       9622354      0      0      0 BMRU
eno3      1500 15254238      0 4339584 0       4741001      0      0      0 BMRU
eno4      1500        0      0      0 0             0      0      0      0 BMU
eno49     1500   657436      0      0 0             0      0      0      0 BMRU
eno50     1500  8215666      0 4339586 0             0      0      0      0 BMRU
eno51     1500        0      0      0 0             0      0      0      0 BMU
eno52     1500   657436      0      0 0             0      0      0      0 BMRU
lo       65536  7091368      0      0 0       7091368      0      0      0 LRU


Re: SNMP checks fails against Linux servers

Post by avandemore »

Your layer 2 info looks good, meaning you have no real reason to suspect a bad switch or cabling.
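For anyone who wants to repeat that check on their own box, here is a rough Python sketch that reads `netstat -i` output and pulls out the error and drop counters per interface. The sample rows are copied from the output posted above; the function names (`parse_netstat_i`, `summarize`) and the idea of comparing RX-DRP against RX-OK are illustrative assumptions, not something Nagios XI does for you.

```python
# Sample rows copied from the `netstat -i` output posted earlier in this thread.
NETSTAT_I = """\
Iface      MTU    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eno1      1500 51186235      0  24578 0      46512569      0      0      0 BMRU
eno3      1500 15254238      0 4339584 0       4741001      0      0      0 BMRU
lo       65536  7091368      0      0 0       7091368      0      0      0 LRU
"""


def parse_netstat_i(text):
    """Return {iface: {column: int}} for each data row of `netstat -i`."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Locate the header row; this skips a "Kernel Interface table" banner if present.
    header_idx = next(i for i, ln in enumerate(lines) if ln.split()[0] == "Iface")
    columns = lines[header_idx].split()
    table = {}
    for ln in lines[header_idx + 1:]:
        fields = ln.split()
        if len(fields) != len(columns):
            continue  # skip rows that do not line up with the header
        row = dict(zip(columns, fields))
        name = row.pop("Iface")
        row.pop("Flg")  # flags are text, not counters
        table[name] = {key: int(value) for key, value in row.items()}
    return table


def summarize(table):
    """Per interface: any nonzero ERR counter, and RX-DRP relative to RX-OK."""
    summary = {}
    for name, counters in table.items():
        rx_ok = counters["RX-OK"]
        summary[name] = {
            "errors": any(counters[key] for key in ("RX-ERR", "TX-ERR")),
            "rx_drp_per_rx_ok": counters["RX-DRP"] / rx_ok if rx_ok else 0.0,
        }
    return summary


for iface, info in summarize(parse_netstat_i(NETSTAT_I)).items():
    print(f"{iface}: errors={info['errors']} RX-DRP/RX-OK={info['rx_drp_per_rx_ok']:.4f}")
```

The ERR columns being zero is what the "layer 2 looks good" reading rests on. The DRP ratio is printed next to it only so the interfaces can be compared with each other.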

Your layer 3 info is another story. Something is wrong here, although at this point it's not exactly clear what. My best guess is that you are simply overloading the system's network resources. I say this because there are clearly ongoing problems with both UDP and TCP connections. What does traffic look like on the XI server's interfaces?
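To put rough numbers on the TCP side, a small Python sketch like the one below can turn the `netstat -s` counters into ratios. The excerpt is copied from the Tcp: section posted above; the two ratios chosen and the names `PATTERNS` and `tcp_counters` are assumptions for illustration only. Keep in mind these counters are cumulative since boot, so a single snapshot says nothing about when the events happened; comparing two snapshots taken a few minutes apart would be more telling.

```python
import re

# Excerpt of the Tcp: section from the `netstat -s` output posted earlier in this thread.
NETSTAT_S_TCP = """\
Tcp:
    6477172 active connections openings
    1596268 passive connection openings
    86465 failed connection attempts
    2262456 connection resets received
    9 connections established
    46911867 segments received
    53733379 segments send out
    177556 segments retransmited
    80 bad segments received.
    4014282 resets sent
"""

# The output above spells these "send out" and "retransmited"; the patterns
# also accept the corrected spellings in case another net-tools build uses them.
PATTERNS = {
    "active_opens": r"(\d+) active connections? openings",
    "failed_attempts": r"(\d+) failed connection attempts",
    "segments_out": r"(\d+) segments sen[dt] out",
    "retransmitted": r"(\d+) segments retransmit+ed",
}


def tcp_counters(text):
    """Extract the counters named in PATTERNS from `netstat -s` text."""
    counters = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            counters[name] = int(match.group(1))
    return counters


def ratio(numerator, denominator):
    """Plain division that returns 0.0 instead of failing on a zero denominator."""
    return numerator / denominator if denominator else 0.0


counters = tcp_counters(NETSTAT_S_TCP)
retrans_ratio = ratio(counters["retransmitted"], counters["segments_out"])
failed_ratio = ratio(counters["failed_attempts"], counters["active_opens"])
print(f"retransmitted / sent segments: {retrans_ratio:.4%}")
print(f"failed attempts / active opens: {failed_ratio:.4%}")
```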

An easy way to set that up if you haven't done so is to turn on SNMP on the XI system, then run the Switch / Router wizard against it.

Something like this will turn it on for a CentOS system; adjust it to your needs and security preferences:
https://www.liquidweb.com/kb/how-to-ins ... on-centos/