check_snmp_synology - False Positives

Re: check_snmp_synology - False Positives

Postby chris1337c » Thu Dec 27, 2018 5:40 pm

cdienger wrote: Just long enough to see a timeout message like this in the logs:

[12-26-2018 ********] SERVICE ALERT: DC_*****;Global Health Status;CRITICAL;SOFT;1;(Service check timed out after 180.04 seconds)

It should be a small file and I wouldn't be too concerned with it growing large. Feel free to stop and restart it though if the problem doesn't occur. You can also set up a rotating capture:

nohup tcpdump -Z root -s 0 -i any port 161 and host a.b.c.d -C 10 -W 5 -w output.pcap &

The above will start tcpdump and run it in the background, storing only the last 50 MB of captured data in five 10 MB files (output.pcap0, output.pcap1, etc.). To stop the trace:

pkill tcpdump


Okay, I ran the "nohup tcpdump -Z root -s 0 -i any port 161 and host a.b.c.d -C 10 -W 5 -w output.pcap &" command. The response I got was: nohup: ignoring input and appending output to 'nohup.out'
chris1337c
 
Posts: 48
Joined: Wed Dec 26, 2018 2:31 pm

Re: check_snmp_synology - False Positives

Postby cdienger » Fri Dec 28, 2018 11:07 am

It would be in the location where the command was run. If you logged in as root and ran it without changing directories, it would likely be under /root. You can also run "pwd" on the command line and "ls" to get the present working directory and a listing of its contents.
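
To confirm where the capture landed, a small helper along these lines can list the rotated files. This is only a sketch: list_captures is a made-up name, and it assumes the files use the output.pcap prefix from the -w option in the tcpdump command quoted earlier in the thread.

```shell
# Sketch: list rotated capture files in a directory (default: current one).
# "list_captures" is a hypothetical helper; the "output.pcap" prefix is the
# one passed to tcpdump with -w in this thread.
list_captures() {
    local dir=${1:-.}
    local f found=1
    for f in "$dir"/output.pcap*; do
        # If nothing matches, the glob stays literal, so test for existence.
        [ -e "$f" ] || continue
        printf '%s\n' "${f##*/}"
        found=0
    done
    # Returns 0 when at least one capture file was found, 1 otherwise.
    return "$found"
}

# Usage (from the directory where tcpdump was started):
#   pwd
#   list_captures || echo "no capture files here"
```

If nothing is listed, the command was most likely started from a different directory than the one being checked.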
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
cdienger
Support Tech
 
Posts: 2318
Joined: Tue Feb 07, 2017 11:26 am

Re: check_snmp_synology - False Positives

Postby cdienger » Fri Dec 28, 2018 11:11 am

The "nohup: ignoring input and appending output to 'nohup.out'" message is expected. You can just hit Enter to get back to a command line, then run "ps aux | grep tcpdump" to verify that it is running.
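
As an alternative to searching the process list, the PID of the background job can be saved when it is started and checked later. The following is a minimal sketch, not something from the plugin or from tcpdump itself: start_bg, is_running, and stop_bg are hypothetical helper names, and the pidfile path in the usage comment is an assumption.

```shell
# Sketch: start a long-running command in the background, remember its PID,
# and check on it or stop it later. All three function names are hypothetical.
start_bg() {
    # $1 = pidfile, remaining arguments = command to run
    local pidfile=$1
    shift
    nohup "$@" >/dev/null 2>&1 &
    echo "$!" > "$pidfile"
}

is_running() {
    # Succeeds if the PID stored in the pidfile still exists.
    local pid
    pid=$(cat "$1" 2>/dev/null) || return 1
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

stop_bg() {
    local pid
    pid=$(cat "$1" 2>/dev/null) || return 1
    kill "$pid" 2>/dev/null || true
    # Reap the process when it is a child of this shell; harmless otherwise.
    wait "$pid" 2>/dev/null || true
    rm -f "$1"
    return 0
}

# Usage with the capture command from this thread (a.b.c.d is the NAS address):
#   start_bg /tmp/snmp_capture.pid tcpdump -Z root -s 0 -i any \
#       port 161 and host a.b.c.d -C 10 -W 5 -w output.pcap
#   is_running /tmp/snmp_capture.pid && echo "capture is running"
#   stop_bg /tmp/snmp_capture.pid
```

Stopping by PID only touches the capture started here, whereas "pkill tcpdump" stops every tcpdump process on the machine.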
cdienger
Support Tech
 
Posts: 2318
Joined: Tue Feb 07, 2017 11:26 am

Re: check_snmp_synology - False Positives

Postby chris1337c » Fri Dec 28, 2018 11:24 am

Excellent, I will run this and start capturing now! I didn't want to let it run without checking with you that the output was okay. I tried googling it but didn't get a great understanding of what it meant.
chris1337c
 
Posts: 48
Joined: Wed Dec 26, 2018 2:31 pm

Re: check_snmp_synology - False Positives

Postby chris1337c » Fri Dec 28, 2018 11:26 am

I also commented out some useless aspects of the plugin, like HD temp, DSM version, etc., to try to make it as light as possible. Not sure how this will all work out, but it's progress.
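
When trimming a plugin like this, the part Nagios depends on is the final status line and the exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN), so that step has to stay active. A minimal sketch of it follows, reusing the plugin's variable names; the report_health wrapper itself is hypothetical and is not part of check_snmp_synology.

```shell
# Sketch: print one status line and return the matching Nagios exit code.
# "report_health" is a hypothetical wrapper, not a function from the plugin.
report_health() {
    # $1 = healthCriticalStatus, $2 = healthWarningStatus, $3 = healthString
    if [ "$1" = "1" ]; then
        echo "CRITICAL - $3"
        return 2
    fi
    if [ "$2" = "1" ]; then
        echo "WARNING - $3"
        return 1
    fi
    echo "OK - $3 is in good health"
    return 0
}

# At the end of a plugin:
#   report_health "$healthCriticalStatus" "$healthWarningStatus" "$healthString"
#   exit $?
```

A plugin that exits without printing anything shows up in Nagios as "(No output on stdout)".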
chris1337c
 
Posts: 48
Joined: Wed Dec 26, 2018 2:31 pm

Re: check_snmp_synology - False Positives

Postby chris1337c » Fri Dec 28, 2018 11:35 am

(No output on stdout) stderr: /usr/local/nagios/libexec/check_snmp_synology: line 410: syntax error: unexpected end of file

Got an error. I removed the comment from the last "fi" in the file. Here is what I have currently:
Code:
SNMPWALK=$(which snmpwalk)
SNMPGET=$(which snmpget)

SNMPVersion="3"
SNMPV2Community="public"
SNMPTimeout="15"
warningTemperature="50"
criticalTemperature="60"
warningStorage="80"
criticalStorage="95"
hostname=""
healthWarningStatus=0
healthCriticalStatus=0
healthString=""
verbose="no"
checkDSMUpdate="yes"
ups="no"

#OID declarations
OID_syno="1.3.6.1.4.1.6574"
OID_model="1.3.6.1.4.1.6574.1.5.1.0"
OID_serialNumber="1.3.6.1.4.1.6574.1.5.2.0"
OID_DSMVersion="1.3.6.1.4.1.6574.1.5.3.0"
OID_DSMUpgradeAvailable="1.3.6.1.4.1.6574.1.5.4.0"
OID_systemStatus="1.3.6.1.4.1.6574.1.1.0"
OID_temperature="1.3.6.1.4.1.6574.1.2.0"
OID_powerStatus="1.3.6.1.4.1.6574.1.3.0"
OID_systemFanStatus="1.3.6.1.4.1.6574.1.4.1.0"
OID_CPUFanStatus="1.3.6.1.4.1.6574.1.4.2.0"

OID_disk=""
OID_disk2=""
OID_diskID="1.3.6.1.4.1.6574.2.1.1.2"
OID_diskModel="1.3.6.1.4.1.6574.2.1.1.3"
OID_diskStatus="1.3.6.1.4.1.6574.2.1.1.5"
OID_diskTemp="1.3.6.1.4.1.6574.2.1.1.6"

OID_RAID=""
OID_RAIDName="1.3.6.1.4.1.6574.3.1.1.2"
OID_RAIDStatus="1.3.6.1.4.1.6574.3.1.1.3"

OID_Storage="1.3.6.1.2.1.25.2.3.1"
OID_StorageDesc="1.3.6.1.2.1.25.2.3.1.3"
OID_StorageAllocationUnits="1.3.6.1.2.1.25.2.3.1.4"
OID_StorageSize="1.3.6.1.2.1.25.2.3.1.5"
OID_StorageSizeUsed="1.3.6.1.2.1.25.2.3.1.6"

OID_UpsModel="1.3.6.1.4.1.6574.4.1.1.0"
OID_UpsSN="1.3.6.1.4.1.6574.4.1.3.0"
OID_UpsStatus="1.3.6.1.4.1.6574.4.2.1.0"
OID_UpsLoad="1.3.6.1.4.1.6574.4.2.12.1.0"
OID_UpsBatteryCharge="1.3.6.1.4.1.6574.4.3.1.1.0"
OID_UpsBatteryChargeWarning="1.3.6.1.4.1.6574.4.3.1.4.0"

usage()
{
        echo "usage: ./check_snmp_synology [OPTIONS] -u [user] -p [pass] -h [hostname]"
        echo "options:"
        echo "            -u [snmp username]      Username for SNMPv3"
        echo "            -p [snmp password]      Password for SNMPv3"
        echo ""
        echo "            -2 [community name]        Use SNMPv2 (no need user/password) & define community name (ex: public)"
        echo ""
        echo "            -h [hostname or IP](:port)   Hostname or IP. You can also define a different port"
        echo ""
        echo "            -W [warning temp]      Warning temperature (for disks & synology) (default $warningTemperature)"
        echo "            -C [critical temp]      Critical temperature (for disks & synology) (default $criticalTemperature)"
        echo ""
        echo "            -w [warning %]      Warning storage usage percentage (default $warningStorage)"
        echo "            -c [critical %]      Critical storage usage percentage (default $criticalStorage)"
        echo ""
        echo "            -i            Ignore DSM updates"
        echo "            -U            Show informations about the connected UPS (only information, no control)"
        echo "            -v            Verbose - print all informations about your Synology"
        echo ""
        echo ""
        echo "examples:   ./check_snmp_synology -u admin -p 1234 -h nas.intranet"   
        echo "           ./check_snmp_synology -u admin -p 1234 -h nas.intranet -v"   
        echo "      ./check_snmp_synology -2 public -h nas.intranet"   
        echo "      ./check_snmp_synology -2 public -h nas.intranet:10161"
        exit 3
}

if [ "$1" == "--help" ]; then
    usage; exit 0
fi

while getopts 2:W:C:w:c:u:p:h:iUv OPTNAME; do
        case "$OPTNAME" in
   u)   SNMPUser="$OPTARG";;
        p)   SNMPPassword="$OPTARG";;
        h)   hostname="$OPTARG";;
        v)   verbose="yes";;
   2)   SNMPVersion="2"
      SNMPV2Community="$OPTARG";;
   w)   warningStorage="$OPTARG";;
        c)      criticalStorage="$OPTARG";;
   W)   warningTemperature="$OPTARG";;
   C)   criticalTemperature="$OPTARG";;
   i)   checkDSMUpdate="no";;
   U)   ups="yes";;
        *)   usage;;
        esac
done

if [ "$warningTemperature" -gt "$criticalTemperature" ] ; then
    echo "Critical temperature must be higher than warning temperature"
    echo "Warning temperature: $warningTemperature"
    echo "Critical temperature: $criticalTemperature"
    echo ""
    echo "For more information:  ./${0##*/} --help"
    exit 1
fi

if [ "$warningStorage" -gt "$criticalStorage" ] ; then
    echo "The Critical storage usage percentage  must be higher than the warning storage usage percentage"
    echo "Warning: $warningStorage"
    echo "Critical: $criticalStorage"
    echo ""
    echo "For more information:  ./${0##*/} --help"
    exit 1
fi

if [ "$hostname" = "" ] || ([ "$SNMPVersion" = "3" ] && [ "$SNMPUser" = "" ]) || ([ "$SNMPVersion" = "3" ] && [ "$SNMPPassword" = "" ]) ; then
    usage
else
    if [ "$SNMPVersion" = "2" ] ; then
   SNMPArgs=" -OQne -v 2c -c $SNMPV2Community -t $SNMPTimeout"
    else
   SNMPArgs=" -OQne -v 3 -u $SNMPUser -A $SNMPPassword -l authNoPriv -a MD5 -t $SNMPTimeout"
   if [ ${#SNMPPassword} -lt "8" ] ; then
       echo "snmpwalk:  (The supplied password is too short.)"
       exit 1
   fi
    fi
    tmpRequest=`$SNMPWALK $SNMPArgs $hostname $OID_syno 2> /dev/null`
    if [ "$?" != "0" ] ; then
        echo "CRITICAL - Problem with SNMP request, check user/password/host"
        exit 2
    fi
    nbDisk=$(echo "$tmpRequest" | grep $OID_diskID | wc -l)
    nbRAID=$(echo "$tmpRequest" | grep $OID_RAIDName | wc -l)

    for i in `seq 1 $nbDisk`;
    do
   if [ $i -lt 25 ] ; then
       OID_disk="$OID_disk $OID_diskID.$(($i-1)) $OID_diskModel.$(($i-1)) $OID_diskStatus.$(($i-1)) $OID_diskTemp.$(($i-1)) "
   else
       OID_disk2="$OID_disk2 $OID_diskID.$(($i-1)) $OID_diskModel.$(($i-1)) $OID_diskStatus.$(($i-1)) $OID_diskTemp.$(($i-1)) "
   fi   
    done

    for i in `seq 1 $nbRAID`;
    do
      OID_RAID="$OID_RAID $OID_RAIDName.$(($i-1)) $OID_RAIDStatus.$(($i-1))"
    done

    if [ "$ups" = "yes" ] ; then
   syno=`$SNMPGET $SNMPArgs $hostname $OID_model $OID_serialNumber $OID_DSMVersion $OID_systemStatus $OID_temperature $OID_powerStatus $OID_systemFanStatus $OID_CPUFanStatus $OID_disk $OID_RAID $OID_DSMUpgradeAvailable $OID_UpsModel $OID_UpsSN $OID_UpsStatus $OID_UpsLoad $OID_UpsBatteryCharge $OID_UpsBatteryChargeWarning 2> /dev/null`
    else
   syno=`$SNMPGET $SNMPArgs $hostname $OID_model $OID_serialNumber $OID_DSMVersion $OID_systemStatus $OID_temperature $OID_powerStatus $OID_systemFanStatus $OID_CPUFanStatus $OID_disk $OID_RAID $OID_DSMUpgradeAvailable 2> /dev/null`
    fi

    if [ "$OID_disk2" != "" ]; then
   syno2=`$SNMPGET $SNMPArgs $hostname $OID_disk2 2> /dev/null`
   syno=$(echo "$syno";echo "$syno2";)
    fi

    model=$(echo "$syno" | grep $OID_model | cut -d "=" -f2)
    serialNumber=$(echo "$syno" | grep $OID_serialNumber | cut -d "=" -f2)
    DSMVersion=$(echo "$syno" | grep $OID_DSMVersion | cut -d "=" -f2)

    healthString="Synology $model (s/n:$serialNumber, $DSMVersion)"

    #DSMUpgradeAvailable=$(echo "$syno" | grep $OID_DSMUpgradeAvailable | cut -d "=" -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')
    #case $DSMUpgradeAvailable in
   #"1")   DSMUpgradeAvailable="Available";   if [ "$checkDSMUpdate" = "yes" ]; then healthWarningStatus=1;      healthString="$healthString, DSM update available"; fi ;;
   #"2")   DSMUpgradeAvailable="Unavailable";;
   #"3")   DSMUpgradeAvailable="Connecting";;               
   #"4")   DSMUpgradeAvailable="Disconnected";   if [ "$checkDSMUpdate" = "yes" ]; then healthWarningStatus=1;      healthString="$healthString, DSM Update Disconnected"; fi ;;
   #"5")   DSMUpgradeAvailable="Others";      if [ "$checkDSMUpdate" = "yes" ]; then healthWarningStatus=1;      healthString="$healthString, Check DSM Update"; fi ;;
    #esac

    RAIDName=$(echo "$syno" | grep $OID_RAIDName | cut -d "=" -f2)
    RAIDStatus=$(echo "$syno" | grep $OID_RAIDStatus | cut -d "=" -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')
    systemStatus=$(echo "$syno" | grep $OID_systemStatus | cut -d "=" -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')
    #temperature=$(echo "$syno" | grep $OID_temperature | cut -d "=" -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')
    powerStatus=$(echo "$syno" | grep $OID_powerStatus | cut -d "=" -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')
    systemFanStatus=$(echo "$syno" | grep $OID_systemFanStatus | cut -d "=" -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')
    CPUFanStatus=$(echo "$syno" | grep $OID_CPUFanStatus | cut -d "=" -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')


    #Check system status
    if [ "$systemStatus" = "1" ] ; then
   systemStatus="Normal"
    else
   systemStatus="Failed"
        healthCriticalStatus=1
        healthString="$healthString, System status: $systemStatus "
    fi

    #Check system temperature
    #if [ "$temperature" -gt "$warningTemperature" ] ; then
       #if [ "$temperature" -gt "$criticalTemperature" ] ; then
           #temperature="$temperature (CRITICAL)"
          # healthCriticalStatus=1
          # healthString="$healthString, temperature: $temperature "
   #else
           #temperature="$temperature (WARNING)"
           #healthWarningStatus=1
          # healthString="$healthString, temperature: $temperature "
   #fi
    #else
   #temperature="$temperature (Normal)"
   # fi


    #Check power status
    if [ "$powerStatus" = "1" ] ; then
        powerStatus="Normal"
    else
          powerStatus="Failed";
        healthCriticalStatus=1
        healthString="$healthString, Power status: $powerStatus "
    fi


    #Check system fan status
    if [ "$systemFanStatus" = "1" ] ; then
        systemFanStatus="Normal"
    else
        systemFanStatus="Failed";      
        healthCriticalStatus=1
        healthString="$healthString, System fan status: $systemFanStatus "
    fi
   

    #Check CPU fan status
    if [ "$CPUFanStatus" = "1" ] ; then
   CPUFanStatus="Normal"
    else
        CPUFanStatus="Failed";      
        healthCriticalStatus=1
        healthString="$healthString, CPU fan status: $CPUFanStatus "
    fi


    #Check all disk status
    for i in `seq 1 $nbDisk`;
    do
       diskID[$i]=$(echo "$syno" | grep "$OID_diskID.$(($i-1)) " | cut -d "=" -f2)
       diskModel[$i]=$(echo "$syno" | grep "$OID_diskModel.$(($i-1)) " | cut -d "=" -f2 )
       diskStatus[$i]=$(echo "$syno" | grep "$OID_diskStatus.$(($i-1)) " | cut -d "=" -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')
       diskTemp[$i]=$(echo "$syno" | grep "$OID_diskTemp.$(($i-1)) " | cut -d "=" -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')

   case ${diskStatus[$i]} in
      "1")   diskStatus[$i]="Normal";      ;;
      "2")   diskStatus[$i]="Initialized";      ;;
      "3")   diskStatus[$i]="NotInitialized";   ;;
      "4")   diskStatus[$i]="SystemPartitionFailed";   healthCriticalStatus=1; healthString="$healthString, problem with ${diskID[$i]} (model:${diskModel[$i]}) status:${diskStatus[$i]} temperature:${diskTemp[$i]}";;
      "5")   diskStatus[$i]="Crashed";      healthCriticalStatus=1;   healthString="$healthString, problem with ${diskID[$i]} (model:${diskModel[$i]}) status:${diskStatus[$i]} temperature:${diskTemp[$i]}";;
   esac

   if [ "${diskTemp[$i]}" -gt "$warningTemperature" ] ; then
       if [ "${diskTemp[$i]}" -gt "$criticalTemperature" ] ; then
      diskTemp[$i]="${diskTemp[$i]} (CRITICAL)"
      healthCriticalStatus=1;
      healthString="$healthString, ${diskID[$i]} temperature: ${diskTemp[$i]}"
       else
      diskTemp[$i]="${diskTemp[$i]} (WARNING)"
      healthWarningStatus=1;
      healthString="$healthString, ${diskID[$i]} temperature: ${diskTemp[$i]}"
       fi
   fi

     done 

    syno_diskspace=`$SNMPWALK $SNMPArgs $hostname $OID_Storage 2> /dev/null`

    #Check all RAID volume status
    for i in `seq 1 $nbRAID`;
    do
   RAIDName[$i]=$(echo "$syno" | grep $OID_RAIDName.$(($i-1)) | cut -d "=" -f2)
   RAIDStatus[$i]=$(echo "$syno" | grep $OID_RAIDStatus.$(($i-1)) | cut -d "=" -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')

   storageName[$i]=$(echo "${RAIDName[$i]}" | sed -e 's/[[:blank:]]//g' | sed -e 's/\"//g' | sed 's/.*/\L&/')

   # modified by Tobias Schenke
   # "timebackup" (when backup-job runs) and the "docker-feature" (since dsm 6.0, and if installed) mount volumes as a substructure of /"volume1/..." or "/.../volume1/..."
   # in this case the former grep failed with more then one result.
   # modified script to look for a line with '= "/volume1"' instead of 'volume1'
   #storageID[$i]=$(echo "$syno_diskspace" | grep ${storageName[$i]} | cut -d "=" -f1 | rev | cut -d "." -f1 | rev)
   storageID[$i]=$(echo "$syno_diskspace" | grep "= \"\?/${storageName[$i]}\"\?" | cut -d "=" -f1 | rev | cut -d "." -f1 | rev)

   if [ "${storageID[$i]}" != "" ] ; then
       storageSize[$i]=$(echo "$syno_diskspace" | grep "$OID_StorageSize.${storageID[$i]}" | cut -d "=" -f2 )
       storageSizeUsed[$i]=$(echo "$syno_diskspace" | grep "$OID_StorageSizeUsed.${storageID[$i]}" | cut -d "=" -f2 )
       storageAllocationUnits[$i]=$(echo "$syno_diskspace" | grep "$OID_StorageAllocationUnits.${storageID[$i]}" | cut -d "=" -f2 )
       storagePercentUsed[$i]=$((${storageSizeUsed[$i]} * 100 / ${storageSize[$i]}))
       storagePercentUsedString[$i]="${storagePercentUsed[$i]}% used"

       if [ "${storagePercentUsed[$i]}" -gt "$warningStorage" ] ; then
          if [ "${storagePercentUsed[$i]}" -gt "$criticalStorage" ] ; then
                    healthCriticalStatus=1;
          storagePercentUsedString[$i]="${storagePercentUsedString[$i]} CRITICAL"
                    healthString="$healthString,${RAIDName[$i]}: ${storagePercentUsedString[$i]}"
               else
                    healthWarningStatus=1;
          storagePercentUsedString[$i]="${storagePercentUsedString[$i]} WARNING"
                    healthString="$healthString,${RAIDName[$i]}: ${storagePercentUsedString[$i]}"
               fi
            fi
   fi

        case ${RAIDStatus[$i]} in
      "1")   RAIDStatus[$i]="Normal";      ;;
      "2")   RAIDStatus[$i]="Repairing";      healthWarningStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
      "3")   RAIDStatus[$i]="Migrating";      healthWarningStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
      "4")   RAIDStatus[$i]="Expanding";      healthWarningStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
      "5")   RAIDStatus[$i]="Deleting";      healthWarningStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
      "6")   RAIDStatus[$i]="Creating";      healthWarningStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
      "7")   RAIDStatus[$i]="RaidSyncing";      healthWarningStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
      "8")   RAIDStatus[$i]="RaidParityChecking";   healthWarningStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
      "9")   RAIDStatus[$i]="RaidAssembling";   healthWarningStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
      "10")   RAIDStatus[$i]="Canceling";      healthWarningStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
      "11")   RAIDStatus[$i]="Degrade";      healthCriticalStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
      "12")   RAIDStatus[$i]="Crashed";      healthCriticalStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
        esac
    done

    if [ "$verbose" = "yes" ] ; then   
   echo "Synology model:      $model" ;
   echo "Synology s/n:      $serialNumber" ;
   echo "DSM Version:      $DSMVersion" ;
   echo "DSM update:       $DSMUpgradeAvailable" ;
   echo "System Status:       $systemStatus" ;
   echo "Temperature:       $temperature" ;
   echo "Power Status:       $powerStatus" ;
   echo "System Fan Status:    $systemFanStatus" ;
   echo "CPU Fan Status:       $CPUFanStatus" ;
   echo "Number of disks:         $nbDisk" ;
   for i in `seq 1 $nbDisk`;
       do
      echo " ${diskID[$i]} (model:${diskModel[$i]}) status:${diskStatus[$i]} temperature:${diskTemp[$i]}" ;
   done
   echo "Number of RAID volume:   $nbRAID" ;
   for i in `seq 1 $nbRAID`;
       do
      echo " ${RAIDName[$i]} status:${RAIDStatus[$i]} ${storagePercentUsedString[$i]}" ;
   done

   # Display UPS information
       # if [ "$ups" = "yes" ] ; then
           #  upsModel=$(echo "$syno" | grep $OID_UpsModel | cut -d "=" -f2)
            # upsSN=$(echo "$syno" | grep $OID_UpsSN | cut -d "=" -f2)
           #  upsStatus=$(echo "$syno" | grep $OID_UpsStatus | cut -d "=" -f2)
           #  upsLoad=$(echo "$syno" | grep $OID_UpsLoad | cut -d "=" -f2)
         #    upsBatteryCharge=$(echo "$syno" | grep $OID_UpsBatteryCharge | cut -d "=" -f2)
         #    upsBatteryChargeWarning=$(echo "$syno" | grep $OID_UpsBatteryChargeWarning | cut -d "=" -f2)

      #  echo "UPS:"
      #  echo "  Model:      $upsModel"
      #  echo "  s/n:         $upsSN"
     #   echo "  Status:      $upsStatus"
      #  echo "  Load:         $upsLoad"
     #   echo "  Battery charge:   $upsBatteryCharge"
      #  echo "  Battery charge warning:$upsBatteryChargeWarning"
   # fi

   # echo "";
   #  fi

   #  if [ "$healthCriticalStatus" = "1" ] ; then
   # echo "CRITICAL - $healthString"
   # exit 2
   #  fi
   #  if [ "$healthWarningStatus" = "1" ] ; then
   # echo "WARNING - $healthString"
   # exit 1
   #  fi
   #  if [ "$healthCriticalStatus" = "0" ] && [ "$healthWarningStatus" = "0" ] ; then
   # echo "OK - $healthString is in good health"
   # exit 0
  #   fi
fi
chris1337c
 
Posts: 48
Joined: Wed Dec 26, 2018 2:31 pm
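
An "unexpected end of file" error from bash generally means a block was opened but never closed. In the listing above, the "fi" that follows the commented-out UPS section (the one that would close the if [ "$verbose" = "yes" ] block) is still commented out, so the one remaining "fi" closes the verbose block and the outer if ... else is left open. The final CRITICAL/WARNING/OK block is also commented out, which would leave the plugin with nothing to print outside verbose mode. One way to catch this kind of edit before Nagios runs it is "bash -n", which parses a script without executing it. The sketch below pairs that with a rough count of uncommented if/fi lines; check_plugin_syntax and count_if_fi are hypothetical helper names, and the count is only a heuristic that one-line "if ...; then ...; fi" statements will throw off.

```shell
# Sketch: two quick checks to run after commenting parts of a plugin out.
# Both function names are hypothetical helpers, not part of the plugin.
check_plugin_syntax() {
    # bash -n reads and parses the file but executes nothing.
    bash -n "$1"
}

count_if_fi() {
    # Rough heuristic: count lines that begin with "if" or "fi",
    # skipping lines that are commented out.
    local opens closes
    opens=$(grep -v '^[[:space:]]*#' "$1" | grep -cE '^[[:space:]]*if[[:space:]]' || true)
    closes=$(grep -v '^[[:space:]]*#' "$1" | grep -cE '^[[:space:]]*fi([[:space:];]|$)' || true)
    echo "if=$opens fi=$closes"
}

# Usage with the path from the error message:
#   check_plugin_syntax /usr/local/nagios/libexec/check_snmp_synology
#   count_if_fi /usr/local/nagios/libexec/check_snmp_synology
```

If the two counts differ, the commented-out regions are the first place to look for a stray "if" or a missing "fi".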

Re: check_snmp_synology - False Positives

Postby chris1337c » Fri Dec 28, 2018 11:37 am

Here is the original file. Any suggestions would be greatly appreciated:
Code:
SNMPWALK=$(which snmpwalk)
SNMPGET=$(which snmpget)

SNMPVersion="3"
SNMPV2Community="public"
SNMPTimeout="15"
warningTemperature="50"
criticalTemperature="60"
warningStorage="80"
criticalStorage="95"
hostname=""
healthWarningStatus=0
healthCriticalStatus=0
healthString=""
verbose="no"
checkDSMUpdate="yes"
ups="no"

#OID declarations
OID_syno="1.3.6.1.4.1.6574"
OID_model="1.3.6.1.4.1.6574.1.5.1.0"
OID_serialNumber="1.3.6.1.4.1.6574.1.5.2.0"
OID_DSMVersion="1.3.6.1.4.1.6574.1.5.3.0"
OID_DSMUpgradeAvailable="1.3.6.1.4.1.6574.1.5.4.0"
OID_systemStatus="1.3.6.1.4.1.6574.1.1.0"
OID_temperature="1.3.6.1.4.1.6574.1.2.0"
OID_powerStatus="1.3.6.1.4.1.6574.1.3.0"
OID_systemFanStatus="1.3.6.1.4.1.6574.1.4.1.0"
OID_CPUFanStatus="1.3.6.1.4.1.6574.1.4.2.0"

OID_disk=""
OID_disk2=""
OID_diskID="1.3.6.1.4.1.6574.2.1.1.2"
OID_diskModel="1.3.6.1.4.1.6574.2.1.1.3"
OID_diskStatus="1.3.6.1.4.1.6574.2.1.1.5"
OID_diskTemp="1.3.6.1.4.1.6574.2.1.1.6"

OID_RAID=""
OID_RAIDName="1.3.6.1.4.1.6574.3.1.1.2"
OID_RAIDStatus="1.3.6.1.4.1.6574.3.1.1.3"

OID_Storage="1.3.6.1.2.1.25.2.3.1"
OID_StorageDesc="1.3.6.1.2.1.25.2.3.1.3"
OID_StorageAllocationUnits="1.3.6.1.2.1.25.2.3.1.4"
OID_StorageSize="1.3.6.1.2.1.25.2.3.1.5"
OID_StorageSizeUsed="1.3.6.1.2.1.25.2.3.1.6"

OID_UpsModel="1.3.6.1.4.1.6574.4.1.1.0"
OID_UpsSN="1.3.6.1.4.1.6574.4.1.3.0"
OID_UpsStatus="1.3.6.1.4.1.6574.4.2.1.0"
OID_UpsLoad="1.3.6.1.4.1.6574.4.2.12.1.0"
OID_UpsBatteryCharge="1.3.6.1.4.1.6574.4.3.1.1.0"
OID_UpsBatteryChargeWarning="1.3.6.1.4.1.6574.4.3.1.4.0"

usage()
{
        echo "usage: ./check_snmp_synology [OPTIONS] -u [user] -p [pass] -h [hostname]"
        echo "options:"
        echo "            -u [snmp username]      Username for SNMPv3"
        echo "            -p [snmp password]      Password for SNMPv3"
        echo ""
        echo "            -2 [community name]        Use SNMPv2 (no need user/password) & define community name (ex: public)"
        echo ""
        echo "            -h [hostname or IP](:port)   Hostname or IP. You can also define a different port"
        echo ""
        echo "            -W [warning temp]      Warning temperature (for disks & synology) (default $warningTemperature)"
        echo "            -C [critical temp]      Critical temperature (for disks & synology) (default $criticalTemperature)"
        echo ""
        echo "            -w [warning %]      Warning storage usage percentage (default $warningStorage)"
        echo "            -c [critical %]      Critical storage usage percentage (default $criticalStorage)"
        echo ""
        echo "            -i            Ignore DSM updates"
        echo "            -U            Show informations about the connected UPS (only information, no control)"
        echo "            -v            Verbose - print all informations about your Synology"
        echo ""
        echo ""
        echo "examples:   ./check_snmp_synology -u admin -p 1234 -h nas.intranet"   
        echo "           ./check_snmp_synology -u admin -p 1234 -h nas.intranet -v"   
        echo "      ./check_snmp_synology -2 public -h nas.intranet"   
        echo "      ./check_snmp_synology -2 public -h nas.intranet:10161"
        exit 3
}

if [ "$1" == "--help" ]; then
    usage; exit 0
fi

while getopts 2:W:C:w:c:u:p:h:iUv OPTNAME; do
        case "$OPTNAME" in
   u)   SNMPUser="$OPTARG";;
        p)   SNMPPassword="$OPTARG";;
        h)   hostname="$OPTARG";;
        v)   verbose="yes";;
   2)   SNMPVersion="2"
      SNMPV2Community="$OPTARG";;
   w)   warningStorage="$OPTARG";;
        c)      criticalStorage="$OPTARG";;
   W)   warningTemperature="$OPTARG";;
   C)   criticalTemperature="$OPTARG";;
   i)   checkDSMUpdate="no";;
   U)   ups="yes";;
        *)   usage;;
        esac
done

if [ "$warningTemperature" -gt "$criticalTemperature" ] ; then
    echo "Critical temperature must be higher than warning temperature"
    echo "Warning temperature: $warningTemperature"
    echo "Critical temperature: $criticalTemperature"
    echo ""
    echo "For more information:  ./${0##*/} --help"
    exit 1
fi

if [ "$warningStorage" -gt "$criticalStorage" ] ; then
    echo "The Critical storage usage percentage  must be higher than the warning storage usage percentage"
    echo "Warning: $warningStorage"
    echo "Critical: $criticalStorage"
    echo ""
    echo "For more information:  ./${0##*/} --help"
    exit 1
fi

if [ "$hostname" = "" ] || ([ "$SNMPVersion" = "3" ] && [ "$SNMPUser" = "" ]) || ([ "$SNMPVersion" = "3" ] && [ "$SNMPPassword" = "" ]) ; then
    usage
else
    if [ "$SNMPVersion" = "2" ] ; then
   SNMPArgs=" -OQne -v 2c -c $SNMPV2Community -t $SNMPTimeout"
    else
   SNMPArgs=" -OQne -v 3 -u $SNMPUser -A $SNMPPassword -l authNoPriv -a MD5 -t $SNMPTimeout"
   if [ ${#SNMPPassword} -lt "8" ] ; then
       echo "snmpwalk:  (The supplied password is too short.)"
       exit 1
   fi
    fi
    tmpRequest=`$SNMPWALK $SNMPArgs $hostname $OID_syno 2> /dev/null`
    if [ "$?" != "0" ] ; then
        echo "CRITICAL - Problem with SNMP request, check user/password/host"
        exit 2
    fi
    nbDisk=$(echo "$tmpRequest" | grep $OID_diskID | wc -l)
    nbRAID=$(echo "$tmpRequest" | grep $OID_RAIDName | wc -l)

    for i in `seq 1 $nbDisk`;
    do
   if [ $i -lt 25 ] ; then
       OID_disk="$OID_disk $OID_diskID.$(($i-1)) $OID_diskModel.$(($i-1)) $OID_diskStatus.$(($i-1)) $OID_diskTemp.$(($i-1)) "
   else
       OID_disk2="$OID_disk2 $OID_diskID.$(($i-1)) $OID_diskModel.$(($i-1)) $OID_diskStatus.$(($i-1)) $OID_diskTemp.$(($i-1)) "
   fi   
    done

    for i in `seq 1 $nbRAID`;
    do
      OID_RAID="$OID_RAID $OID_RAIDName.$(($i-1)) $OID_RAIDStatus.$(($i-1))"
    done

    if [ "$ups" = "yes" ] ; then
   syno=`$SNMPGET $SNMPArgs $hostname $OID_model $OID_serialNumber $OID_DSMVersion $OID_systemStatus $OID_temperature $OID_powerStatus $OID_systemFanStatus $OID_CPUFanStatus $OID_disk $OID_RAID $OID_DSMUpgradeAvailable $OID_UpsModel $OID_UpsSN $OID_UpsStatus $OID_UpsLoad $OID_UpsBatteryCharge $OID_UpsBatteryChargeWarning 2> /dev/null`
    else
   syno=`$SNMPGET $SNMPArgs $hostname $OID_model $OID_serialNumber $OID_DSMVersion $OID_systemStatus $OID_temperature $OID_powerStatus $OID_systemFanStatus $OID_CPUFanStatus $OID_disk $OID_RAID $OID_DSMUpgradeAvailable 2> /dev/null`
    fi

    if [ "$OID_disk2" != "" ]; then
   syno2=`$SNMPGET $SNMPArgs $hostname $OID_disk2 2> /dev/null`
   syno=$(echo "$syno";echo "$syno2";)
    fi

    model=$(echo "$syno" | grep $OID_model | cut -d "=" -f2)
    serialNumber=$(echo "$syno" | grep $OID_serialNumber | cut -d "=" -f2)
    DSMVersion=$(echo "$syno" | grep $OID_DSMVersion | cut -d "=" -f2)

    healthString="Synology $model (s/n:$serialNumber, $DSMVersion)"

    DSMUpgradeAvailable=$(echo "$syno" | grep $OID_DSMUpgradeAvailable | cut -d "=" -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')
    case $DSMUpgradeAvailable in
   "1")   DSMUpgradeAvailable="Available";   if [ "$checkDSMUpdate" = "yes" ]; then healthWarningStatus=1;      healthString="$healthString, DSM update available"; fi ;;
   "2")   DSMUpgradeAvailable="Unavailable";;
   "3")   DSMUpgradeAvailable="Connecting";;               
   "4")   DSMUpgradeAvailable="Disconnected";   if [ "$checkDSMUpdate" = "yes" ]; then healthWarningStatus=1;      healthString="$healthString, DSM Update Disconnected"; fi ;;
   "5")   DSMUpgradeAvailable="Others";      if [ "$checkDSMUpdate" = "yes" ]; then healthWarningStatus=1;      healthString="$healthString, Check DSM Update"; fi ;;
    esac

    RAIDName=$(echo "$syno" | grep $OID_RAIDName | cut -d "=" -f2)
    RAIDStatus=$(echo "$syno" | grep $OID_RAIDStatus | cut -d "=" -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')
    systemStatus=$(echo "$syno" | grep $OID_systemStatus | cut -d "=" -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')
    temperature=$(echo "$syno" | grep $OID_temperature | cut -d "=" -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')
    powerStatus=$(echo "$syno" | grep $OID_powerStatus | cut -d "=" -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')
    systemFanStatus=$(echo "$syno" | grep $OID_systemFanStatus | cut -d "=" -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')
    CPUFanStatus=$(echo "$syno" | grep $OID_CPUFanStatus | cut -d "=" -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')


    #Check system status
    if [ "$systemStatus" = "1" ] ; then
   systemStatus="Normal"
    else
   systemStatus="Failed"
        healthCriticalStatus=1
        healthString="$healthString, System status: $systemStatus "
    fi

    #Check system temperature
    if [ "$temperature" -gt "$warningTemperature" ] ; then
       if [ "$temperature" -gt "$criticalTemperature" ] ; then
           temperature="$temperature (CRITICAL)"
           healthCriticalStatus=1
           healthString="$healthString, temperature: $temperature "
   else
           temperature="$temperature (WARNING)"
           healthWarningStatus=1
           healthString="$healthString, temperature: $temperature "
   fi
    else
   temperature="$temperature (Normal)"
    fi


    #Check power status
    if [ "$powerStatus" = "1" ] ; then
        powerStatus="Normal"
    else
          powerStatus="Failed";
        healthCriticalStatus=1
        healthString="$healthString, Power status: $powerStatus "
    fi


    #Check system fan status
    if [ "$systemFanStatus" = "1" ] ; then
        systemFanStatus="Normal"
    else
        systemFanStatus="Failed";      
        healthCriticalStatus=1
        healthString="$healthString, System fan status: $systemFanStatus "
    fi
   

    #Check CPU fan status
    if [ "$CPUFanStatus" = "1" ] ; then
   CPUFanStatus="Normal"
    else
        CPUFanStatus="Failed";      
        healthCriticalStatus=1
        healthString="$healthString, CPU fan status: $CPUFanStatus "
    fi


    #Check all disk status
    for i in `seq 1 $nbDisk`;
    do
       diskID[$i]=$(echo "$syno" | grep "$OID_diskID.$(($i-1)) " | cut -d "=" -f2)
       diskModel[$i]=$(echo "$syno" | grep "$OID_diskModel.$(($i-1)) " | cut -d "=" -f2 )
       diskStatus[$i]=$(echo "$syno" | grep "$OID_diskStatus.$(($i-1)) " | cut -d "=" -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')
       diskTemp[$i]=$(echo "$syno" | grep "$OID_diskTemp.$(($i-1)) " | cut -d "=" -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')

       case ${diskStatus[$i]} in
           "1") diskStatus[$i]="Normal" ;;
           "2") diskStatus[$i]="Initialized" ;;
           "3") diskStatus[$i]="NotInitialized" ;;
           "4") diskStatus[$i]="SystemPartitionFailed"
                healthCriticalStatus=1
                healthString="$healthString, problem with ${diskID[$i]} (model:${diskModel[$i]}) status:${diskStatus[$i]} temperature:${diskTemp[$i]}" ;;
           "5") diskStatus[$i]="Crashed"
                healthCriticalStatus=1
                healthString="$healthString, problem with ${diskID[$i]} (model:${diskModel[$i]}) status:${diskStatus[$i]} temperature:${diskTemp[$i]}" ;;
       esac

       if [ "${diskTemp[$i]}" -gt "$warningTemperature" ] ; then
           if [ "${diskTemp[$i]}" -gt "$criticalTemperature" ] ; then
               diskTemp[$i]="${diskTemp[$i]} (CRITICAL)"
               healthCriticalStatus=1
               healthString="$healthString, ${diskID[$i]} temperature: ${diskTemp[$i]}"
           else
               diskTemp[$i]="${diskTemp[$i]} (WARNING)"
               healthWarningStatus=1
               healthString="$healthString, ${diskID[$i]} temperature: ${diskTemp[$i]}"
           fi
       fi

    done

    syno_diskspace=`$SNMPWALK $SNMPArgs $hostname $OID_Storage 2> /dev/null`

    #Check all RAID volume status
    for i in `seq 1 $nbRAID`;
    do
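   # Quote the pattern and keep the trailing space so that index 1 does not also match 10, 11, ...
   RAIDName[$i]=$(echo "$syno" | grep "$OID_RAIDName.$(($i-1)) " | cut -d "=" -f2)
   RAIDStatus[$i]=$(echo "$syno" | grep "$OID_RAIDStatus.$(($i-1)) " | cut -d "=" -f2 | sed 's/^[ \t]*//;s/[ \t]*$//')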

   storageName[$i]=$(echo "${RAIDName[$i]}" | sed -e 's/[[:blank:]]//g' | sed -e 's/\"//g' | sed 's/.*/\L&/')

   # modified by Tobias Schenke
   # "timebackup" (when backup-job runs) and the "docker-feature" (since dsm 6.0, and if installed) mount volumes as a substructure of /"volume1/..." or "/.../volume1/..."
   # in this case the former grep failed with more than one result.
   # modified script to look for a line with '= "/volume1"' instead of 'volume1'
   #storageID[$i]=$(echo "$syno_diskspace" | grep ${storageName[$i]} | cut -d "=" -f1 | rev | cut -d "." -f1 | rev)
   storageID[$i]=$(echo "$syno_diskspace" | grep "= \"\?/${storageName[$i]}\"\?" | cut -d "=" -f1 | rev | cut -d "." -f1 | rev)

   if [ "${storageID[$i]}" != "" ] ; then
       storageSize[$i]=$(echo "$syno_diskspace" | grep "$OID_StorageSize.${storageID[$i]}" | cut -d "=" -f2 )
       storageSizeUsed[$i]=$(echo "$syno_diskspace" | grep "$OID_StorageSizeUsed.${storageID[$i]}" | cut -d "=" -f2 )
       storageAllocationUnits[$i]=$(echo "$syno_diskspace" | grep "$OID_StorageAllocationUnits.${storageID[$i]}" | cut -d "=" -f2 )
       storagePercentUsed[$i]=$((${storageSizeUsed[$i]} * 100 / ${storageSize[$i]}))
       storagePercentUsedString[$i]="${storagePercentUsed[$i]}% used"

       if [ "${storagePercentUsed[$i]}" -gt "$warningStorage" ] ; then
           if [ "${storagePercentUsed[$i]}" -gt "$criticalStorage" ] ; then
               healthCriticalStatus=1
               storagePercentUsedString[$i]="${storagePercentUsedString[$i]} CRITICAL"
               healthString="$healthString,${RAIDName[$i]}: ${storagePercentUsedString[$i]}"
           else
               healthWarningStatus=1
               storagePercentUsedString[$i]="${storagePercentUsedString[$i]} WARNING"
               healthString="$healthString,${RAIDName[$i]}: ${storagePercentUsedString[$i]}"
           fi
       fi
   fi

        case ${RAIDStatus[$i]} in
      "1")   RAIDStatus[$i]="Normal";      ;;
      "2")   RAIDStatus[$i]="Repairing";      healthWarningStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
      "3")   RAIDStatus[$i]="Migrating";      healthWarningStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
      "4")   RAIDStatus[$i]="Expanding";      healthWarningStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
      "5")   RAIDStatus[$i]="Deleting";      healthWarningStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
      "6")   RAIDStatus[$i]="Creating";      healthWarningStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
      "7")   RAIDStatus[$i]="RaidSyncing";      healthWarningStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
      "8")   RAIDStatus[$i]="RaidParityChecking";   healthWarningStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
      "9")   RAIDStatus[$i]="RaidAssembling";   healthWarningStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
      "10")   RAIDStatus[$i]="Canceling";      healthWarningStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
      "11")   RAIDStatus[$i]="Degrade";      healthCriticalStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
      "12")   RAIDStatus[$i]="Crashed";      healthCriticalStatus=1;      healthString="$healthString, RAID status (${RAIDName[$i]}): ${RAIDStatus[$i]} ";;
        esac
    done

    if [ "$verbose" = "yes" ] ; then   
   echo "Synology model:      $model" ;
   echo "Synology s/n:      $serialNumber" ;
   echo "DSM Version:      $DSMVersion" ;
   echo "DSM update:       $DSMUpgradeAvailable" ;
   echo "System Status:       $systemStatus" ;
   echo "Temperature:       $temperature" ;
   echo "Power Status:       $powerStatus" ;
   echo "System Fan Status:    $systemFanStatus" ;
   echo "CPU Fan Status:       $CPUFanStatus" ;
   echo "Number of disks:         $nbDisk" ;
   for i in `seq 1 $nbDisk`;
       do
      echo " ${diskID[$i]} (model:${diskModel[$i]}) status:${diskStatus[$i]} temperature:${diskTemp[$i]}" ;
   done
   echo "Number of RAID volume:   $nbRAID" ;
   for i in `seq 1 $nbRAID`;
       do
      echo " ${RAIDName[$i]} status:${RAIDStatus[$i]} ${storagePercentUsedString[$i]}" ;
   done

   # Display UPS information
   if [ "$ups" = "yes" ] ; then
       upsModel=$(echo "$syno" | grep $OID_UpsModel | cut -d "=" -f2)
       upsSN=$(echo "$syno" | grep $OID_UpsSN | cut -d "=" -f2)
       upsStatus=$(echo "$syno" | grep $OID_UpsStatus | cut -d "=" -f2)
       upsLoad=$(echo "$syno" | grep $OID_UpsLoad | cut -d "=" -f2)
       upsBatteryCharge=$(echo "$syno" | grep $OID_UpsBatteryCharge | cut -d "=" -f2)
       upsBatteryChargeWarning=$(echo "$syno" | grep $OID_UpsBatteryChargeWarning | cut -d "=" -f2)

       echo "UPS:"
       echo "  Model:      $upsModel"
       echo "  s/n:         $upsSN"
       echo "  Status:      $upsStatus"
       echo "  Load:         $upsLoad"
       echo "  Battery charge:   $upsBatteryCharge"
       echo "  Battery charge warning:$upsBatteryChargeWarning"
   fi

   echo "";
    fi

    if [ "$healthCriticalStatus" = "1" ] ; then
   echo "CRITICAL - $healthString"
   exit 2
    fi
    if [ "$healthWarningStatus" = "1" ] ; then
   echo "WARNING - $healthString"
   exit 1
    fi
    if [ "$healthCriticalStatus" = "0" ] && [ "$healthWarningStatus" = "0" ] ; then
   echo "OK - $healthString is in good health"
   exit 0
    fi
fi
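
A side note on the script above: its -gt comparisons assume that every SNMP value came back as a number. If a walk returns nothing for an OID (for instance when the NAS answers slowly), a test such as [ "" -gt "$warningTemperature" ] fails with "integer expression expected" instead of flagging anything. Below is a minimal sketch of a guard. The helper names is_uint and check_temp are hypothetical and not part of the posted plugin, and reporting UNKNOWN for an unusable value is an assumption about the desired behaviour.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the posted plugin.

# Succeeds only when $1 is a non-empty string of decimal digits.
is_uint() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *)           return 0 ;;
    esac
}

# Classify a temperature against warning/critical thresholds.
# Prints UNKNOWN instead of attempting a numeric comparison on bad input.
check_temp() {
    local value="$1" warn="$2" crit="$3"
    if ! is_uint "$value"; then
        echo "UNKNOWN"
    elif [ "$value" -gt "$crit" ]; then
        echo "CRITICAL"
    elif [ "$value" -gt "$warn" ]; then
        echo "WARNING"
    else
        echo "OK"
    fi
}
```

Usage would look like: state=$(check_temp "${diskTemp[$i]}" "$warningTemperature" "$criticalTemperature")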

Re: check_snmp_synology - False Positives

Postby cdienger » Fri Dec 28, 2018 2:44 pm

I would keep the original file as is. It shouldn't add any additional traffic.

Where were you seeing the timeout messages? You can keep an eye out for them by running:

tail -f /usr/local/nagios/var/nagios.log | grep "DC_****"
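
To narrow the log down to timeout alerts only, a small filter may help. This is a rough sketch, assuming the alert format shown in this thread; timeouts_for_host is a hypothetical function name, and grep -F is used so that the host name is matched literally rather than as a regular expression.

```shell
#!/bin/bash
# Hypothetical helper: read a Nagios log on stdin and print only the
# "Service check timed out" alerts for the given host.
timeouts_for_host() {
    local host="$1"
    grep -F "SERVICE ALERT: ${host};" | grep -F "Service check timed out"
}
```

For example: timeouts_for_host DC_SAN < /usr/local/nagios/var/nagios.log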

Re: check_snmp_synology - False Positives

Postby chris1337c » Fri Dec 28, 2018 2:51 pm

The global timeout is 180s. The Synology consistently throws false positives, for example:

This morning:
[12-28-2018 02:52:52] SERVICE ALERT: DC_SAN;Global Health Status;CRITICAL;HARD;3;(Service check timed out after 180.02 seconds)
Service Notification[12-28-2018 02:52:52] SERVICE NOTIFICATION: ccichon;DC_SAN;Global Health Status;CRITICAL;notify-service-by-email;(Service check timed out after 180.02 seconds)
Service Notification[12-28-2018 02:52:52] SERVICE NOTIFICATION: nagiosadmin;DC_SAN;Global Health Status;CRITICAL;notify-service-by-email;(Service check timed out after 180.02 seconds)
Informational Message[12-28-2018 02:52:52] Warning: Check of service 'Global Health Status' on host 'DC_SAN' timed out after 180.019s!
Informational Message[12-28-2018 02:52:52] wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
Informational Message[12-28-2018 02:52:52] wproc: host=DC_SAN; service=Global Health Status;
Informational Message[12-28-2018 02:52:52] wproc: CHECK job 137580 from worker Core Worker 14080 timed out after 180.02s
Informational Message[12-28-2018 02:52:52] wproc: Core Worker 14080: job 137580 (pid=58305) timed out. Killing it
Informational Message[12-28-2018 02:47:52] wproc: Core Worker 14081: job 137537 (pid=52055): Dormant child reaped
Service Critical[12-28-2018 02:47:52] SERVICE ALERT: DC_SAN;Global Health Status;CRITICAL;SOFT;2;(Service check timed out after 180.04 seconds)
Informational Message[12-28-2018 02:47:52] Warning: Check of service 'Global Health Status' on host 'DC_SAN' timed out after 180.035s!
Informational Message[12-28-2018 02:47:52] wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
Informational Message[12-28-2018 02:47:52] wproc: host=DC_SAN; service=Global Health Status;
Informational Message[12-28-2018 02:47:52] wproc: CHECK job 137537 from worker Core Worker 14081 timed out after 180.04s
Informational Message[12-28-2018 02:47:52] wproc: Core Worker 14081: job 137537 (pid=52055) timed out. Killing it
Informational Message[12-28-2018 02:46:18] Auto-save of retention data completed successfully.
Informational Message[12-28-2018 02:42:52] wproc: Core Worker 14078: job 137492 (pid=45830): Dormant child reaped
Service Critical[12-28-2018 02:42:52] SERVICE ALERT: DC_SAN;Global Health Status;CRITICAL;SOFT;1;(Service check timed out after 180.02 seconds)
Informational Message[12-28-2018 02:42:52] Warning: Check of service 'Global Health Status' on host 'DC_SAN' timed out after 180.017s!
Informational Message[12-28-2018 02:42:52] wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
Informational Message[12-28-2018 02:42:52] wproc: host=DC_SAN; service=Global Health Status;
Informational Message[12-28-2018 02:42:52] wproc: CHECK job 137492 from worker Core Worker 14078 timed out after 180.02s
Informational Message[12-28-2018 02:42:52] wproc: Core Worker 14078: job 137492 (pid=45830) timed out. Killing it

Re: check_snmp_synology - False Positives

Postby chris1337c » Fri Dec 28, 2018 2:52 pm

I will stop the tcpdump on Monday and compare it to the logs; I will probably need some assistance to find the root cause. I am currently checking all of the backup schedules to try to space them out. I wonder whether the Synology simply cannot handle the backup/replication load, and the SNMP timeouts happen because it is overloaded with backups. Or is that not possible given the way SNMP works?
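
One way to test the overload theory might be to time a manual walk during a backup window and again while the NAS is idle. Below is a minimal sketch, assuming GNU coreutils timeout is available on the Nagios server. timed_run is a hypothetical helper, and the community string, address and OID in the usage line are placeholders.

```shell
#!/bin/bash
# Hypothetical helper: run a command with a hard time limit (seconds) and
# report how long it took and how it exited.
# With GNU coreutils `timeout`, exit code 124 means the limit was hit.
timed_run() {
    local limit="$1"; shift
    local start end rc
    start=$(date +%s)
    timeout "$limit" "$@" >/dev/null 2>&1
    rc=$?
    end=$(date +%s)
    echo "elapsed=$(( end - start ))s rc=$rc"
}
```

For example: timed_run 30 snmpwalk -v 2c -c COMMUNITY a.b.c.d OID
A result of rc=124 would mean the walk did not finish within 30 seconds.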
