
Issues with passive checking in Nagios

Posted: Mon Mar 10, 2014 2:40 pm
by mkot
Hi, I've got a Nagios machine that receives passive check results from defined hosts via NSCA. Everything would be fine, but some hosts go crazy:

Code:

Mar 10 20:20:42 nagios nsca[4877]: SERVICE CHECK -> Host Name: 'PN-ACIESIELSKI1', Service Description: 'MEM Load', Return Code: '0', Output: 'OK: physical memory: Total: 2.97G - Used: 880M (28%) - Free: 2.11G (72%)|'physical memory %'=28%;80;90 'physical memory'=879.5M;2429.69;2733.4;0;3037.11'
Mar 10 20:20:42 nagios nagios: EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;PN-ACIESIELSKI1;MEM Load;0;OK: physical memory: Total: 2.97G - Used: 880M (28%) - Free: 2.11G (72%)|'physical memory %'=28%;80;90 'physical memory'=879.5M;2429.69;2733.4;0;3037.11
Mar 10 20:20:42 nagios nagios: PASSIVE SERVICE CHECK: PN-ACIESIELSKI1;MEM Load;0;OK: physical memory: Total: 2.97G - Used: 880M (28%) - Free: 2.11G (72%)
Mar 10 20:20:42 nagios nagios: SERVICE ALERT: PN-ACIESIELSKI1;MEM Load;OK;HARD;3;OK: physical memory: Total: 2.97G - Used: 880M (28%) - Free: 2.11G (72%)
Mar 10 20:21:53 nagios nagios: SERVICE ALERT: PN-ACIESIELSKI1;DISK Load C;CRITICAL;HARD;3;(No output on stdout) stderr: execvp(/usr/local/nagios/libexec/check_nrpe, ...) failed. errno is 2: No such file or directory
Mar 10 20:21:53 nagios nagios: SERVICE NOTIFICATION: nagiosadmin;PN-ACIESIELSKI1;DISK Load C;CRITICAL;notify-service-by-email;(No output on stdout) stderr: execvp(/usr/local/nagios/libexec/check_nrpe, ...) failed. errno is 2: No such file or directory
Mar 10 20:22:55 nagios nsca[5076]: SERVICE CHECK -> Host Name: 'PN-ACIESIELSKI1', Service Description: 'DISK Load C', Return Code: '0', Output: 'OK: All drives within bounds.|'C: %'=61%;10;5 'C:'=38.62G;9.75999;4.87999;0;97.65'
Mar 10 20:22:55 nagios nagios: EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;PN-ACIESIELSKI1;DISK Load C;0;OK: All drives within bounds.|'C: %'=61%;10;5 'C:'=38.62G;9.75999;4.87999;0;97.65
Mar 10 20:22:55 nagios nagios: PASSIVE SERVICE CHECK: PN-ACIESIELSKI1;DISK Load C;0;OK: All drives within bounds.
Mar 10 20:22:55 nagios nagios: SERVICE ALERT: PN-ACIESIELSKI1;DISK Load C;OK;HARD;3;OK: All drives within bounds.
Mar 10 20:25:49 nagios nsca[5386]: SERVICE CHECK -> Host Name: 'PN-ACIESIELSKI1', Service Description: 'CPU Load', Return Code: '0', Output: 'OK CPU Load ok.|'5m'=4%;80;90 '1m'=3%;80;90 '30s'=4%;80;90'
Mar 10 20:25:49 nagios nagios: EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;PN-ACIESIELSKI1;CPU Load;0;OK CPU Load ok.|'5m'=4%;80;90 '1m'=3%;80;90 '30s'=4%;80;90
Mar 10 20:25:49 nagios nagios: PASSIVE SERVICE CHECK: PN-ACIESIELSKI1;CPU Load;0;OK CPU Load ok.
Mar 10 20:25:49 nagios nagios: SERVICE ALERT: PN-ACIESIELSKI1;CPU Load;OK;HARD;3;OK CPU Load ok.
Mar 10 20:30:05 nagios nagios: SERVICE ALERT: PN-ACIESIELSKI1;MEM Load;CRITICAL;SOFT;1;(No output on stdout) stderr: execvp(/usr/local/nagios/libexec/check_nrpe, ...) failed. errno is 2: No such file or directory
Mar 10 20:30:36 nagios nagios: SERVICE ALERT: PN-ACIESIELSKI1;CPU Load;CRITICAL;SOFT;1;(No output on stdout) stderr: execvp(/usr/local/nagios/libexec/check_nrpe, ...) failed. errno is 2: No such file or directory
Mar 10 20:30:41 nagios nsca[5904]: SERVICE CHECK -> Host Name: 'PN-ACIESIELSKI1', Service Description: 'MEM Load', Return Code: '0', Output: 'OK: physical memory: Total: 2.97G - Used: 879M (28%) - Free: 2.11G (72%)|'physical memory %'=28%;80;90 'physical memory'=879.06M;2429.69;2733.4;0;3037.11'
Mar 10 20:30:41 nagios nagios: EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;PN-ACIESIELSKI1;MEM Load;0;OK: physical memory: Total: 2.97G - Used: 879M (28%) - Free: 2.11G (72%)|'physical memory %'=28%;80;90 'physical memory'=879.06M;2429.69;2733.4;0;3037.11
Mar 10 20:30:41 nagios nagios: PASSIVE SERVICE CHECK: PN-ACIESIELSKI1;MEM Load;0;OK: physical memory: Total: 2.97G - Used: 879M (28%) - Free: 2.11G (72%)
Mar 10 20:30:41 nagios nagios: SERVICE ALERT: PN-ACIESIELSKI1;MEM Load;OK;SOFT;2;OK: physical memory: Total: 2.97G - Used: 879M (28%) - Free: 2.11G (72%)
Mar 10 20:31:53 nagios nagios: SERVICE ALERT: PN-ACIESIELSKI1;DISK Load C;CRITICAL;SOFT;1;(No output on stdout) stderr: execvp(/usr/local/nagios/libexec/check_nrpe, ...) failed. errno is 2: No such file or directory
Mar 10 20:32:05 nagios nagios: SERVICE ALERT: PN-ACIESIELSKI1;MEM Load;CRITICAL;SOFT;1;(No output on stdout) stderr: execvp(/usr/local/nagios/libexec/check_nrpe, ...) failed. errno is 2: No such file or directory
Mar 10 20:32:36 nagios nagios: SERVICE ALERT: PN-ACIESIELSKI1;CPU Load;CRITICAL;SOFT;2;(No output on stdout) stderr: execvp(/usr/local/nagios/libexec/check_nrpe, ...) failed. errno is 2: No such file or directory
Mar 10 20:32:55 nagios nsca[6114]: SERVICE CHECK -> Host Name: 'PN-ACIESIELSKI1', Service Description: 'DISK Load C', Return Code: '0', Output: 'OK: All drives within bounds.|'C: %'=61%;10;5 'C:'=38.62G;9.75999;4.87999;0;97.65'
Mar 10 20:32:55 nagios nagios: EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;PN-ACIESIELSKI1;DISK Load C;0;OK: All drives within bounds.|'C: %'=61%;10;5 'C:'=38.62G;9.75999;4.87999;0;97.65
Mar 10 20:32:55 nagios nagios: PASSIVE SERVICE CHECK: PN-ACIESIELSKI1;DISK Load C;0;OK: All drives within bounds.
Mar 10 20:32:55 nagios nagios: SERVICE ALERT: PN-ACIESIELSKI1;DISK Load C;OK;SOFT;2;OK: All drives within bounds.

I don't understand what is happening. First I get something like this:

Code:

Mar 10 20:20:42 nagios nagios: PASSIVE SERVICE CHECK: PN-ACIESIELSKI1;MEM Load;0;OK: physical memory: Total: 2.97G - Used: 880M (28%) - Free: 2.11G (72%)
Mar 10 20:20:42 nagios nagios: SERVICE ALERT: PN-ACIESIELSKI1;MEM Load;OK;HARD;3;OK: physical memory: Total: 2.97G - Used: 880M (28%) - Free: 2.11G (72%)
And then I get something like this:

Code:

Mar 10 20:31:53 nagios nagios: SERVICE ALERT: PN-ACIESIELSKI1;DISK Load C;CRITICAL;SOFT;1;(No output on stdout) stderr: execvp(/usr/local/nagios/libexec/check_nrpe, ...) failed. errno is 2: No such file or directory
Mar 10 20:32:05 nagios nagios: SERVICE ALERT: PN-ACIESIELSKI1;MEM Load;CRITICAL;SOFT;1;(No output on stdout) stderr: execvp(/usr/local/nagios/libexec/check_nrpe, ...) failed. errno is 2: No such file or directory
Mar 10 20:32:36 nagios nagios: SERVICE ALERT: PN-ACIESIELSKI1;CPU Load;CRITICAL;SOFT;2;(No output on stdout) stderr: execvp(/usr/local/nagios/libexec/check_nrpe, ...) failed. errno is 2: No such file or directory
Can anyone tell me what is going on here?

What other information do you need to help me?

Re: Issues with passive checking in Nagios

Posted: Mon Mar 10, 2014 3:58 pm
by sreinhardt
It looks like you've got a mix of NRPE and NSCA, so let's start with just one issue first. Could you post one of the NSCA configs that runs on the remote systems having issues? Also, are you seeing this fairly constantly, or just off and on? Have you noticed any pattern or timing for when it happens?
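One way to see that mix in the posted log is to split the lines by where each result came from. The following is only a rough sketch in Python, not part of Nagios: the function name `classify` and the two labels are made up for illustration. It assumes the syslog format shown in the first post, where results delivered over NSCA are logged as `PASSIVE SERVICE CHECK`, and it treats an alert whose output mentions `check_nrpe` as the result of a check command that Nagios tried to run itself.

```python
import re
from collections import Counter

# Matches the two Nagios log line types of interest in the posted excerpt.
# Assumption: fields are separated by ';' exactly as in that log.
LINE_RE = re.compile(
    r"nagios: (?P<kind>PASSIVE SERVICE CHECK|SERVICE ALERT): "
    r"(?P<host>[^;]+);(?P<service>[^;]+);(?P<rest>.*)"
)

def classify(lines):
    """Count, per (host, service), passive results and check_nrpe failures."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # nsca[...] and EXTERNAL COMMAND lines are skipped
        key = (m["host"], m["service"])
        if m["kind"] == "PASSIVE SERVICE CHECK":
            counts[key + ("passive_result",)] += 1
        elif "check_nrpe" in m["rest"]:
            # The execvp error is produced when Nagios tries to launch the
            # plugin itself, so this is counted as an active check failure.
            counts[key + ("active_nrpe_failure",)] += 1
    return counts

# Three lines copied from the log in the first post.
sample = [
    "Mar 10 20:30:05 nagios nagios: SERVICE ALERT: PN-ACIESIELSKI1;MEM Load;CRITICAL;SOFT;1;(No output on stdout) stderr: execvp(/usr/local/nagios/libexec/check_nrpe, ...) failed. errno is 2: No such file or directory",
    "Mar 10 20:30:41 nagios nagios: PASSIVE SERVICE CHECK: PN-ACIESIELSKI1;MEM Load;0;OK: physical memory: Total: 2.97G - Used: 879M (28%) - Free: 2.11G (72%)",
    "Mar 10 20:30:41 nagios nagios: SERVICE ALERT: PN-ACIESIELSKI1;MEM Load;OK;SOFT;2;OK: physical memory: Total: 2.97G - Used: 879M (28%) - Free: 2.11G (72%)",
]

for key, n in sorted(classify(sample).items()):
    print(key, n)
```

If the same service shows up under both labels, as MEM Load does in the sample, the service is probably receiving passive results and being actively checked at the same time.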

Re: Issues with passive checking in Nagios

Posted: Tue Mar 11, 2014 12:32 am
by mkot
Hi, thanks for your reply. I'm using NSCA to monitor Windows hosts (PCs and notebooks) and NRPE to monitor Windows Server hosts.

Here is the config from this host (nsclient.ini):

Code:

# If you want to fill this file with all available options run the following command:
#   nscp settings --generate --add-defaults --load-all
# If you want to activate a module and bring in all its options use:
#   nscp settings --activate-module <MODULE NAME> --add-defaults
# For details run: nscp settings --help


; Undocumented section
[/modules]

; CheckDisk - CheckDisk can check various file and disk related things. The current version has commands to check Size of hard drives and directories.
CheckDisk = 1

; Event log Checker. - Check for errors and warnings in the event log. This is only supported through NRPE so if you plan to use only NSClient this wont help you at all.
CheckEventLog = 1

; Check External Scripts - A simple wrapper to run external scripts and batch files.
CheckExternalScripts = 1

; Helper function - Various helper function to extend other checks. This is also only supported through NRPE.
CheckHelpers = 1

; Event log Checker. - Check for errors and warnings in the event log. This is only supported through NRPE so if you plan to use only NSClient this wont help you at all.
CheckLogFile = 0

; Check NSCP - Checks the state of the agent
CheckNSCP = 1

; CheckSystem - Various system related checks, such as CPU load, process state, service state memory usage and PDH counters.
CheckSystem = 1

; CheckTaskSched - CheckTaskSched can check various file and disk related things. The current version has commands to check Size of hard drives and directories.
CheckTaskSched = 0

; CheckTaskSched2 - CheckTaskSched2 can check various file and disk related things. The current version has commands to check Size of hard drives and directories.
CheckTaskSched2 = 0

; CheckWMI - CheckWMI can check various file and disk related things. The current version has commands to check Size of hard drives and directories.
CheckWMI = 1

; DotnetPlugin - Plugin to load and manage plugins written in dot net.
DotnetPlugins = 0

; GraphiteClient - Graphite client
GraphiteClient = 0

; LUAScript - LUAScript...
LUAScript = 0

; NRDPClient - Passive check support over NRDP
NRDPClient = 0

; NRPE client - NRPE client 
NRPEClient = 0

; NRPE server - A simple server that listens for incoming NRPE connection and handles them.
NRPEServer = 1

; NSCAClient - Passive check support over NSCA.
NSCAClient = 1

; NSCA server (no encryption) - A simple server that listens for incoming NSCA connection and handles them.
NSCAServer = 1

; NSClient server - A simple server that listens for incoming NSClient (check_nt) connection and handles them. Although NRPE is the preferred method NSClient is fully supported and can be used for simplicity or for compatibility.
NSClientServer = 1

; SMTPClient - Passive check support via SMTP
SMTPClient = 0

; Scheduler - A scheduler which schedules checks at regular intervals
Scheduler = 1

; SimpleCache module - Caches results for later checking.
SimpleCache = 0

; SimpleFileWriter module - FileWriters results for later checking.
SimpleFileWriter = 0

; SyslogClient - Passive check support via Syslog
SyslogClient = 0


; Undocumented section
[/settings/default]

; ALLOWED CIPHERS - A better value is: ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH
allowed ciphers = ADH

; ALLOWED HOSTS - A comma-separated list of allowed hosts. You can use netmasks (/ syntax) or * to create ranges.
allowed hosts = 172.17.24.182
;allowed hosts = 192.168.20.120

; BIND TO ADDRESS - Allows you to bind server to a specific local address. This has to be a dotted ip address not a host name. Leaving this blank will bind to all available IP addresses.
bind to = 

; CACHE ALLOWED HOSTS - If hostnames should be cached, improves speed and security somewhat but wont allow you to have dynamic IPs for your nagios server.
cache allowed hosts = true

; SSL CERTIFICATE - 
certificate = 

; INBOX - The default channel to post incoming messages on
inbox = inbox

; PASSWORD - Password used to authenticate against the server
password = passwd

; TIMEOUT - Timeout when reading packets on incoming sockets. If the data has not arrived within this time we will bail out.
timeout = 30

; ENABLE SSL ENCRYPTION - This option controls if SSL should be enabled.
use ssl = true

; VERIFY MODE - 
verify mode = none


; A list of aliases available. An alias is an internal command that has been "wrapped" (to add arguments). Be careful so you don't create loops (ie check_loop=check_a, check_a=check_loop)
[/settings/external scripts/alias]


; alias_cpu - Alias for alias_cpu. To configure this item add a section called: /settings/external scripts/alias/alias_cpu
alias_cpu = checkCPU warn=80 crit=90 time=5m time=1m time=30s

; alias_cpu_ex - Alias for alias_cpu_ex. To configure this item add a section called: /settings/external scripts/alias/alias_cpu_ex
alias_cpu_ex = checkCPU warn=$ARG1$ crit=$ARG2$ time=5m time=1m time=30s

; alias_disk - Alias for alias_disk. To configure this item add a section called: /settings/external scripts/alias/alias_disk
;alias_disk = CheckDriveSize MinWarn=10% MinCrit=5% CheckAll FilterType=FIXED
alias_disk_c = CheckDriveSize MinWarn=10% MinCrit=5% Drive=C FilterType=FIXED
;alias_disk_d = CheckDriveSize MinWarn=10% MinCrit=5% Drive=D FilterType=FIXED

; alias_disk_loose - Alias for alias_disk_loose. To configure this item add a section called: /settings/external scripts/alias/alias_disk_loose
alias_disk_loose = CheckDriveSize MinWarn=10% MinCrit=5% CheckAll FilterType=FIXED ignore-unreadable

; alias_event_log - Alias for alias_event_log. To configure this item add a section called: /settings/external scripts/alias/alias_event_log
alias_event_log = CheckEventLog file=application file=system MaxWarn=1 MaxCrit=1 "filter=generated gt -2d AND severity NOT IN ('success', 'informational') AND source != 'SideBySide'" truncate=800 unique descriptions "syntax=%severity%: %source%: %message% (%count%)"

; alias_file_age - Alias for alias_file_age. To configure this item add a section called: /settings/external scripts/alias/alias_file_age
alias_file_age = checkFile2 filter=out "file=$ARG1$" filter-written=>1d MaxWarn=1 MaxCrit=1 "syntax=%filename% %write%"

; alias_file_size - Alias for alias_file_size. To configure this item add a section called: /settings/external scripts/alias/alias_file_size
alias_file_size = CheckFiles "filter=size > $ARG2$" "path=$ARG1$" MaxWarn=1 MaxCrit=1 "syntax=%filename% %size%" max-dir-depth=10

; alias_mem - Alias for alias_mem. To configure this item add a section called: /settings/external scripts/alias/alias_mem
;alias_mem = checkMem MaxWarn=80% MaxCrit=90% ShowAll=long type=physical type=virtual type=paged type=page
alias_mem = checkMem MaxWarn=80% MaxCrit=90% ShowAll=long type=physical

; alias_process - Alias for alias_process. To configure this item add a section called: /settings/external scripts/alias/alias_process
alias_process = checkProcState "$ARG1$=started"

; alias_process_count - Alias for alias_process_count. To configure this item add a section called: /settings/external scripts/alias/alias_process_count
alias_process_count = checkProcState MaxWarnCount=$ARG2$ MaxCritCount=$ARG3$ "$ARG1$=started"

; alias_process_hung - Alias for alias_process_hung. To configure this item add a section called: /settings/external scripts/alias/alias_process_hung
alias_process_hung = checkProcState MaxWarnCount=1 MaxCritCount=1 "$ARG1$=hung"

; alias_process_stopped - Alias for alias_process_stopped. To configure this item add a section called: /settings/external scripts/alias/alias_process_stopped
alias_process_stopped = checkProcState "$ARG1$=stopped"

; alias_sched_all - Alias for alias_sched_all. To configure this item add a section called: /settings/external scripts/alias/alias_sched_all
alias_sched_all = CheckTaskSched "filter=exit_code ne 0" "syntax=%title%: %exit_code%" warn=>0

; alias_sched_long - Alias for alias_sched_long. To configure this item add a section called: /settings/external scripts/alias/alias_sched_long
alias_sched_long = CheckTaskSched "filter=status = 'running' AND most_recent_run_time < -$ARG1$" "syntax=%title% (%most_recent_run_time%)" warn=>0

; alias_sched_task - Alias for alias_sched_task. To configure this item add a section called: /settings/external scripts/alias/alias_sched_task
alias_sched_task = CheckTaskSched "filter=title eq '$ARG1$' AND exit_code ne 0" "syntax=%title% (%most_recent_run_time%)" warn=>0

; alias_service - Alias for alias_service. To configure this item add a section called: /settings/external scripts/alias/alias_service
alias_service = checkServiceState CheckAll

; alias_service_ex - Alias for alias_service_ex. To configure this item add a section called: /settings/external scripts/alias/alias_service_ex
alias_service_ex = checkServiceState CheckAll "exclude=Net Driver HPZ12" "exclude=Pml Driver HPZ12" exclude=stisvc

; alias_up - Alias for alias_up. To configure this item add a section called: /settings/external scripts/alias/alias_up
alias_up = checkUpTime MinWarn=1d MinWarn=1h

; alias_updates - Alias for alias_updates. To configure this item add a section called: /settings/external scripts/alias/alias_updates
alias_updates = check_updates -warning 0 -critical 0

; alias_volumes - Alias for alias_volumes. To configure this item add a section called: /settings/external scripts/alias/alias_volumes
alias_volumes = CheckDriveSize MinWarn=10% MinCrit=5% CheckAll=volumes FilterType=FIXED

; alias_volumes_loose - Alias for alias_volumes_loose. To configure this item add a section called: /settings/external scripts/alias/alias_volumes_loose
alias_volumes_loose = CheckDriveSize MinWarn=10% MinCrit=5% CheckAll=volumes FilterType=FIXED ignore-unreadable

; default - Alias for default. To configure this item add a section called: /settings/external scripts/alias/default
default = 


; List all dot net modules loaded by the DotNetplugins module here
[/modules/dotnet]


; Section for SMTP passive check module.
[/settings/NRDP/client]

; CHANNEL - The channel to listen to.
channel = NRDP

; HOSTNAME - The host name of this host if set to blank (default) the windows name of the computer will be used.
hostname = auto


; Target definition for: default
[/settings/NRDP/client/targets/default]

; TARGET ADDRESS - Target host address
address = 

; RECIPIENT - Recipient of email message
recipient = nscp@localhost

; SENDER - Sender of email message
sender = nscp@localhost

; TEMPLATE - Template for message data
template = Hello, this is %source% reporting %message%!

; TIMEOUT - Timeout when reading/writing packets to/from sockets.
timeout = 30


; Section for NRPE active/passive check module.
[/settings/NRPE/client]

; CHANNEL - The channel to listen to.
channel = NRPE


; Target definition for: default
[/settings/NRPE/client/targets/default]

; TARGET ADDRESS - Target host address
address = 

; ALLOWED CIPHERS - A better value is: ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH
allowed ciphers = ADH

; SSL CERTIFICATE - 
certificate = 

; PAYLOAD LENGTH - Length of payload to/from the NRPE agent. This is a hard specific value so you have to "configure" (read recompile) your NRPE agent to use the same value for it to work.
payload length = 1024

; TIMEOUT - Timeout when reading/writing packets to/from sockets.
timeout = 30

; ENABLE SSL ENCRYPTION - This option controls if SSL should be enabled.
use ssl = true

; VERIFY MODE - 
verify mode = none


; Section for NRPE (NRPEServer.dll) (check_nrpe) protocol options.
[/settings/NRPE/server]

; COMMAND ARGUMENT PROCESSING - This option determines whether or not the we will allow clients to specify arguments to commands that are executed.
allow arguments = false

; COMMAND ALLOW NASTY META CHARS - This option determines whether or not the we will allow clients to specify nasty (as in |`&><'"\[]{}) characters in arguments.
allow nasty characters = false

; PORT NUMBER - Port to use for NRPE.
port = 5666


; Section for NSCA passive check module.
[/settings/NSCA/client]
delay=0

; CHANNEL - The channel to listen to.
channel = NSCA

; HOSTNAME - The host name of this host if set to blank (default) the windows name of the computer will be used.
hostname = auto
;hostname =  WA-MKOT2

; Target definition for: default
[/settings/NSCA/client/targets/default]

; TARGET ADDRESS - Target host address
address = 172.17.24.182
;address = 192.168.20.120

; ALLOWED CIPHERS - A better value is: ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH
;allowed ciphers = ADH

; SSL CERTIFICATE - 
;certificate = 

; ENCRYPTION METHOD - Number corresponding to the various encryption algorithms (see the wiki). Has to be the same as the server or it wont work at all.
encryption = aes
;encryption = xor

; PASSWORD - The password to use. Again has to be the same as the server or it wont work at all.
password = passwd

; Target server port
port=5667

; TIMEOUT - Timeout when reading/writing packets to/from sockets.
timeout = 30

; ENABLE SSL ENCRYPTION - This option controls if SSL should be enabled.
use ssl = false

; VERIFY MODE - 
verify mode = none


; Section for NSCA (NSCAServer) (check_nsca) protocol options.
[/settings/NSCA/server]

; ENCRYPTION - Encryption to use
encryption = aes
;encryption = xor

; PASSWORD - Password to use
password = passwd

; PAYLOAD LENGTH - Length of payload to/from the NSCA agent. This is a hard specific value so you have to "configure" (read recompile) your NSCA agent to use the same value for it to work.
payload length = 512

; PERFORMANCE DATA - Send performance data back to nagios (set this to 0 to remove all performance data).
performance data = true

; PORT NUMBER - Port to use for NSCA.
port = 5667

; ENABLE SSL ENCRYPTION - This option controls if SSL should be enabled.
use ssl = false


; Section for NSClient (NSClientServer.dll) (check_nt) protocol options.
[/settings/NSClient/server]

; PERFORMANCE DATA - Send performance data back to nagios (set this to 0 to remove all performance data).
performance data = true

; PORT NUMBER - Port to use for check_nt.
port = 12489


; Section for SMTP passive check module.
[/settings/SMTP/client]

; CHANNEL - The channel to listen to.
channel = SMTP


; Target definition for: default
[/settings/SMTP/client/targets/default]

; TARGET ADDRESS - Target host address
address = 

; RECIPIENT - Recipient of email message
recipient = nscp@localhost

; SENDER - Sender of email message
sender = nscp@localhost

; TEMPLATE - Template for message data
template = Hello, this is %source% reporting %message%!

; TIMEOUT - Timeout when reading/writing packets to/from sockets.
timeout = 30


; Section for simple cache module (SimpleCache.dll).
[/settings/cache]

; CHANNEL - The channel to listen to.
channel = CACHE

; PRIMARY CACHE INDEX - Set this to the value you want to use as unique key for the cache (host, command, result,...).
primary index = ${alias-or-command}


; Section for system checks and system settings
[/settings/check/task schedule]

; SYNTAX - Set this to use a specific syntax string for all commands (that don't specify one)
default buffer length = %title% last run: %most-recent-run-time% (%exit-code%)


; Configure crash handling properties.
[/settings/crash]

; ARCHIVE CRASHREPORTS - Archive crash reports in the archive folder
archive = true

; folder - The archive folder for crash dumps.
archive folder = ${shared-path}/crash-dumps

; RESTART - Submit crash reports to nsclient.org (or your configured submission server)
restart = true

; RESTART SERVICE NAME - The url to submit crash reports to
restart target = NSClientpp

; SUBMIT CRASHREPORTS - Submit crash reports to nsclient.org (or your configured submission server)
submit = false

; SUBMISSION URL - The url to submit crash reports to
submit url = http://crash.nsclient.org/submit


; Section for the EventLog Checker (CheckEventLog.dll).
[/settings/eventlog]

; BUFFER_SIZE - The size of the buffer to use when getting messages this affects the speed and maximum size of messages you can receive.
buffer size = 131072

; DEBUG - Log more information when filtering (useful to detect issues with filters) not useful in production as it is a bit of a resource hog.
debug = false

; LOOKUP NAMES - Lookup the names of eventlog files
lookup names = true

; SYNTAX - Set this to use a specific syntax string for all commands (that don't specify one).
syntax = 


; A set of options to configure the real time checks
[/settings/eventlog/real-time]

; DEBUG - Log missed records (useful to detect issues with filters) not useful in production as it is a bit of a resource hog.
debug = false

; REAL TIME CHECKING - Spawns a background thread which detects issues and reports them back instantly.
enabled = false

; LOGS TO CHECK - Comma separated list of logs to check
log = application,system

; STARTUP AGE - The initial age to scan when starting NSClient++
startup age = 30m


; A set of filters to use in real-time mode
[/settings/eventlog/real-time/filters]


; Section for external scripts configuration options (CheckExternalScripts).
[/settings/external scripts]

; COMMAND ARGUMENT PROCESSING - This option determines whether or not the we will allow clients to specify arguments to commands that are executed.
allow arguments = false

; COMMAND ALLOW NASTY META CHARS - This option determines whether or not the we will allow clients to specify nasty (as in |`&><'"\[]{}) characters in arguments.
allow nasty characters = false

; SCRIPT DIRECTORY - Load all scripts in a directory and use them as commands. Probably dangerous but useful if you have loads of scripts :)
script path = 

; COMMAND TIMEOUT - The maximum time in seconds that a command can execute. (if more then this execution will be aborted). NOTICE this only affects external commands not internal ones.
timeout = 60


; A list of scripts available to run from the CheckExternalScripts module. Syntax is: <command>=<script> <arguments>
[/settings/external scripts/scripts]


; A list of wrapped scripts (ie. using the template mechanism)
[/settings/external scripts/wrapped scripts]


; A list of templates for wrapped scripts
[/settings/external scripts/wrappings]

; BATCH FILE WRAPPING - 
bat = scripts\\%SCRIPT% %ARGS%

; POWERSHELL WRAPPING - 
ps1 = cmd /c echo scripts\\%SCRIPT% %ARGS%; exit($lastexitcode) | powershell.exe -command -

; VISUAL BASIC WRAPPING - 
vbs = cscript.exe //T:30 //NoLogo scripts\\lib\\wrapper.vbs %SCRIPT% %ARGS%


; Section for graphite passive check module.
[/settings/graphite/client]

; CHANNEL - The channel to listen to.
channel = GRAPHITE

; HOSTNAME - The host name of this host if set to blank (default) the windows name of the computer will be used.
hostname = auto


; Target definition for: default
[/settings/graphite/client/targets/default]

; TARGET ADDRESS - Target host address
address = 

; PATH FOR VALUES - 
path = system.${hostname}.${check_alias}.${perf_alias}


; Section for configuring the log handling.
[/settings/log]

; 2014-03-04
debug=1

; DATEMASK - The size of the buffer to use when getting messages this affects the speed and maximum size of messages you can receive.
date format = %Y-%m-%d %H:%M:%S

; FILENAME - The file to write log data to. Set this to none to disable log to file.
file name = ${exe-path}/nsclient.log


; Configure log file properties.
[/settings/log/file]

; MAXIMUM FILE SIZE - When file size reaches this it will be truncated to 50% if set to 0 (default) truncation will be disabled
max size = 0

; Path to log file - 2014-03-04
file=C:\NSC.log


; Section for log file checker
[/settings/logfile]

; DEBUG - Log more information to help diagnose errors and configuration problems.
debug = false

; SYNTAX - Set the default syntax to use
syntax = 


; A set of options to configure the real time checks
[/settings/logfile/real-time]

; REAL TIME CHECKING - Spawns a background thread which waits for file changes.
enabled = false


; A set of filters to use in real-time mode
[/settings/logfile/real-time/checks]


; Section for the LUAScripts module.
[/settings/lua]


; A list of scripts available to run from the LuaSCript module.
[/settings/lua/scripts]


; Section for the Scheduler module.
[/settings/scheduler]

; THREAD COUNT - Number of threads to use.
threads = 5


; Section for the Scheduler module.

[/settings/scheduler/schedules/default]
;interval=5m
;interval=10s
channel=NSCA
interval=10m
report=all

; added 2014-02-28

[/settings/scheduler/schedules]
CPU Load=alias_cpu
MEM Load=alias_mem
;DISK Load=alias_disk
DISK Load C=alias_disk_c
;DISK Load D=alias_disk_d
;service=alias_service

; Section for configuring the shared session.
[/settings/shared session]

; LOG LEVEL - Log level to use
enabled = false


; Section for SYSLOG passive check module.
[/settings/syslog/client]

; CHANNEL - The channel to listen to.
channel = syslog

; HOSTNAME - The host name of this host if set to blank (default) the windows name of the computer will be used.
hostname = auto


; Target definition for: default
[/settings/syslog/client/targets/default]

; TARGET ADDRESS - Target host address
address = 

; TODO - 
critical severity = critical

; TODO - 
facility = kernel

; TODO - 
message_syntax = %message%

; TODO - 
ok severity = informational

; TODO - 
severity = error

; TODO - 
tag_syntax = NSCA

; TODO - 
unknown severity = emergency

; TODO - 
warning severity = warning


; Section for system checks and system settings
[/settings/system/windows]

; DEFAULT LENGTH - Used to define the default interval for range buffer checks (ie. CPU).
default buffer length = 1h


; Configure which services have to be in which state
[/settings/system/windows/service mapping]


; A list of available remote target systems
[/settings/targets]


; Section for simple file writer module (SimpleFileWriter.dll).
[/settings/writers/file]

; CHANNEL - The channel to listen to.
channel = FILE

; FILE TO WRITE TO - The filename to write output to.
file = output.txt

; PRIMARY CACHE INDEX - Set this to the value you want to use as unique key for the cache (host, command, result,...).
syntax = ${alias-or-command} ${result} ${message}


And the host/service definitions:

Code:

define host{
	use			passive-tpl	; Inherit default values from a template
	host_name		PN-ACIESIELSKI1	; The name we're giving to this host
	alias			PN-ACIESIELSKI1	; A longer name associated with the host
	address			192.168.1.39	; IP address of the host
	icon_image		workstation.png
	statusmap_image		workstation.gd2
	parents			AC VoIP
	}

define hostgroup{
	hostgroup_name	TAIPN 25/11
	alias		Komputery TAIPN w biurze 25/11
	members 	PN-ACIESIELSKI1, <other_computers>
	}

define service{
	use			generic-service,nagiosgraph
	#host_name	
	hostgroup_name		TAIPN 25/11
	service_description	CPU Load
	#check_command		check_nt!CPULOAD!-l 5,80,90
	check_command		check_nrpe!CPU Load
	notification_interval	30
	notification_options	w,c
	flap_detection_enabled	0
	}



# Create a service for monitoring memory usage
# Change the host_name to match the name of the host you defined above

define service{
	use			generic-service,nagiosgraph
	#host_name		
	hostgroup_name		TAIPN 25/11
	service_description	MEM Load
	#check_command		check_nt!MEMUSE!-w 80 -c 90
	check_command		check_nrpe!MEM Load
	notification_interval	30
	notification_options	w,c
	flap_detection_enabled	0
	}



# Create a service for monitoring C:\ disk usage
# Change the host_name to match the name of the host you defined above

define service{
	use			generic-service,nagiosgraph
	#host_name		
	hostgroup_name		TAIPN 25/11
	service_description	DISK Load C
	#check_command		check_nt!USEDDISKSPACE!-l c -w 80 -c 90
	check_command		check_nrpe!DISK Load C
	notification_interval	60
	notification_options	w,c
	flap_detection_enabled	0
	}


Re: Issues with passive checking in Nagios

Posted: Tue Mar 11, 2014 12:39 am
by mkot
I configured NSCA following this procedure:
http://hyper-choi.blogspot.com/2012/07/ ... k-for.html

And my passive-tpl template:

Code:

define host{
        name                            passive-tpl    	; Template name
	use				generic-host
	check_period			24x7		; Host check period, e.g. 24x7, workhours
	check_interval			5		; Time between checks [min]
	retry_interval			1		; Shorter check interval used when there are problems reaching the host [min]
	max_check_attempts		10		; Maximum number of host check attempts
	check_command			check-host-alive	; Check whether the host is up
	flap_detection_enabled          0       	; Flap detection is disabled
	notification_period		24x7		; Send host notifications at any time
	notification_interval		30		; Interval between notifications [min]
	notification_options		d, r		; Notify when the host goes DOWN (d) or recovers (r)
	contact_groups			admins		; Contacts who will receive the notifications
	active_checks_enabled		0		; Active host checks - 0 (DISABLED), 1 (ENABLED)
	passive_checks_enabled		1		; Passive host checks - 0 (DISABLED), 1 (ENABLED)
        register                        0       	; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
        }

I see this constantly on many hosts. Nagios monitors about 100 hosts this way. About half of them work fine, even though all of these hosts have the same configuration.

Code: Select all


tai@nagios:~$ tree /usr/local/nagios/etc/objects/
/usr/local/nagios/etc/objects/
├── commands.cfg
├── contacts.cfg 
├── localhost.cfg // I don't use it
├── printer.cfg // I don't use it
├── switch.cfg // I don't use it
├── taikr
│   ├── druk-kr.cfg
│   ├── linux-kr.cfg
│   ├── sw-kr.cfg
│   └── TAIKR.cfg
├── taild
│   ├── druk-ld.cfg
│   ├── linux-ld.cfg
│   ├── sw-ld.cfg
│   └── TAILD.cfg
├── taipn
│   ├── druk-pn.cfg
│   ├── linux-pn.cfg
│   ├── sw-pn.cfg
│   ├── TAIPN1.cfg
│   ├── TAIPN2.cfg
│   └── win-srv.cfg
├── taiwa
│   ├── druk-wa.cfg
│   ├── linux-wa.cfg
│   ├── sw-wa.cfg
│   ├── TAIWA.cfg
│   └── win-srv.cfg
├── templates.cfg
├── timeperiods.cfg
└── windows.cfg  // I don't use it

EDIT:
Before I took the screenshot, CPU Load was OK and DISK Load C wasn't.


EDIT2:
I think I solved this issue, but please don't close this thread yet.

OK, as you can see, I'm using two templates in the service definitions: the first is generic-service and the second is nagiosgraph. The generic-service template definition contained:

Code: Select all

# Generic service definition template - This is NOT a real service, just a template!

define service{
        name                            generic-service         ; The 'name' of this service template
        active_checks_enabled           1                       ; Active service checks are enabled
        passive_checks_enabled          1                       ; Passive service checks are enabled/accepted
        parallelize_check               1                       ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1                       ; We should obsess over this service (if necessary)
        check_freshness                 0                       ; Default is to NOT check service 'freshness'
        notifications_enabled           1                       ; Service notifications are enabled
        event_handler_enabled           1                       ; Service event handler is enabled
        flap_detection_enabled          1                       ; Flap detection is enabled
        process_perf_data               1                       ; Process performance data
        retain_status_information       1                       ; Retain status information across program restarts
        retain_nonstatus_information    1                       ; Retain non-status information across program restarts
        is_volatile                     0                       ; The service is not volatile
        check_period                    24x7                    ; The service can be checked at any time of the day
        max_check_attempts              3                       ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           10                      ; Check the service every 10 minutes under normal conditions
        retry_check_interval            2                       ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins                  ; Notifications get sent out to everyone in the 'admins' group
        notification_options            w,u,c,r                 ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           60                      ; Re-notify about service problems every hour
        notification_period             24x7                    ; Notifications can be sent out at any time
         register                        0                      ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }
And I changed

Code: Select all

active_checks_enabled           1                       ; Active service checks are enabled
to

Code: Select all

active_checks_enabled           0                       ; Active service checks are disabled
and restarted the Nagios process.
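
The change matters because, with active checks enabled, Nagios periodically runs the service's check_command itself; here that is check_nrpe, which is not installed on this server, hence the "execvp(/usr/local/nagios/libexec/check_nrpe, ...) failed" results that overwrite the good passive ones. A minimal sketch of a dedicated passive-only template follows. The template name passive-service, the freshness threshold and the check_dummy command are assumptions for illustration, not part of the original setup.

```
# Hypothetical passive-only service template (sketch, not the original config)
define service{
        name                            passive-service         ; Assumed template name
        use                             generic-service         ; Inherit everything else
        active_checks_enabled           0                       ; Never schedule check_command
        passive_checks_enabled          1                       ; Accept results submitted via NSCA
        check_freshness                 1                       ; Optional: notice hosts that stop reporting
        freshness_threshold             900                     ; Assumed value [s] before a result counts as stale
        check_command                   check_dummy!3!"No passive result received"   ; Assumes a check_dummy command is defined in commands.cfg
        register                        0                       ; Template only
        }
```

When a result goes stale, Nagios runs check_command even though active checks are disabled, so it should be something that exists on the server. A check_command set in the service definition itself (such as check_nrpe!MEM Load above) takes precedence over the template's, so it would have to be removed for this fallback to apply.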

Re: Issues with passive checking in Nagios

Posted: Tue Mar 11, 2014 10:41 am
by slansing
Okay, so it looks like at least a couple of checks are working. Any chance we could get a screenshot of the details on that CPU service that is failing? We need to see the full output of that error return.

Re: Issues with passive checking in Nagios

Posted: Wed Mar 12, 2014 2:31 am
by mkot
Hi, actually that host is working fine now, but I've got some hosts with that error:

Code: Select all

CPU Load
Active checks of the service have been disabled - only passive checks are being accepted	Perform Extra Service Actions
CRITICAL	03-11-2014 09:04:42	 1d 11h 42m 1s	3/3	(No output on stdout) stderr: execvp(/usr/local/nagios/libexec/check_nrpe, ...) failed. errno is 2: No such file or directory 
DISK Load C
Active checks of the service have been disabled - only passive checks are being accepted	Perform Extra Service Actions
CRITICAL	03-11-2014 09:04:13	 3d 20h 40m 5s	3/3	(No output on stdout) stderr: execvp(/usr/local/nagios/libexec/check_nrpe, ...) failed. errno is 2: No such file or directory 
MEM Load
Active checks of the service have been disabled - only passive checks are being accepted	Perform Extra Service Actions
CRITICAL	03-11-2014 09:04:05	 1d 11h 42m 39s	3/3	(No output on stdout) stderr: execvp(/usr/local/nagios/libexec/check_nrpe, ...) failed. errno is 2: No such file or directory 
Maybe they'll show as OK today ;) I'll wait, and if nothing changes I'll send you the logs from them.

But I've got one host (I don't know if there are more) whose nsclient.log contains:

Code: Select all

2014-03-11 15:16:34: e:..\..\..\..\trunk\modules\CheckSystem\PDHCollector.cpp:139: Failed to query performance counters: PdhCollectQueryData failed: : -2147481643: Nie ma żadnych danych do zwrócenia.
2014-03-11 15:16:34: e:..\..\..\..\trunk\modules\CheckSystem\PDHCollector.cpp:139: Failed to query performance counters: PdhCollectQueryData failed: : -2147481643: Nie ma żadnych danych do zwrócenia.
2014-03-11 15:16:34: e:..\..\..\..\trunk\modules\CheckSystem\PDHCollector.cpp:139: Failed to query performance counters: PdhCollectQueryData failed: : -2147481643: Nie ma żadnych danych do zwrócenia.
2014-03-11 15:16:35: e:..\..\..\..\trunk\modules\CheckSystem\PDHCollector.cpp:139: Failed to query performance counters: PdhCollectQueryData failed: : -2147481643: Nie ma żadnych danych do zwrócenia.
2014-03-11 15:16:35: e:..\..\..\..\trunk\modules\CheckSystem\PDHCollector.cpp:139: Failed to query performance counters: PdhCollectQueryData failed: : -2147481643: Nie ma żadnych danych do zwrócenia.
It doesn't show the actual CPU/MEM/HDD usage in the Nagios console (see screenshot). It shows a 'test' status that I added manually.

Logs from nagios:

Code: Select all

root@nagios:~# less /var/log/syslog |grep KR-MCICHON
Mar 12 06:53:52 nagios nagios: wproc:   host=KR-MCICHON; service=(null);
Mar 12 06:53:52 nagios nagios: Warning: Check of host 'KR-MCICHON' timed out after 30.00 seconds
Mar 12 06:59:22 nagios nagios: wproc:   host=KR-MCICHON; service=(null);
Mar 12 06:59:22 nagios nagios: Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
Mar 12 07:04:52 nagios nagios: wproc:   host=KR-MCICHON; service=(null);
Mar 12 07:04:52 nagios nagios: Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
Mar 12 07:10:22 nagios nagios: wproc:   host=KR-MCICHON; service=(null);
Mar 12 07:10:22 nagios nagios: Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
Mar 12 07:15:26 nagios nagios: HOST ALERT: KR-MCICHON;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 160.68 ms
root@nagios:~#

Code: Select all

root@nagios:~# less /usr/local/nagios/var/nagios.log |perl -pe 's/(\d+)/localtime($1)/e' > /tmp/log
root@nagios:~# less /tmp/log |grep KR-MCICHON
[Wed Mar 12 00:00:00 2014] CURRENT HOST STATE: KR-MCICHON;DOWN;HARD;1;PING CRITICAL - Packet loss = 100%
[Wed Mar 12 00:00:00 2014] CURRENT SERVICE STATE: KR-MCICHON;CPU Load;OK;HARD;1;test
[Wed Mar 12 00:00:00 2014] CURRENT SERVICE STATE: KR-MCICHON;DISK Load C;OK;HARD;1;test
[Wed Mar 12 00:00:00 2014] CURRENT SERVICE STATE: KR-MCICHON;MEM Load;OK;HARD;1;test
[Wed Mar 12 00:12:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 00:12:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.00 seconds
[Wed Mar 12 00:17:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 00:17:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 00:28:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 00:28:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 00:34:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 00:34:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 00:39:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 00:39:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 01:12:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 01:12:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.00 seconds
[Wed Mar 12 01:18:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 01:18:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 01:23:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 01:23:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.00 seconds
[Wed Mar 12 01:40:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 01:40:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.00 seconds
[Wed Mar 12 01:45:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 01:45:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 02:02:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 02:02:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 02:07:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 02:07:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.00 seconds
[Wed Mar 12 02:13:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 02:13:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 02:18:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 02:18:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 02:24:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 02:24:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 02:57:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 02:57:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.00 seconds
[Wed Mar 12 03:08:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 03:08:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 03:13:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 03:13:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 03:30:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 03:30:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 03:35:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 03:35:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.00 seconds
[Wed Mar 12 03:41:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 03:41:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.00 seconds
[Wed Mar 12 03:46:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 03:46:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 03:52:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 03:52:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 03:57:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 03:57:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 04:03:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 04:03:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.00 seconds
[Wed Mar 12 04:14:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 04:14:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 04:19:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 04:19:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 04:25:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 04:25:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.00 seconds
[Wed Mar 12 04:30:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 04:30:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.00 seconds
[Wed Mar 12 04:41:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 04:41:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.00 seconds
[Wed Mar 12 04:47:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 04:47:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 04:52:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 04:52:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 04:58:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 04:58:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 05:03:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 05:03:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.00 seconds
[Wed Mar 12 05:09:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 05:09:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.00 seconds
[Wed Mar 12 05:14:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 05:14:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.00 seconds
[Wed Mar 12 05:25:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 05:25:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 05:36:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 05:36:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 05:42:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 05:42:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.00 seconds
[Wed Mar 12 05:47:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 05:47:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 05:58:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 05:58:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.00 seconds
[Wed Mar 12 06:04:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 06:04:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.00 seconds
[Wed Mar 12 06:15:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 06:15:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 06:20:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 06:20:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 06:26:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 06:26:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 06:31:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 06:31:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 06:37:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 06:37:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 06:48:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 06:48:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 06:53:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 06:53:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.00 seconds
[Wed Mar 12 06:59:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 06:59:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 07:04:52 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 07:04:52 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 07:10:22 2014] wproc:   host=KR-MCICHON; service=(null);
[Wed Mar 12 07:10:22 2014] Warning: Check of host 'KR-MCICHON' timed out after 30.01 seconds
[Wed Mar 12 07:15:26 2014] HOST ALERT: KR-MCICHON;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 160.68 ms
root@nagios:~#


Re: Issues with passive checking in Nagios

Posted: Wed Mar 12, 2014 4:16 pm
by slansing
The counter-based error you are seeing in the nsclient log is fixable by either rebuilding your performance counters or upgrading to version 4.0.2+ of NSClient++:

http://nsclient.org/nscp/ticket/663

Of course, you could also use NCPA and NRDP-based checks, which use plugins and do not rely so much on static, hard-to-change modules.

Re: Issues with passive checking in Nagios

Posted: Thu Mar 13, 2014 2:37 am
by mkot
How can I rebuild my performance counters? By version 4.0.2+, do you mean the Nagios version or the version of the Nagios agent (NSClient++)?

I'm using:
Server: Nagios Core 4.0.3
Hosts/Agents:

Code: Select all

Nsclient++ 0.4.90 x32 at Vista/7/8 hosts

Code: Select all

Nsclient++ 0.3.9  x32 at 2k/2k3/XP hosts
I tried using the newest version (it was 0.4.90 when I installed Nagios Core) on all hosts, but it crashed on XP, so I tried an older version of the agent, and 0.3.9 works fine.

EDIT:
Some of these errors were caused by several hosts sharing the same IP (my mistake ;) ), and some went away when I put new INI files on the hosts. Tomorrow I'll know whether the problems are gone.
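
Duplicate addresses like these can be spotted before they cause trouble. The following is only a sketch: the function name find_duplicate_addresses is made up for illustration, and it assumes host definitions live in *.cfg files with one "address" directive per line, as in the object tree shown earlier in this thread.

```shell
#!/bin/sh
# Print every host "address" value that occurs more than once in the
# *.cfg files below the given directory (hypothetical helper, not a Nagios tool).
find_duplicate_addresses() {
    dir="$1"
    # Collect lines such as "address 10.0.0.5"; commented-out lines do not match.
    find "$dir" -name '*.cfg' -exec grep -hE '^[[:space:]]*address[[:space:]]+' {} + \
        | awk '{ sub(/;.*/, "", $2); print $2 }' \
        | sort \
        | uniq -d    # keep only values seen at least twice
}
```

Running it as find_duplicate_addresses /usr/local/nagios/etc/objects would list each address shared by two or more host definitions, one per line; no output means no duplicates were found.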

EDIT2:
Where can I disable or change the default value of max_packet_age? Some hosts have the wrong time and I get:

Code: Select all

Mar 13 12:58:09 nagios nsca[16588]: Dropping packet with future timestamp.
I want to disable it. I found that it is possible to do this: http://sourceforge.net/p/nagios/nsca/ci ... /Changelog

Re: Issues with passive checking in Nagios

Posted: Thu Mar 13, 2014 11:36 am
by sreinhardt
The timestamp issue is most likely due to your Nagios box being ahead of the Windows system. There is not much that can or should be done, short of setting up NTP and making sure your clocks are properly synced.
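
For reference, the max_packet_age setting asked about above is read by the NSCA daemon from its own configuration file on the Nagios server, not from nagios.cfg. A minimal sketch, assuming the file is at /usr/local/nagios/etc/nsca.cfg (the path is an assumption based on the source-install layout seen earlier in this thread) and using the value from the sample config:

```
# /usr/local/nagios/etc/nsca.cfg  (path assumed; adjust to your install)
# Maximum age, in seconds, of a passive result the daemon will accept.
# The sample config ships with 30; the value may not exceed 900 (15 minutes).
max_packet_age=30
```

This setting limits how old a packet may be, so it may not help with packets stamped in the future; keeping the clocks in sync remains the actual fix. The nsca daemon needs a restart after the file is changed.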

Re: Issues with passive checking in Nagios

Posted: Fri Mar 14, 2014 9:27 am
by mkot
Oh, I see. So the only solution for these timestamp issues is to force the hosts to sync their time with the Nagios server. Thanks. I think you can close this thread. If I get new errors I'll search the forum for a fix, and if I don't find anything helpful, I'll open a new topic.