Service Monitoring Setup

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
nagwindmon
Posts: 92
Joined: Mon Dec 01, 2014 3:39 pm

Service Monitoring Setup

Post by nagwindmon »

Hello team,
I just wanted to clarify a few things:
When I'm adding a new service, which initial state do I need to select, since it looks optional? For example, the service will check whether a particular port is open on the host, but the port could be closed the first time the check runs. I set it up to check every 60 minutes and, if the check fails, to check 5 more times at 15-minute intervals; if it is still down, a trap will be sent out. So what effect will the initial state have? In my test, the check seems to be sitting on "Service check is pending..."
Also, I use a time period from 10:00 to 17:00, so if a service check fails at 17:00, would it continue checking for the next hour?

Thanks!
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Service Monitoring Setup

Post by scottwilkerson »

nagwindmon wrote:When I'm adding a new service, which initial state do I need to select, since it looks optional?
Initial state is optional, and it only affects the state the service displays before it has been checked for the first time.

By default it will just say "Service check is pending..." and be in an unknown state until the first check.

This feature is useful when you set up a service that is not going to be actively checked for a while and you want it marked as OK until it is actually checked.
nagwindmon wrote:Also, I use a time period from 10:00 to 17:00, so if a service check fails at 17:00, would it continue checking for the next hour?
If 10:00-17:00 is your timeperiod, no active checks are performed outside those hours, and the service remains in the state it was in when it was last checked until 10:00, when checks resume.
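
As a rough illustration of the rule above, here is a minimal Python sketch (not Nagios code) of a daily 10:00-17:00 window. The function names `in_check_period` and `next_check_time` are made up for this example, and treating the end of the window as exclusive is an assumption. The point is only that a retry falling outside the window is deferred until the window reopens.

```python
from datetime import datetime, time, timedelta

# Assumed window from the question: 10:00-17:00 every day.
PERIOD_START = time(10, 0)
PERIOD_END = time(17, 0)


def in_check_period(moment: datetime) -> bool:
    """Return True if `moment` falls inside the daily check window.

    Assumption: the end of the window is treated as exclusive.
    """
    return PERIOD_START <= moment.time() < PERIOD_END


def next_check_time(wanted: datetime) -> datetime:
    """Return when a check wanted at `wanted` could actually run."""
    if in_check_period(wanted):
        return wanted
    if wanted.time() < PERIOD_START:
        # Before the window opens: wait until 10:00 the same day.
        return datetime.combine(wanted.date(), PERIOD_START)
    # After the window closes: wait until 10:00 the next day.
    return datetime.combine(wanted.date() + timedelta(days=1), PERIOD_START)


# A check fails at 16:50 and a retry is wanted 15 minutes later (17:05).
retry = datetime(2015, 1, 5, 16, 50) + timedelta(minutes=15)
print(next_check_time(retry))  # 2015-01-06 10:00:00
```

In this model the 17:05 retry does not run that evening; it is pushed to 10:00 the next morning, and the service keeps its last known state in the meantime.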
Former Nagios employee

Re: Service Monitoring Setup

Post by nagwindmon »

OK, one more condition to verify: I have this set to 1 so it will not alert when the host is down:
host_down_disable_service_checks=1
But here is what I noticed: the host was up, the service check failed, it went through all the retries, and the alert went out. Then the host went down and I brought it back up, but the port remained closed. So how do I restart the checks if the service state has not changed?

Re: Service Monitoring Setup

Post by scottwilkerson »

You need the timeperiod to allow checks around the clock if you want the check to re-run.

Re: Service Monitoring Setup

Post by nagwindmon »

Got it. One more related question: what are these options, and how do they correlate to service checks?
[Attached screenshots not available.]

Re: Service Monitoring Setup

Post by scottwilkerson »

The top pic looks like something from our Trap Sender component. The options correlate to the state (for services: OK, WARNING, CRITICAL) and the state type (HARD or SOFT).

Where you see "Changing" in the lower pic, that means SOFT.
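
For anyone reading later, SOFT versus HARD can be sketched roughly like this in Python. This is a simplified model, not Nagios source: a non-OK result stays SOFT while retries remain and becomes HARD once the attempt count reaches max_check_attempts. The function name `state_type` is hypothetical, the value 6 assumes the "first check plus 5 retries" setup from the original question, and special cases such as the host being down are ignored.

```python
def state_type(state: str, attempt: int, max_check_attempts: int) -> str:
    """Classify a check result as SOFT or HARD (simplified model).

    Assumptions: `attempt` is 1 for the first failing check, and
    special cases (for example, the host being down) are ignored.
    """
    if state == "OK":
        # Simplification: a recovery during retries is not modelled here.
        return "HARD"
    return "HARD" if attempt >= max_check_attempts else "SOFT"


# First failure plus 5 retries, assuming max_check_attempts = 6.
for attempt in range(1, 7):
    print(attempt, state_type("CRITICAL", attempt, 6))
```

With these assumed values, attempts 1 through 5 come out SOFT and attempt 6 comes out HARD.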

Re: Service Monitoring Setup

Post by nagwindmon »

Thanks, Scott!
What is the default timeout for check_tcp? I may have to increase it with [-t <timeout seconds>], or I can also modify the config file and change service_check_timeout, right?

Re: Service Monitoring Setup

Post by scottwilkerson »

Looks like 10 seconds; it can be changed with the -t flag.

Code:

# /usr/local/nagios/libexec/check_tcp -h
check_tcp v2.2.1 (nagios-plugins 2.2.1)
Copyright (c) 1999 Ethan Galstad <[email protected]>
Copyright (c) 1999-2014 Nagios Plugin Development Team
        <[email protected]>

This plugin tests TCP connections with the specified host (or unix socket).

Usage:
check_tcp -H host -p port [-w <warning time>] [-c <critical time>] [-s <send string>]
[-e <expect string>] [-q <quit string>][-m <maximum bytes>] [-d <delay>]
[-t <timeout seconds>] [-r <refuse state>] [-M <mismatch state>] [-v] [-4|-6] [-j]
[-D <warn days cert expire>[,<crit days cert expire>]] [-S <use SSL>] [-E]
[-N <server name indication>]

Options:
 -h, --help
    Print detailed help screen
 -V, --version
    Print version information
 --extra-opts=[section][@file]
    Read options from an ini file. See
    https://www.nagios-plugins.org/doc/extra-opts.html
    for usage and examples.
 -H, --hostname=ADDRESS
    Host name, IP Address, or unix socket (must be an absolute path)
 -p, --port=INTEGER
    Port number (default: none)
 -4, --use-ipv4
    Use IPv4 connection
 -6, --use-ipv6
    Use IPv6 connection
 -E, --escape
    Can use \n, \r, \t or \\ in send or quit string. Must come before send or quit option
    Default: nothing added to send, \r\n added to end of quit
 -s, --send=STRING
    String to send to the server
 -e, --expect=STRING
    String to expect in server response (may be repeated)
 -A, --all
    All expect strings need to occur in server response. Default is any
 -q, --quit=STRING
    String to send server to initiate a clean close of the connection
 -r, --refuse=ok|warn|crit
    Accept TCP refusals with states ok, warn, crit (default: crit)
 -M, --mismatch=ok|warn|crit
    Accept expected string mismatches with states ok, warn, crit (default: warn)
 -j, --jail
    Hide output from TCP socket
 -m, --maxbytes=INTEGER
    Close connection once more than this number of bytes are received
 -d, --delay=INTEGER
    Seconds to wait between sending string and polling for response
 -D, --certificate=INTEGER[,INTEGER]
    Minimum number of days a certificate has to be valid.
    1st is #days for warning, 2nd is critical (if not specified - 0).
 -S, --ssl
    Use SSL for the connection.
 -w, --warning=DOUBLE
    Response time to result in warning status (seconds)
 -c, --critical=DOUBLE
    Response time to result in critical status (seconds)
 -t, --timeout=INTEGER:<timeout state>
    Seconds before connection times out (default: 10)
    Optional ":<timeout state>" can be a state integer (0,1,2,3) or a state STRING
 -v, --verbose
    Show details for command-line debugging (Nagios may truncate output)

Send email to [email protected] if you have questions regarding use
of this software. To submit patches or suggest improvements, send email to
[email protected]
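
A small Python sketch of building a check_tcp command line with a longer timeout, using only the -H, -p and -t options listed in the help output above. The plugin path is the one from the help command. The helper name `build_check_tcp_command`, the host 192.0.2.10 and the port 8080 are placeholders for illustration only.

```python
import shlex

# Path taken from the help output above; adjust for your install.
CHECK_TCP = "/usr/local/nagios/libexec/check_tcp"


def build_check_tcp_command(host: str, port: int, timeout: int = 10) -> list:
    """Build an argument list for check_tcp with an explicit timeout.

    The default of 10 mirrors the plugin's documented default.
    """
    if timeout <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    return [CHECK_TCP, "-H", host, "-p", str(port), "-t", str(timeout)]


# Placeholder host and port, with the timeout raised to 30 seconds.
cmd = build_check_tcp_command("192.0.2.10", 8080, timeout=30)
print(shlex.join(cmd))
```

On a machine where the plugin is installed, the resulting list could be passed to `subprocess.run(cmd)` to try the longer timeout by hand before changing the service definition.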


Re: Service Monitoring Setup

Post by nagwindmon »

OK, what about service_check_timeout? What does it apply to?

Re: Service Monitoring Setup

Post by scottwilkerson »

nagwindmon wrote:OK, what about service_check_timeout? What does it apply to?
That option in nagios.cfg is the maximum time Nagios will allow a plugin to run, so if you have

Code:

service_check_timeout=60
and try to set the -t flag on this plugin to 120, it will still time out at 60.01 seconds.
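
Put another way, the timeout that actually applies is whichever limit is hit first. A minimal Python sketch of that rule, with a hypothetical helper name `effective_timeout` (this only illustrates the behaviour described above and is not Nagios code):

```python
def effective_timeout(plugin_timeout: float, service_check_timeout: float) -> float:
    """Return the limit (in seconds) that cuts the check off first.

    `plugin_timeout` is the value passed with -t; `service_check_timeout`
    is the nagios.cfg setting that caps how long any plugin may run.
    """
    return min(plugin_timeout, service_check_timeout)


print(effective_timeout(120, 60))  # 60
print(effective_timeout(10, 60))   # 10
```

So raising -t above service_check_timeout has no effect unless service_check_timeout is raised as well.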