ping threshold settings?
Hello, I'm new to Nagios. We recently started getting notifications for one of our ESXi hosts (and a few guest VMs on it) going down and up, but they aren't actually experiencing any issues. Is there a way to increase this threshold? I looked on the hosts themselves and didn't see any config files; is there another location I should be looking at? Also, what's a good recommended threshold for this? I'm assuming it's still set to the defaults.
Thanks!
***** Nagios *****
Notification Type: RECOVERY
Host: badserver
State: UP
Address: xxx.xxx.xxx.xx
Info: PING OK - Packet loss = 0%, RTA = 0.50 ms
Date/Time: Wed Jul 22 07:08:27 MDT 2015
-
- Skynet Drone
- Posts: 2620
- Joined: Wed Feb 11, 2015 1:56 pm
Re: ping threshold settings?
There are a couple of things at play here: your command definition and your host definition. Since we can't see either, I'll just show you a couple of examples:
Host:
Code:
define host {
host_name localhost
check_command check-host-alive
alias localhost
address 127.0.0.1
register 1
...
}
The important part here is "check_command". Now we'll review the associated command:
Code:
define command {
command_name check-host-alive
command_line $USER1$/check_icmp -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
}
If I wanted to change the thresholds for that host, I'd adjust the "3000.0,80%" for the warning threshold and the "5000.0,100%" for the critical threshold. You might have arguments there, in which case the adjustment would be made after an "!" in the "check_command" line of the host config mentioned earlier.
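To make the "3000.0,80%" style thresholds concrete, here is a rough Python sketch of how a pair like that could be read: the first number is the round trip average in milliseconds and the second is the packet loss percentage. The function names are made up for illustration, and the "reaches or exceeds" comparison is an assumption; the plugin's exact boundary handling may differ.

```python
from typing import Tuple


def parse_threshold(spec: str) -> Tuple[float, float]:
    """Split a threshold such as '3000.0,80%' into (rta_ms, loss_pct)."""
    rta_text, loss_text = spec.split(",")
    return float(rta_text), float(loss_text.rstrip("%"))


def classify(rta_ms: float, loss_pct: float, warn: str, crit: str) -> str:
    """Return OK, WARNING or CRITICAL for a single ping result.

    Assumption: a reading that reaches or exceeds either half of a
    threshold trips that threshold.
    """
    warn_rta, warn_loss = parse_threshold(warn)
    crit_rta, crit_loss = parse_threshold(crit)
    if rta_ms >= crit_rta or loss_pct >= crit_loss:
        return "CRITICAL"
    if rta_ms >= warn_rta or loss_pct >= warn_loss:
        return "WARNING"
    return "OK"


# The recovery notification in the first post reported RTA = 0.50 ms, 0% loss.
print(classify(0.50, 0, "3000.0,80%", "5000.0,100%"))  # → OK
# A slower, lossier reading against much tighter thresholds.
print(classify(120.0, 40, "100.0,20%", "500.0,60%"))   # → WARNING
```

Loosening either number in the pair makes the check more tolerant; tightening either one makes it trip sooner.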
Re: ping threshold settings?
Hey, thanks for the fast response. So I was looking at the cfg file on the nagios server and it only has
define host {
use linux-server
host_name noisyhost
address xxx.xxx.xxx.xx
parents parent_server
}
There's also a linux-server template that it is apparently using (see below), so is this where I would make the change? That would affect all servers using this template and not just the noisy server, so is there another cfg file that I need to edit?
###############################################################################
###############################################################################
#
# HOST DEFINITION
#
###############################################################################
###############################################################################
# Define a host for the remote machine
define host{
use linux-server
host_name servername
alias servername
address xxx.xxx.xxx.xx
}
###############################################################################
###############################################################################
#
# SERVICE DEFINITIONS
#
###############################################################################
###############################################################################
# Define a service to "ping" the remote machine
define service{
use generic-service ; Name of service template to use
host_name servername
service_description PING
check_command check_ping!100.0,20%!500.0,60%
}
# Define a service to check SSH on the remote machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
use generic-service ; Name of service template to use
host_name servername
service_description SSH
check_command check_ssh
}
#define service{
# use local-service,srv-pnp
# host_name servername
# service_description Net
# check_command check_snmp_int!public!eth0!4000,100!7000,200
# notifications_enabled 0
# }
# Define a service to check the disk space of the root partition
# on the remote machine. Warning if < 20% free, critical if
# < 10% free space on partition.
#define service{
# use generic-service,srv-pnp ; Name of service template to use
# host_name servername
# service_description Disk Partitions
# check_command check_nrpe!check_disk
# }
# Define a service to check the number of currently logged in
# users on the remote machine. Warning if > 20 users, critical
# if > 50 users.
#define service{
# use generic-service ; Name of service template to use
# host_name servername
# service_description Current Users
# check_command check_nrpe!check_users
# }
# Define a service to check the number of currently running procs
# on the remote machine. Warning if > 250 processes, critical if
# > 400 users.
#define service{
# use generic-service ; Name of service template to use
# host_name servername
# service_description Total Processes
# check_command check_nrpe!check_total_procs
# }
# Define a service to check the load on the remote machine.
#define service{
# use generic-service ; Name of service template to use
# host_name servername
# service_description Current Load
# check_command check_nrpe!check_load
# }
-
- Skynet Drone
- Posts: 2620
- Joined: Wed Feb 11, 2015 1:56 pm
Re: ping threshold settings?
You can override anything in a template by simply adding the same directive to the host definition itself. That said, that's probably not what you need. Honestly, yours is probably configured similarly to mine. To specify different thresholds for that host, you'd need to create a new check command with the thresholds defined differently, or maybe a better solution would be as follows:
Change your check-host-alive command to something like this (assuming that is what your linux-server template is using):
Code:
define command {
command_name check-host-alive
command_line $USER1$/check_icmp -H $HOSTADDRESS$ $ARG1$
}
Then adjust the check_command in your linux-server template to read like this:
Code:
check_command check-host-alive!-w 100.0,20% -c 500.0,60% -p 5
That will update all existing hosts using the template to make proper use of the new check command.
And lastly, add the following line to your host definition for the loud hosts:
Code:
check_command check-host-alive!-w 1000.0,80% -c 5000.0,100% -p 5
This will override the check_command for the hosts with crappy connectivity.
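For anyone unsure how the "!" and $ARG1$ fit together, here is a small Python sketch of the substitution. Nagios splits the check_command value on "!": the first field is the command name and the following fields become $ARG1$, $ARG2$ and so on. The function name, the plugin directory and the host address below are illustrative assumptions, not values from this thread.

```python
def expand_check_command(check_command: str, command_line: str, macros: dict) -> str:
    """Expand a check_command value against a command_line template.

    Simplified sketch: only $ARGn$ and the macros passed in are replaced.
    """
    # First field is the command name; the rest are $ARG1$, $ARG2$, ...
    _name, *args = check_command.split("!")
    expanded = command_line
    for index, value in enumerate(args, start=1):
        expanded = expanded.replace(f"$ARG{index}$", value)
    for macro, value in macros.items():
        expanded = expanded.replace(f"${macro}$", value)
    return expanded


# Assumed values for illustration only.
macros = {"USER1": "/usr/local/nagios/libexec", "HOSTADDRESS": "192.0.2.10"}
template = "$USER1$/check_icmp -H $HOSTADDRESS$ $ARG1$"

print(expand_check_command(
    "check-host-alive!-w 1000.0,80% -c 5000.0,100% -p 5", template, macros))
# → /usr/local/nagios/libexec/check_icmp -H 192.0.2.10 -w 1000.0,80% -c 5000.0,100% -p 5
```

The same command definition therefore serves both the template and the noisy host; only the text after the "!" changes.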
Re: ping threshold settings?
Hello, and thanks again for your help and fast responses. Just so I don't break anything: the first two snippets go into the linux-server template, but does it matter what section I put them in? Then I just append the third snippet to the host.cfg file, correct? Thanks again!
Re: ping threshold settings?
The first snippet:
Code:
define command {
command_name check-host-alive
command_line $USER1$/check_icmp -H $HOSTADDRESS$ $ARG1$
}
should go in commands.cfg.
The second:
Code:
check_command check-host-alive!-w 100.0,20% -c 500.0,60% -p 5
should go in templates.cfg (or wherever you defined the linux-server template).
The third one:
Code:
check_command check-host-alive!-w 1000.0,80% -c 5000.0,100% -p 5
should go in host.cfg.
Hope this helps.
Re: ping threshold settings?
Hello again,
In looking around in the Nagios directory, I don't see a commands.cfg file. Is this a standard file for all Nagios installs? (We're running Nagios Core v4.0.8.)
Thanks again for the clarification!
-
- Skynet Drone
- Posts: 2620
- Joined: Wed Feb 11, 2015 1:56 pm
Re: ping threshold settings?
The easiest way to find where your commands are defined would probably be to run the following command from the directory where your nagios.cfg is located:
Code:
grep -Ri "define command" ./*
If that's not helpful, look in your nagios.cfg for lines that start with cfg_:
Code:
grep ^cfg_ /usr/local/nagios/etc/nagios.cfg
That will list where ALL of your config files are.
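If grep isn't handy, the same lookup can be sketched in a few lines of Python. This is only an illustration of what the cfg_ lines contain: the function name is made up, and the sample paths are assumptions based on a default source install, not taken from anyone's actual nagios.cfg.

```python
from typing import List, Tuple


def config_sources(nagios_cfg_text: str) -> List[Tuple[str, str]]:
    """Return (directive, path) pairs for every cfg_file= or cfg_dir= line."""
    sources = []
    for raw_line in nagios_cfg_text.splitlines():
        line = raw_line.strip()
        # Commented-out entries start with '#', so they are skipped here.
        if line.startswith(("cfg_file=", "cfg_dir=")):
            directive, path = line.split("=", 1)
            sources.append((directive, path.strip()))
    return sources


# Illustrative excerpt; real paths depend on the install.
sample = """\
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_dir=/usr/local/nagios/etc/servers
#cfg_dir=/usr/local/nagios/etc/printers
"""

for directive, path in config_sources(sample):
    print(directive, path)
```

A cfg_file entry names a single object file, while a cfg_dir entry pulls in every .cfg file under that directory, so a command definition may live in either kind of location.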
Re: ping threshold settings?
Hello again and thanks for your help. I did find the commands.cfg file and it does already have the check-host-alive setting. I also added it to the appropriate template file with the settings you listed. On the host.cfg files, I added it to the end:
define host {
use linux-server
host_name noisyhost
address xxx.xxx.xxx.xx
parents parentserver
}
check_command check-host-alive!-w 1000.0,80% -c 5000.0,100% -p 5
However, even after these changes I am still getting notifications. Should I bump it up even more? I was also looking at the nagios.cfg file and noticed that flap detection is turned off. Would turning it on help here, or am I misunderstanding that feature?
Thanks again for your help!
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
- Contact:
Re: ping threshold settings?
When there is a problem, can you go to the host in Core that is reporting the problem and paste the "Performance Data" string here? It will look something like:
Code:
rta=0.638000ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0
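The reason that string is useful: each metric is followed by the warning and critical thresholds the plugin actually ran with, so it shows whether a new check_command took effect. Below is a minimal Python sketch that pulls those fields apart, assuming the usual label=value[unit];warn;crit;min layout. The function names are made up, and quoted labels containing spaces are not handled.

```python
import re
from typing import Dict, List, Optional


def _number_at(parts: List[str], index: int) -> Optional[float]:
    """Return parts[index] as a float, or None if the field is absent or empty."""
    if len(parts) > index and parts[index]:
        return float(parts[index])
    return None


def parse_perfdata(perfdata: str) -> Dict[str, dict]:
    """Parse a simple performance data string into a dict keyed by label."""
    metrics = {}
    for item in perfdata.split():
        label, _, fields = item.partition("=")
        parts = fields.split(";")
        match = re.match(r"(-?[0-9.]+)(.*)", parts[0])
        if match is None:
            continue  # not a number followed by an optional unit
        metrics[label] = {
            "value": float(match.group(1)),
            "unit": match.group(2),
            "warn": _number_at(parts, 1),
            "crit": _number_at(parts, 2),
        }
    return metrics


sample = "rta=0.638000ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0"
for label, metric in parse_perfdata(sample).items():
    print(label, metric["value"], metric["unit"], metric["warn"], metric["crit"])
```

In the sample string the rta thresholds read 3000 and 5000 ms with 80% and 100% packet loss. If a host was meant to use "-w 1000.0,80%" and its performance data still shows a 3000 ms warning value, the override is probably not being picked up.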