Support forum for Nagios Core, Nagios Plugins, NCPA, NRPE, NSCA, NDOUtils and more. Engage with the community of users including those using the open source solutions.
peterg
Posts: 12 Joined: Wed Oct 07, 2015 11:13 am
by peterg » Thu Oct 08, 2015 5:03 am
I've got a strange problem with a stock installation of Nagios with NRPE for remote monitoring. All the packages are from the Ubuntu repositories: the server runs Ubuntu 14.04.3 LTS and the clients run 12.04.5 LTS.
If I run the check on the server as my user id, nagios or root it works perfectly:
Code:
ubuntu@nagios01:~$ /usr/lib/nagios/plugins/check_nrpe -H 10.99.1.6 -c check_users
USERS OK - 1 users currently logged in |users=1;5;10;0
However, in Nagios the service status is 'WARNING' and the Status Information field is '(null)' for all my nrpe checks. Localhost checks, ping and SSH remote checks are fine but they don't use nrpe. The log file doesn't record anything for the nrpe checks.
Any clues how I can debug this further?
Pete
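One way to narrow this down is to look at the exit code and the raw stdout/stderr of the command, since Nagios maps plugin exit codes 0/1/2/3 to OK/WARNING/CRITICAL/UNKNOWN and, as far as I know, shows '(null)' when a plugin prints nothing on stdout. The following is only a minimal sketch, assuming Python 3.7+ is available on the Nagios server; the helper name run_check is made up for illustration and is not part of Nagios or NRPE.

```python
import subprocess
import sys

# Standard Nagios plugin exit codes and the states they map to.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}

def run_check(argv):
    """Run a plugin command (hypothetical helper) and return (state, stdout, stderr)."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    state = STATES.get(proc.returncode, "UNKNOWN")
    return state, proc.stdout.strip(), proc.stderr.strip()

# Stand-in for a plugin that exits 1 and prints nothing, which would look
# like the symptom described above: WARNING with no status text.
fake_plugin = [sys.executable, "-c", "import sys; sys.exit(1)"]
print(run_check(fake_plugin))  # -> ('WARNING', '', '')
```

To use it against the real check, pass the same argument list that Nagios would run, for example ["/usr/lib/nagios/plugins/check_nrpe", "-H", "10.99.1.6", "-c", "check_users"], ideally while logged in as the nagios user. Anything printed on stderr rather than stdout would not appear in the Status Information field.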
hsmith
Agent Smith
Posts: 3539 Joined: Thu Jul 30, 2015 11:09 am
Location: 127.0.0.1
by hsmith » Thu Oct 08, 2015 12:45 pm
Can you post some of the service definitions that are having issues?
You can find these in /usr/local/nagios/etc/objects in most installs.
Former Nagios Employee.
me.
by peterg » Fri Oct 09, 2015 4:54 am
Sure, this is from /etc/nagios/nrpe_local.cfg
Code:
$ more /etc/nagios/nrpe_local.cfg
######################################
# Do any local nrpe configuration here
######################################
command[check_users]=/usr/lib/nagios/plugins/check_users -w 5 -c 10
command[check_load]=/usr/lib/nagios/plugins/check_load -w 15,10,5 -c 30,25,20
command[check_all_disks]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10%
command[check_zombie_procs]=/usr/lib/nagios/plugins/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/lib/nagios/plugins/check_procs -w 150 -c 200
command[check_swap]=/usr/lib/nagios/plugins/check_swap -w 20 -c 10
Is that what you meant?
Thanks for your help, Pete
by hsmith » Fri Oct 09, 2015 9:48 am
I'm looking for the service checks that you set up inside of Nagios.
/usr/local/nagios/etc/objects should have some files in there that you configured for service checks.
For instance, I have some services defined inside of /usr/local/nagios/etc/objects/services.cfg on my core machine.
by peterg » Mon Oct 12, 2015 6:36 am
I checked all the Nagios packages on the server; the only package that contains objects is nagios3-common:
Code:
$ dpkg -L nagios3-common | grep object
/usr/share/doc/nagios3-common/examples/template-object
/usr/share/doc/nagios3-common/examples/template-object/README
/usr/share/doc/nagios3-common/examples/template-object/localhost.cfg.gz
/usr/share/doc/nagios3-common/examples/template-object/timeperiods.cfg
/usr/share/doc/nagios3-common/examples/template-object/commands.cfg
/usr/share/doc/nagios3-common/examples/template-object/printer.cfg
/usr/share/doc/nagios3-common/examples/template-object/templates.cfg.gz
/usr/share/doc/nagios3-common/examples/template-object/windows.cfg
/usr/share/doc/nagios3-common/examples/template-object/switch.cfg
/usr/share/doc/nagios3-common/examples/template-object/contacts.cfg
... where they are just templates.
On the client, there are no objects.
What are they and could the fact they're missing be the root of the problem?
FYI, just had a thought that apparmor might be a cause, disabled it, no difference.
Pete
by hsmith » Mon Oct 12, 2015 12:31 pm
Let's not worry about objects for now. I just need the command/service that you set up for one of your check_nrpe checks that is failing. It will look something like this:
Code:
define service{
    host_name               linux-server
    service_description     check-disk-sda1
    check_command           check-disk!/dev/sda1
    max_check_attempts      5
    check_interval          5
    retry_interval          3
    check_period            24x7
    notification_interval   30
    notification_period     24x7
    notification_options    w,c,r
    contact_groups          linux-admins
}
by peterg » Tue Oct 13, 2015 9:01 am
Ok, here's an example check on the nagios server (from /etc/nagios3/conf.d/clients.cfg):
Code:
define command {
    command_name    check_load_nrpe
    command_line    $USER1$/check_nrpe -H "$HOSTADDRESS" -c "check_load"
}
...
define service {
    use                     gs-generic-service
    host_name               cogs.example.com
    service_description     Load check NRPE
    check_command           check_load_nrpe
}
$USER1$ is defined in /etc/nagios3/resource.cfg as $USER1$=/usr/lib/nagios/plugins
and on the monitored server (from /etc/nagios/nrpe_local.cfg):
Code:
######################################
# Do any local nrpe configuration here
######################################
command[check_users]=/usr/lib/nagios/plugins/check_users -w 5 -c 10
command[check_load]=/usr/lib/nagios/plugins/check_load -w 15,10,5 -c 30,25,20
command[check_all_disks]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10%
command[check_zombie_procs]=/usr/lib/nagios/plugins/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/lib/nagios/plugins/check_procs -w 150 -c 200
command[check_swap]=/usr/lib/nagios/plugins/check_swap -w 20 -c 10
Hopefully I've not missed anything.
Pete
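Since the command definition above relies on Nagios macros, it may be worth checking that each one is written with a `$` on both sides, the way `$USER1$` is. Below is a rough sketch, assuming Python 3, that flags any `$` in a command_line that is not part of a complete `$NAME$` macro or a `$$` escape. The function name unterminated_macros is hypothetical, the regular expressions are a simplification of how Nagios actually parses macros, and whether this has anything to do with the '(null)' output is not confirmed.

```python
import re

# A well-formed Nagios macro looks like $NAME$ (on-demand macros may contain ':').
MACRO = re.compile(r"\$[A-Za-z0-9_:]+\$")
# Any '$' left after removing well-formed macros and '$$' escapes is suspicious.
STRAY = re.compile(r"\$([A-Za-z0-9_]*)")

def unterminated_macros(command_line: str) -> list:
    """Return the names that follow a '$' which has no closing '$' (hypothetical helper)."""
    remainder = MACRO.sub("", command_line)   # drop complete $NAME$ macros
    remainder = remainder.replace("$$", "")   # '$$' is a literal dollar sign
    return STRAY.findall(remainder)

# The command_line exactly as posted above.
posted = '$USER1$/check_nrpe -H "$HOSTADDRESS" -c "check_load"'
print(unterminated_macros(posted))  # -> ['HOSTADDRESS']

# The same line with a closing '$' after HOSTADDRESS, for comparison.
closed = '$USER1$/check_nrpe -H "$HOSTADDRESS$" -c "check_load"'
print(unterminated_macros(closed))  # -> []
```

Run against the line as posted, the sketch reports HOSTADDRESS as having no closing `$`, so that is one thing to compare with the actual file on disk in case the post was mistyped.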
by hsmith » Tue Oct 13, 2015 1:11 pm
Can I see what is in gs-generic-service ?
Box293
Too Basu
Posts: 5126 Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia
by Box293 » Tue Oct 13, 2015 8:13 pm
peterg wrote: If I run the check on the server as my user id, nagios or root it works perfectly:
Code:
ubuntu@nagios01:~$ /usr/lib/nagios/plugins/check_nrpe -H 10.99.1.6 -c check_users
USERS OK - 1 users currently logged in |users=1;5;10;0
However, in Nagios the service status is 'WARNING' and the Status Information field is '(null)' for all my nrpe checks. Localhost checks, ping and SSH remote checks are fine but they don't use nrpe. The log file doesn't record anything for the nrpe checks.
peterg wrote:
Code:
define service {
    use                     gs-generic-service
    host_name               cogs.example.com
    service_description     Load check NRPE
    check_command           check_load_nrpe
}
Can you please post the host object definition for cogs.example.com?
Is the address for cogs.example.com 10.99.1.6?
by peterg » Wed Oct 14, 2015 7:00 am
hsmith wrote: Can I see what is in gs-generic-service ?
Sure:
Code:
define service {
    name                    gs-generic-service
    check_interval          10
    check_period            24x7
    retry_interval          2
    max_check_attempts      3
    notification_interval   30
    notification_period     24x7
    notification_options    w,c,r
    contact_groups          admins
}