Linux Ubuntu install - troubleshoot services

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
gbnag
Posts: 26
Joined: Thu Jan 21, 2016 7:58 pm

Linux Ubuntu install - troubleshoot services

Post by gbnag »

hello,

We installed the Linux agent for Ubuntu and found that a number of services are in an unknown, warning, or critical state (see attached).
How do we get them to show as normal? The target system does run the cron and ssh daemons.
We could not locate the command 'check_total_procs' on the Nagios XI server under /usr/local/nagios/libexec/, yet the service shows as critical.

ty
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Linux Ubuntu install - troubleshoot services

Post by Box293 »

How did you install the Linux agent for Ubuntu? Did you follow a specific guide?
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Re: Linux Ubuntu install - troubleshoot services

Post by gbnag »

We did get an installation error:

----------------------------------------------
Processing triggers for libc-bin (2.19-0ubuntu6.9) ...
Prerequisites installed OK
RESULT=0
Running './2-usersgroups'...
Adding users and groups...
useradd: user 'nagios' already exists
groupadd: group 'nagios' already exists
useradd: user 'nagios' already exists
ERROR: User 'nagios' was not created - exiting.
RESULT=1

===================
INSTALLATION ERROR!
===================
Installation step failed - exiting.
-----------------------------------------------------------


When we raised the installation error during a quick start session, it was suggested that we bypass it by running 'touch installed.usersgroups' in the linux-nrpe-agent directory and then rerunning the full install script, which we did.
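The workaround described above can be sketched as a small shell helper. This is only an illustration: the function name is made up, and it assumes the installer skips a step when the matching installed.* marker file is present, which is what the suggestion implies. The ./fullinstall script name in the usage comment is likewise an assumption about the agent package, not something confirmed in this thread.

```shell
# Hypothetical helper: create the marker file that tells the installer
# the users/groups step has already been completed.
mark_usersgroups_done() {
    agent_dir=$1
    # Refuse to touch anything if the agent directory is not there.
    [ -d "$agent_dir" ] || { echo "no such directory: $agent_dir" >&2; return 1; }
    touch "$agent_dir/installed.usersgroups"
}

# Usage (as root on the Ubuntu host), then rerun the installer:
#   mark_usersgroups_done /tmp/linux-nrpe-agent
#   cd /tmp/linux-nrpe-agent && ./fullinstall   # script name assumed
```

Skipping the step only makes sense when the nagios user and group really do exist already, as the useradd/groupadd messages in the log indicate here.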



------------------------------------------------------------------

Installation finished without errors:

.........
......................
xinetd stop/waiting
xinetd start/running, process 26802
Subcomponents installed OK
RESULT=0

##########################################################
### ###
### Nagios XI Linux Agent Installation Complete! ###
### ###
##########################################################

If you experience any problems, please attach the file install.log that was just created to any support requests.

root@dmg-dev:/tmp/linux-nrpe-agent#
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Linux Ubuntu install - troubleshoot services

Post by lmiltchev »

Not all of these services are added via the "Linux Server" wizard... Did you add "Current Load", and "Current Users" manually?

Show us the actual commands for the failing services run from the command line (on the Nagios XI server) along with the output.

Also, run the following command on the remote (Ubuntu) box:

cat /usr/local/nagios/etc/nrpe/common.cfg
Be sure to check out our Knowledgebase for helpful articles and solutions!

Re: Linux Ubuntu install - troubleshoot services

Post by gbnag »

We performed a default install.

From the XI server, we can run the check for load (shows green):
[root@localhost-010049098179 libexec]# /usr/local/nagios/libexec/check_nrpe -H dmg-dev -t 30 -c check_load -a '-w 5 -c 10'
OK - load average: 2.22, 2.22, 2.29|load1=2.220;5.000;10.000;0; load5=2.220;5.000;10.000;0; load15=2.290;5.000;10.000;0;
[root@localhost-010049098179 libexec]#

However, we are unable to find a 'Current Load' or 'Current Users' check (both services showing orange).

For Total Processes (service showing red), we are unable to find the check either (see attached).


For the SSH server (service showing yellow), we get:
[root@localhost-010049098179 libexec]# /usr/local/nagios/libexec/check_nrpe -H dmg-dev -t 30 -c check_ssh
NRPE: Command 'check_ssh' not defined

For the cron scheduling daemon (service showing yellow), we are unable to locate a check script.




Below is the common.cfg output from the target ubuntu host:
----------------------------------------------------------------------------

root@dmg-dev:~# cat /usr/local/nagios/etc/nrpe/common.cfg

### GENERIC SERVICES ###
command[check_init_service]=sudo /usr/local/nagios/libexec/check_init_service $ARG1$
command[check_services]=/usr/local/nagios/libexec/check_services -p $ARG1$

### MISC SYSTEM METRICS ###
#command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_users]=/usr/local/nagios/libexec/check_users $ARG1$
command[check_load]=/usr/local/nagios/libexec/check_load $ARG1$
command[check_swap]=/usr/local/nagios/libexec/check_swap $ARG1$
command[check_cpu_stats]=/usr/local/nagios/libexec/check_cpu_stats.sh $ARG1$
command[check_mem]=/usr/local/nagios/libexec/custom_check_mem -n $ARG1$

### SYSTEM UPDATES ###
command[check_yum]=/usr/local/nagios/libexec/check_yum
command[check_apt]=/usr/local/nagios/libexec/check_apt

### DISK ###
command[check_disk]=/usr/local/nagios/libexec/check_disk $ARG1$
command[check_ide_smart]=/usr/local/nagios/libexec/check_ide_smart $ARG1$

### PROCESSES ###
command[check_all_procs]=/usr/local/nagios/libexec/custom_check_procs
command[check_procs]=/usr/local/nagios/libexec/check_procs $ARG1$

### OPEN FILES ###
command[check_open_files]=/usr/local/nagios/libexec/check_open_files.pl $ARG1$

### NETWORK CONNECTIONS ###
command[check_netstat]=/usr/local/nagios/libexec/check_netstat.pl -p $ARG1$ $ARG2$
root@dmg-dev:~#

Re: Linux Ubuntu install - troubleshoot services

Post by Box293 »

gbnag wrote:
For the SSH server (service showing yellow), we get:
[root@localhost-010049098179 libexec]# /usr/local/nagios/libexec/check_nrpe -H dmg-dev -t 30 -c check_ssh
NRPE: Command 'check_ssh' not defined

There is no check_ssh command defined in the NRPE client as part of the install. You would need to define it in common.cfg on the Ubuntu server if you wanted to execute it via NRPE.

However, check_ssh is generally not a check you would execute via NRPE; you would execute it from the XI server. In this case, edit your service in CCM and select the command check_xi_service_ssh.
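
The "not defined" error means the NRPE daemon on the Ubuntu box has no command[check_ssh] entry in its configuration. A quick way to confirm that before editing anything is to look for the entry in common.cfg. The helper below is a minimal sketch (the function name is made up for illustration) that uses the config path shown earlier in this thread and ignores commented-out entries.

```shell
# Hypothetical helper: succeed if an NRPE command is defined (and not
# commented out) in the given config file.
nrpe_command_defined() {
    cfg=$1
    name=$2
    # Match lines that start with: command[<name>]=
    grep -q "^command\[${name}]=" "$cfg"
}

# Usage on the Ubuntu host:
#   nrpe_command_defined /usr/local/nagios/etc/nrpe/common.cfg check_ssh \
#       || echo "check_ssh is not defined for NRPE"
```

If you did want to run the check through NRPE, the new entry would presumably follow the same pattern as the others in that file (command[check_ssh]=/usr/local/nagios/libexec/check_ssh $ARG1$), assuming the check_ssh plugin is present under libexec on the Ubuntu host.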

gbnag wrote:
However, we are unable to find a 'Current Load' or 'Current Users' check (both services showing orange).

In CCM, click the floppy disk icon for each of these services.
This will bring up the text definition of the service.
Please paste the text definitions here in a code block.

Re: Linux Ubuntu install - troubleshoot services

Post by gbnag »

###############################################################################
#
# Service configuration file
#
# Created by: Nagios Core Config Manager 2.5.2
# Date:	      2016-09-29 16:08:45
# Version:    Nagios 3.x config file
#
# --- DO NOT EDIT THIS FILE BY HAND --- 
# Nagios CCM will overwrite all manual settings during the next update if you 
# would like to edit files manually, place them in the 'static' directory or 
# import your configs into the CCM by placing them in the 'import' directory.
#
###############################################################################

define service {
	host_name			dmg-dev.qualcomm.com
	service_description		/ Disk Usage
	use				xiwizard_nrpe_service
	check_command			check_nrpe!check_disk!-a '-w 20% -c 10% -p /'
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			linux-server
	register			1
	}	

define service {
	host_name			dmg-dev.qualcomm.com
	service_description		APT Updates
	use				xiwizard_nrpe_service
	check_command			check_nrpe!check_apt!-a '-U'
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			linux-server
	register			1
	}	

define service {
	host_name			dmg-dev.qualcomm.com
	service_description		CPU Stats
	use				xiwizard_nrpe_service
	check_command			check_nrpe!check_cpu_stats!-a '-w 85 -c 95'
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			linux-server
	register			1
	}	

define service {
	host_name			dmg-dev.qualcomm.com
	service_description		Cron Scheduling Daemon
	use				xiwizard_nrpe_service
	check_command			check_nrpe!check_init_service!-a 'cron'
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			linux-server
	register			1
	}	

define service {
	host_name			dmg-dev.qualcomm.com
	service_description		Current Load
	use				generic-service
	check_command			check_nrpe!check_load!
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			nrpe
	register			1
	}	

define service {
	host_name			dmg-dev.qualcomm.com
	service_description		Current Users
	use				generic-service
	check_command			check_nrpe!check_users!
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			nrpe
	register			1
	}	

define service {
	host_name			dmg-dev.qualcomm.com
	service_description		Load
	use				xiwizard_nrpe_service
	check_command			check_nrpe!check_load!-a '-w 15,10,5 -c 30,20,10'
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			linux-server
	register			1
	}	

define service {
	host_name			dmg-dev.qualcomm.com
	service_description		Memory Usage
	use				xiwizard_nrpe_service
	check_command			check_nrpe!check_mem!-a '-w 20 -c 10'
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			linux-server
	register			1
	}	

define service {
	host_name			dmg-dev.qualcomm.com
	service_description		Node Javascript process
	use				xiwizard_nrpe_service
	check_command			check_nrpe!check_services!-a 'node'
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			linux-server
	register			1
	}	

define service {
	host_name			dmg-dev.qualcomm.com
	service_description		Open Files
	use				xiwizard_nrpe_service
	check_command			check_nrpe!check_open_files!-a '-w 30 -c 50'
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			linux-server
	register			1
	}	

define service {
	host_name			dmg-dev.qualcomm.com
	service_description		Ping
	use				xiwizard_linuxserver_ping_service
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			nrpe
	register			1
	}	

define service {
	host_name			dmg-dev.qualcomm.com
	service_description		Swap Usage
	use				xiwizard_nrpe_service
	check_command			check_nrpe!check_swap!-a '-w 50 -c 20'
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			linux-server
	register			1
	}	

define service {
	host_name			dmg-dev.qualcomm.com
	service_description		Total Processes
	use				generic-service
	check_command			check_nrpe!check_total_procs!
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			nrpe
	register			1
	}	

define service {
	host_name			dmg-dev.qualcomm.com
	service_description		Users
	use				xiwizard_nrpe_service
	check_command			check_nrpe!check_users!-a '-w 5 -c 10'
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			linux-server
	register			1
	}	

###############################################################################
#
# Service configuration file
#
# END OF FILE
#
###############################################################################

Re: Linux Ubuntu install - troubleshoot services

Post by Box293 »

gbnag wrote:

define service {
	host_name			dmg-dev.qualcomm.com
	service_description		Current Load
	use				generic-service
	check_command			check_nrpe!check_load!
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			nrpe
	register			1
	}	

define service {
	host_name			dmg-dev.qualcomm.com
	service_description		Current Users
	use				generic-service
	check_command			check_nrpe!check_users!
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			nrpe
	register			1
	}

Neither of these checks has anything defined in $ARG2$, which is why they are failing.

For the "Current Load" check, in CCM edit the service and add the following to the $ARG2$ field:
-a '-w 15,10,5 -c 30,20,10'

For the "Current Users" check, in CCM edit the service and add the following to the $ARG2$ field:
-a '-w 5 -c 10'

Apply Config when done.
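
To see why the empty $ARG2$ matters, it helps to look at the command line the XI server ends up running. The sketch below assumes the check_nrpe command in XI expands to check_nrpe -H <host> -t 30 -c $ARG1$ $ARG2$, which matches the manual test earlier in this thread but is not confirmed here; the function name is hypothetical.

```shell
# Hypothetical helper: print the check_nrpe command line that results
# from a service's $ARG1$ (remote command) and $ARG2$ (extra arguments).
# The "-t 30" and argument order are assumptions based on the manual
# test shown earlier in the thread.
build_check_nrpe() {
    host=$1
    arg1=$2
    arg2=$3
    printf '%s -H %s -t 30 -c %s %s\n' \
        /usr/local/nagios/libexec/check_nrpe "$host" "$arg1" "$arg2"
}

# With an empty $ARG2$ no "-a ..." is sent to the agent:
build_check_nrpe dmg-dev check_load ""
# With the suggested value the thresholds are passed along:
build_check_nrpe dmg-dev check_load "-a '-w 15,10,5 -c 30,20,10'"
```

In the first case nothing is passed after -c check_load, so on the Ubuntu side the $ARG1$ in command[check_load]=/usr/local/nagios/libexec/check_load $ARG1$ stays empty and the plugin runs without any thresholds.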

Re: Linux Ubuntu install - troubleshoot services

Post by gbnag »

That worked, thanks.

Now the values of 'Current Users' and 'Users' are identical, as are the values of 'Current Load' and 'Load'.
We are just trying to understand why, by default, there are two services reporting the same values, except that 'Current Users' and 'Current Load' had issues reporting before we made the suggested adjustments.

Also, could you guide us in getting Total Processes to show green?

Re: Linux Ubuntu install - troubleshoot services

Post by lmiltchev »

gbnag wrote:
Now the values of 'Current Users' and 'Users' are identical, as are the values of 'Current Load' and 'Load'.
We are just trying to understand why, by default, there are two services reporting the same values, except that 'Current Users' and 'Current Load' had issues reporting before we made the suggested adjustments.

The reason I asked "Did you add "Current Load", and "Current Users" manually?" is that these services are NOT included by default in the "Linux Server" wizard (not for Ubuntu anyway). See an example of the "default" services added by the wizard below:
example01.PNG
Did you select "Ubuntu" from the "Linux Distribution" drop-down menu in Step 1 of the "Linux Server" wizard?

gbnag wrote:
Also, could you guide us in getting Total Processes to show green?

You will need to modify the "Total Processes" service in CCM. Add:
-a '-w 150 -c 250'
to the $ARG2$ field, then save and apply the configuration.

Note: 150 & 250 are the "default" WARNING & CRITICAL threshold values. Feel free to use whatever values are relevant in your environment.
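
As a rough guide for choosing values, the sketch below shows how plain numeric warning and critical thresholds are typically interpreted for a process count: the state changes once the count goes above the threshold. This is a simplified illustration written for this thread, not the plugin's own code, and the function name is hypothetical.

```shell
# Simplified sketch: classify a process count against plain numeric
# warning/critical thresholds (count above the threshold triggers it).
procs_state() {
    count=$1
    warn=$2
    crit=$3
    if [ "$count" -gt "$crit" ]; then
        echo CRITICAL
    elif [ "$count" -gt "$warn" ]; then
        echo WARNING
    else
        echo OK
    fi
}

# Usage: compare a process count you observed on the host with the
# default thresholds, e.g.
#   procs_state 180 150 250
```

If the host normally runs more than 150 processes, raising both values keeps the service from sitting in a permanent warning state.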