process monitoring

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
lafargeuser
Posts: 341
Joined: Thu Sep 27, 2012 12:23 am

Post by lafargeuser »

Can we have a script in Nagios that searches for the PIDs of all processes named "java" and monitors only those PIDs that belong to the application?
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: process monitoring

Post by scottwilkerson »

You should be able to do this with check_procs and the -p flag once you know all the PPIDs.

Code:

[root@localhost libexec]# ./check_procs -h
check_procs v2019 (nagios-plugins 1.4.13)
Copyright (c) 1999 Ethan Galstad <[email protected]>
Copyright (c) 2000-2008 Nagios Plugin Development Team
        <[email protected]>

Checks all processes and generates WARNING or CRITICAL states if the specified
metric is outside the required threshold ranges. The metric defaults to number
of processes.  Search filters can be applied to limit the processes to check.


Usage: check_procs -w <range> -c <range> [-m metric] [-s state] [-p ppid]
 [-u user] [-r rss] [-z vsz] [-P %cpu] [-a argument-array]
 [-C command] [-t timeout] [-v]

Options:
 -h, --help
    Print detailed help screen
 -V, --version
    Print version information
 -w, --warning=RANGE
   Generate warning state if metric is outside this range
 -c, --critical=RANGE
   Generate critical state if metric is outside this range
 -m, --metric=TYPE
  Check thresholds against metric. Valid types:
  PROCS   - number of processes (default)
  VSZ     - virtual memory size
  RSS     - resident set memory size
  CPU     - percentage cpu
  ELAPSED - time elapsed in seconds
 -t, --timeout=INTEGER
    Seconds before connection times out (default: 10)
 -v, --verbose
    Extra information. Up to 3 verbosity levels

Filters:
 -s, --state=STATUSFLAGS
   Only scan for processes that have, in the output of `ps`, one or
   more of the status flags you specify (for example R, Z, S, RS,
   RSZDT, plus others based on the output of your 'ps' command).
 -p, --ppid=PPID
   Only scan for children of the parent process ID indicated.
 -z, --vsz=VSZ
   Only scan for processes with vsz higher than indicated.
 -r, --rss=RSS
   Only scan for processes with rss higher than indicated.
 -P, --pcpu=PCPU
   Only scan for processes with pcpu higher than indicated.
 -u, --user=USER
   Only scan for processes with user name or ID indicated.
 -a, --argument-array=STRING
   Only scan for processes with args that contain STRING.
 --ereg-argument-array=STRING
   Only scan for processes with args that contain the regex STRING.
 -C, --command=COMMAND
   Only scan for exact matches of COMMAND (without path).

RANGEs are specified 'min:max' or 'min:' or ':max' (or 'max'). If
specified 'max:min', a warning status will be generated if the
count is inside the specified range

This plugin checks the number of currently running processes and
generates WARNING or CRITICAL states if the process count is outside
the specified threshold ranges. The process count can be filtered by
process owner, parent process PID, current state (e.g., 'Z'), or may
be the total number of running processes

Examples:
 check_procs -w 2:2 -c 2:1024 -C portsentry
  Warning if not two processes with command name portsentry.
  Critical if < 2 or > 1024 processes

 check_procs -w 10 -a '/usr/local/bin/perl' -u root
  Warning alert if > 10 processes with command arguments containing
  '/usr/local/bin/perl' and owned by root

 check_procs -w 50000 -c 100000 --metric=VSZ
  Alert if vsz of any processes over 50K or 100K

 check_procs -w 10 -c 20 --metric=CPU
  Alert if cpu of any processes over 10%% or 20%%

Send email to [email protected] if you have questions
regarding use of this software. To submit patches or suggest improvements,
send email to [email protected]
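
For the original question (watch only the java processes that belong to one application), the filtering idea can be sketched in plain shell. This is an illustration under assumptions, not part of nagios-plugins: the function names `filter_app_java` and `report_status` are made up for this example, and it assumes the application path appears somewhere in the process command line.

```shell
#!/bin/sh
# Sketch only: pick out "java" processes whose command line mentions one
# application path, then report in Nagios plugin style.
# Function names and the matching rule are illustrative assumptions.

# Reads "PID ARGS..." lines on stdin (the shape of `ps -eo pid=,args=`)
# and prints the PIDs of java processes whose command line contains $1.
filter_app_java() {
    app_path=$1
    awk -v path="$app_path" '
        {
            # Field 2 is the executable; drop any directory part.
            n = split($2, parts, "/")
            if (parts[n] == "java" && index($0, path) > 0) print $1
        }'
}

# Counts the PIDs arriving on stdin, prints a one-line status and
# returns the Nagios exit code (0 = OK, 2 = CRITICAL).
report_status() {
    count=$(grep -c . || true)
    if [ "$count" -ge 1 ]; then
        echo "OK - $count java process(es) found"
        return 0
    fi
    echo "CRITICAL - no matching java process"
    return 2
}
```

It could be wired together as `ps -eo pid=,args= | filter_app_java /path/to/app | report_status`, where `/path/to/app` is a placeholder. check_procs may cover the same case without a custom script by combining the `-C java` and `-a` filters listed in the help above.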
Former Nagios employee

Re: process monitoring

Post by lafargeuser »

The servers are already being monitored. I just need to add these processes, and frankly speaking I have no idea how to add a process in XI.
I tried using the CCM but I am not sure which fields I have to fill in. Since it is a production server, I can't experiment much.

I need guidance on this.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: process monitoring

Post by mguthrie »

See the doc below on Managing Plugins in Nagios XI. It will walk you through the process.
http://assets.nagios.com/downloads/nagi ... hp#plugins

Re: process monitoring

Post by lafargeuser »

Hi,

There are 20 servers on which I will be monitoring Java processes. I tried deleting one of the hosts and reconfiguring it with the Java process, and it worked.
But there are a number of servers, and I can't go and reconfigure each of them. So I tried the CCM as well, but it didn't work.
How can I add those processes using CCM-Service-Add service? For your reference, the process is below.

/u01/app/webmethods/wmpyis01/jvm

Re: process monitoring

Post by scottwilkerson »

Can you post the config for the one you got working? Then I can walk you through how to set up the others in the CCM.

Configure -> CCM -> Services
Click the disk icon next to the item that is working.
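
As a rough idea of where this ends up: Nagios accepts a comma-separated list in `host_name`, so one service definition can cover several servers. The sketch below only prints such a definition. The function name `emit_jvm_service`, the host names, and the NRPE command and argument passed to it are placeholders, and it assumes an `xiwizard_nrpe_service` template and a `check_nrpe` command exist on the XI server.

```shell
#!/bin/sh
# Sketch only: print one Nagios service definition that applies the same
# NRPE check to several hosts. All values passed in are placeholders.

# usage: emit_jvm_service "host1,host2,..." "nrpe_command" "argument"
emit_jvm_service() {
    hosts=$1
    nrpe_cmd=$2
    arg=$3
    # Unquoted heredoc delimiter, so the shell variables are expanded.
    cat <<EOF
define service {
    host_name               $hosts
    service_description     JVM
    use                     xiwizard_nrpe_service
    check_command           check_nrpe!$nrpe_cmd!-a '$arg'
    register                1
    }
EOF
}
```

For example, `emit_jvm_service 'web01,web02' check_services '/path/to/jvm'` prints a single block for both hosts. In XI the CCM owns the generated config files, so the equivalent step there would be assigning the additional hosts to the one working service rather than editing files by hand.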

Re: process monitoring

Post by lafargeuser »

I have now added the JVM process successfully under Service Management. I can also see the JVM service, and its Sync Status is Synced.
However, I can't see the JVM service under that particular host.
I have restarted the Nagios service and also applied the configuration.

I am pasting the configuration file below for your reference, as I was unable to attach it.

Code:

###############################################################################
#
# Service configuration file
#
# Created by: Nagios QL Version 3.0.3
# Date:	      2012-11-02 07:11:57
# Version:    Nagios 3.x config file
#
# --- DO NOT EDIT THIS FILE BY HAND --- 
# Nagios QL will overwite all manual settings during the next update
#
###############################################################################

define service {
	host_name			lwmvidb001
	service_description		/ Disk Usage
	use				xiwizard_nrpe_service
	check_command			check_nrpe!check_disk!-a '-w 20% -c 10% -p /'
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			linux-server
	register			1
	}	

define service {
	host_name			lwmvidb001
	service_description		/u01 Disk Usage
	use				xiwizard_nrpe_service
	check_command			check_nrpe!check_disk!-a '-w 20% -c 10% -p /u01'
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			linux-server
	register			1
	}	

define service {
	host_name			lwmvidb001
	service_description		/u02 Disk Usage
	use				xiwizard_nrpe_service
	check_command			check_nrpe!check_disk!-a '-w 20% -c 10% -p /u02'
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			linux-server
	register			1
	}	

define service {
	host_name			lwmvidb001
	service_description		CPU Stats
	use				xiwizard_nrpe_service
	check_command			check_nrpe!check_cpu_stats!-a '-w 85 -c 95'
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			linux-server
	register			1
	}	

define service {
	host_name			lwmvidb001
	service_description		JVM
	use				xiwizard_nrpe_service
	check_command			check_nrpe!check_services!-a '/u01/app/webmethods/wmdvis01/jvm'
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			linux-server
	register			1
	}	

define service {
	host_name			lwmvidb001
	service_description		Memory Usage
	use				xiwizard_nrpe_service
	check_command			check_nrpe!check_mem!-a '-w 20 -c 10'
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			linux-server
	register			1
	}	

define service {
	host_name			lwmvidb001
	service_description		Ping
	use				xiwizard_linuxserver_ping_service
	max_check_attempts		5
	check_interval			5
	retry_interval			1
	check_period			xi_timeperiod_24x7
	notification_interval		60
	notification_period		xi_timeperiod_24x7
	contacts			nagiosadmin
	_xiwizard			linux-server
	register			1
	}	

###############################################################################
#
# Service configuration file
#
# END OF FILE
#
###############################################################################

Re: process monitoring

Post by scottwilkerson »

Are you saying JVM doesn't show up under lwmvidb001?

Re: process monitoring

Post by lafargeuser »

Yes. I can see the JVM service under Service Management, and I have also checked the configuration.
But I am not able to see the JVM service in the Services tab of the main XI dashboard.

Re: process monitoring

Post by scottwilkerson »

Can you run the following?

Code:

service nagios stop
service ndo2db stop
killall -9 nagios
killall -9 ndo2db 
service nagios start
service ndo2db start
Also, have you set up a RAM disk on this machine?