Nagios XI monitoring engine down
Posted: Sun Jun 08, 2014 8:09 pm
by rajasegar
Nagios XI 2012R2.9
RHEL 6.5 x64
Manual Install
Firefox 23
1) The Nagios monitoring engine went down unexpectedly over the weekend.
This is the last entry in the NPCD log. I suppose the Nagios monitoring engine died at the same time.
[06-08-2014 00:13:06] NPCD: WARN: MAX load reached: load 12.310000/10.000000 at i=1
Please advise how to investigate this issue.
Restarting the nagios services and npcd did not solve all the problems: I still could not run any immediate checks, and performance data was not updating.
I had to reboot the machine to get back to normal.
2) How can I monitor the parameters in this dashboard via the command line?
(Attachment: 09-06-2014 09-08-08 AM.png)
Re: Nagios XI monitoring engine down
Posted: Mon Jun 09, 2014 9:18 am
by slansing
How did you shut down the machine? If you did it incorrectly, you may need to follow this now as well:
http://assets.nagios.com/downloads/nagi ... tabase.pdf
As for performance data not being graphed at the point in time you are referencing, that is because you hit NPCD's load limit. That limit is in place so that NPCD cannot eat all of your resources while processing performance data.
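To see how often the limit is being hit, the NPCD log can be scanned for that warning. The following is a minimal Python sketch, assuming the warning always has the shape of the log line quoted in the first post; the function name `parse_max_load` is made up for illustration. In stock PNP4Nagios the limit usually comes from a `load_threshold` setting in npcd.cfg, but that is an assumption to verify on your own install.

```python
import re
from typing import Optional, Tuple

# Matches the warning as it appears in the log line quoted in this thread.
# Assumption: other NPCD versions may word the message differently.
_MAX_LOAD = re.compile(
    r"NPCD: WARN: MAX load reached: "
    r"load (?P<load>\d+(?:\.\d+)?)/(?P<limit>\d+(?:\.\d+)?)"
)


def parse_max_load(line: str) -> Optional[Tuple[float, float]]:
    """Return (current_load, limit) for a MAX-load warning, else None."""
    match = _MAX_LOAD.search(line)
    if match is None:
        return None
    return float(match.group("load")), float(match.group("limit"))


if __name__ == "__main__":
    sample = (
        "[06-08-2014 00:13:06] NPCD: WARN: MAX load reached: "
        "load 12.310000/10.000000 at i=1"
    )
    print(parse_max_load(sample))
```

For the quoted line this yields a load of 12.31 against a limit of 10.0, which is why NPCD stopped processing performance data at that moment.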
Re: Nagios XI monitoring engine down
Posted: Mon Jun 09, 2014 6:56 pm
by rajasegar
I am sure the Nagios XI installation scripts added proper scripts for shutdowns and reboots, right?
Anyway, my XI DB is offloaded to another VM, and it is 95% - 99% idle most of the time.
There is not a single error message in /var/log/mysqld.log.
1) I still want to find out why my monitoring engine went down.
2) How can I monitor the monitoring engine statistics?
Thanks.
Re: Nagios XI monitoring engine down
Posted: Tue Jun 10, 2014 11:12 am
by scottwilkerson
Nagios XI comes with a wizard that can check XI systems; you could run it against localhost to get the command.
It uses the check_nagiosxiserver.php plugin, which uses the username and backend ticket found in the Backend API Component settings:
Admin -> Manage Components -> Backend API Component -> Settings
check_nagiosxiserver - Copyright (c) 2010 Nagios Enterprises, LLC.
Portions Copyright(c) others (see source code).

Usage:
    check_nagiosxiserver.php <option>

Options:
    --address=<addres>      The address of the Nagios XI server
    --url=<url>             The URL used to access the Nagios XI web interface
    --username=<username>   The username used for accessing the server
    --ticket=<ticket>       The ticket used for accessing the server
    --timeout=<seconds>     Seconds before plugin times out (default=<? echo $timeout;?>)
    --debug=<0/1>           Enables/disables debugging output
    --mode=<mode>           Operating mode of the plugin. Valid modes include:
        daemons   Checks the status of the core Nagios XI daemons to ensure
                  they're running properly.
        jobs      Checks the status of the core Nagios XI jobs to ensure
                  they're running properly.
        iowait    Checks the I/O wait CPU statistics.
        load      Checks the 1,5,15 minutes load statistics.
    --warn=<warning>        The warning values used for some modes (iowait, load)
    --crit=<critical>       The critical values used for some modes (iowait, load)

This plugin checks the status of a remote Nagios XI server.
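As a rough illustration of how those options fit together, here is a small Python sketch that assembles a command line for one of the listed modes and maps the plugin's exit code to a state name. The install path, host, URL, username, and ticket below are placeholders, not values from this thread, and it is an assumption that the plugin can be executed directly rather than through the PHP interpreter. The exit code mapping is the standard Nagios plugin convention.

```python
import shlex
from typing import List

# Hypothetical install path; adjust to where the plugin lives on your server.
PLUGIN = "/usr/local/nagios/libexec/check_nagiosxiserver.php"

# Modes listed in the plugin's help text.
VALID_MODES = ("daemons", "jobs", "iowait", "load")

# Standard Nagios plugin exit codes.
EXIT_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def build_command(mode: str, address: str, url: str,
                  username: str, ticket: str, timeout: int = 10) -> List[str]:
    """Build an argument list using only options shown in the help output."""
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported mode: {mode}")
    return [
        PLUGIN,
        f"--address={address}",
        f"--url={url}",
        f"--username={username}",
        f"--ticket={ticket}",
        f"--mode={mode}",
        f"--timeout={timeout}",
    ]


def describe_exit(code: int) -> str:
    """Translate a plugin exit code into its Nagios state name."""
    return EXIT_STATES.get(code, "UNKNOWN")


if __name__ == "__main__":
    cmd = build_command(
        "daemons",
        "nagiosxi.example.com",                   # placeholder address
        "http://nagiosxi.example.com/nagiosxi/",  # placeholder URL
        "nagiosadmin",                            # placeholder username
        "YOUR_BACKEND_TICKET",                    # placeholder ticket
    )
    print(shlex.join(cmd))
```

The iowait and load modes also take --warn and --crit, but the help text does not show the expected value format, so they are left out here; the command generated by the wizard is the safer reference for those.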
Re: Nagios XI monitoring engine down
Posted: Tue Jun 10, 2014 6:37 pm
by rajasegar
Thanks. I will check this out. Do you have anything similar for Nagios Core?
Re: Nagios XI monitoring engine down
Posted: Wed Jun 11, 2014 9:40 am
by scottwilkerson
Maybe check_nagios will accomplish all you need:
# /usr/local/nagios/libexec/check_nagios -h
check_nagios v2.0.2 (nagios-plugins 2.0.2)
Copyright (c) 1999-2014 Nagios Plugin Development Team
<[email protected]>
This plugin checks the status of the Nagios process on the local machine
The plugin will check to make sure the Nagios status log is no older than
the number of minutes specified by the expires option.
It also checks the process table for a process matching the command argument.

Usage:
    check_nagios -F <status log file> -t <timeout_seconds> -e <expire_minutes> -C <process_string>

Options:
    -h, --help
        Print detailed help screen
    -V, --version
        Print version information
    --extra-opts=[section][@file]
        Read options from an ini file. See
        https://www.nagios-plugins.org/doc/extra-opts.html
        for usage and examples.
    -F, --filename=FILE
        Name of the log file to check
    -e, --expires=INTEGER
        Minutes aging after which logfile is considered stale
    -C, --command=STRING
        Substring to search for in process arguments
    -t, --timeout=INTEGER
        Timeout for the plugin in seconds
    -v, --verbose
        Show details for command-line debugging (Nagios may truncate output)

Examples:
    check_nagios -t 20 -e 5 -F /usr/local/nagios/var/status.log -C /usr/local/nagios/bin/nagios

Send email to [email protected] if you have questions regarding use
of this software. To submit patches or suggest improvements, send email to
[email protected]
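The freshness test that the help text describes (the status file must be no older than the number of minutes given with -e) can be sketched in a few lines of Python. This only illustrates the idea and is not the plugin's actual implementation; the function name `status_file_is_fresh` is made up for this example. Note that on many installs the status file is status.dat rather than status.log, so check the status_file setting in nagios.cfg before pointing anything at it.

```python
import os
import tempfile
import time
from typing import Optional


def status_file_is_fresh(path: str, expire_minutes: int,
                         now: Optional[float] = None) -> bool:
    """Return True if path exists and was modified within expire_minutes."""
    if now is None:
        now = time.time()
    try:
        mtime = os.stat(path).st_mtime
    except OSError:
        # A missing or unreadable status file counts as stale.
        return False
    return (now - mtime) <= expire_minutes * 60


if __name__ == "__main__":
    # Demo against a throwaway file instead of a real Nagios status file.
    with tempfile.NamedTemporaryFile() as handle:
        print(status_file_is_fresh(handle.name, expire_minutes=5))
```

A stale file usually means the Nagios process has stopped writing status updates, which is the same symptom as the engine going down silently.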
Re: Nagios XI monitoring engine down
Posted: Wed Jun 11, 2014 6:40 pm
by rajasegar
Yes, I found everything I need. Thanks.
Please close this case.