Nagios XI monitoring engine down

rajasegar · Post by **rajasegar** » Sun Jun 08, 2014 8:09 pm

Nagios XI 2012R2.9
RHEL 6.5 x64
Manual Install
Firefox 23

1) The Nagios monitoring engine went down unexpectedly over the weekend.

This is the last entry in the NPCD log. I suppose the Nagios monitoring engine died at about the same time.
[06-08-2014 00:13:06] NPCD: WARN: MAX load reached: load 12.310000/10.000000 at i=1
Please advise how to investigate this issue.

Restarting the nagios and npcd services did not solve all the problems: I could not run any immediate checks, and performance data was not updating.
I had to reboot the machine to get back to normal.

2) How can I monitor the parameters shown in this dashboard from the command line?

Attachment: 09-06-2014 09-08-08 AM.png

slansing · Post by **slansing** » Mon Jun 09, 2014 9:18 am

How did you shut down the machine? If you did it incorrectly, you may need to follow this now as well:

http://assets.nagios.com/downloads/nagi ... tabase.pdf

As far as performance data not being graphed at the point in time you are referencing, that is because you hit NPCD's load limit. That limit is in place so you don't allow NPCD to eat all of your resources trying to crunch on performance data.
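To find out how often and how far the load limit was exceeded, the NPCD warning lines can be pulled out of the log. The following is only a minimal sketch built around the single log line quoted in this thread. The function name `parse_npcd_load_warning` is hypothetical, and reading the timestamp as month-day-year is an assumption based on the date of the original post.

```python
import re
from datetime import datetime

# Pattern modelled on the one NPCD warning line quoted in this thread.
# Other NPCD messages are not covered.
NPCD_LOAD_WARNING = re.compile(
    r"^\[(?P<ts>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2})\] NPCD: WARN: "
    r"MAX load reached: load (?P<load>\d+\.\d+)/(?P<limit>\d+\.\d+) at i=\d+"
)


def parse_npcd_load_warning(line):
    """Return the time, load and limit from a 'MAX load reached' line, or None."""
    match = NPCD_LOAD_WARNING.match(line.strip())
    if match is None:
        return None
    return {
        # Assumption: the log timestamp is month-day-year.
        "time": datetime.strptime(match.group("ts"), "%m-%d-%Y %H:%M:%S"),
        "load": float(match.group("load")),
        "limit": float(match.group("limit")),
    }


line = "[06-08-2014 00:13:06] NPCD: WARN: MAX load reached: load 12.310000/10.000000 at i=1"
event = parse_npcd_load_warning(line)
print(event["time"], event["load"], event["limit"])
```

Running the same function over every line of the NPCD log would show whether the limit was hit once or repeatedly before the engine stopped.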

rajasegar · Post by **rajasegar** » Mon Jun 09, 2014 6:56 pm

I assume the Nagios XI installation scripts added proper shutdown scripts for shutdowns/reboots, right?
Anyway, my XI DB is offloaded to another VM, and it is 95% - 99% idle most of the time.
There is not a single error message in /var/log/mysqld.log.

1) I still want to find out why my monitoring engine went down.
2) How can I monitor the monitoring engine statistics?

Thanks.

scottwilkerson · Post by **scottwilkerson** » Tue Jun 10, 2014 11:12 am

Nagios XI comes with a wizard that can check XI systems. You could run it for localhost to get the command.

It uses the check_nagiosxiserver.php plugin, which takes the username and backend ticket found in the Backend API Component:

Admin -> Manage Components -> Backend API Component -> Settings

check_nagiosxiserver - Copyright (c) 2010 Nagios Enterprises, LLC.
Portions Copyright(c) others (see source code).

Usage:
check_nagiosxiserver.php <option>

Options:
    --address=<addres>      The address of the Nagios XI server
    --url=<url>             The URL used to access the Nagios XI web interface
    --username=<username>   The username used for accessing the server
    --ticket=<ticket>       The ticket used for accessing the server
    --timeout=<seconds>     Seconds before plugin times out (default=<? echo $timeout;?>)
    --debug=<0/1>           Enables/disables debugging output
    --mode=<mode>           Operating mode of the plugin. Valid modes include:
                                daemons   Checks the status of the core Nagios XI daemons to ensure
                                          they're running properly.
                                jobs      Checks the status of the core Nagios XI jobs to ensure
                                          they're running properly.
                                iowait    Checks the I/O wait CPU statistics.
                                load      Checks the 1,5,15 minutes load statistics.
    --warn=<warning>        The warning values used for some modes (iowait, load)
    --crit=<critical>       The critical values used for some modes (iowait, load)

This plugin checks the status of a remote Nagios XI server.
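As a rough illustration of how the options above fit together, here is a small sketch that assembles an argument list for the plugin. It only builds the command and does not run it. The helper name `build_xi_check_command` and every value in the example (host name, URL, username, ticket) are placeholders, not real settings. The format of the warning and critical values is not described in the help text, so they are passed through unchanged.

```python
# Modes listed in the plugin's help output.
VALID_MODES = ("daemons", "jobs", "iowait", "load")


def build_xi_check_command(address, url, username, ticket, mode,
                           warn=None, crit=None, timeout=None):
    """Build an argument list for check_nagiosxiserver.php (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError("unknown mode: %s" % mode)
    argv = [
        "check_nagiosxiserver.php",  # the plugin's install path is not given in the thread
        "--address=%s" % address,
        "--url=%s" % url,
        "--username=%s" % username,
        "--ticket=%s" % ticket,
        "--mode=%s" % mode,
    ]
    if timeout is not None:
        argv.append("--timeout=%s" % timeout)
    # Per the help text, --warn and --crit apply only to some modes (iowait, load).
    if warn is not None:
        argv.append("--warn=%s" % warn)
    if crit is not None:
        argv.append("--crit=%s" % crit)
    return argv


# Placeholder values only; take the real username and ticket from the
# Backend API Component settings.
command = build_xi_check_command(
    address="xi.example.com",
    url="http://xi.example.com/nagiosxi/",
    username="USERNAME",
    ticket="TICKET",
    mode="daemons",
)
print(" ".join(command))
```

The wizard mentioned above generates the actual command for a given server, so treat this only as a way to see which option carries which value.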

rajasegar · Post by **rajasegar** » Tue Jun 10, 2014 6:37 pm

Thanks. I will check this out. Do you have anything similar for Nagios Core?

scottwilkerson · Post by **scottwilkerson** » Wed Jun 11, 2014 9:40 am

Maybe check_nagios will accomplish all you need:

Code:

# /usr/local/nagios/libexec/check_nagios -h
check_nagios v2.0.2 (nagios-plugins 2.0.2)
Copyright (c) 1999-2014 Nagios Plugin Development Team
        <[email protected]>

This plugin checks the status of the Nagios process on the local machine
The plugin will check to make sure the Nagios status log is no older than
the number of minutes specified by the expires option.
It also checks the process table for a process matching the command argument.


Usage:
check_nagios -F <status log file> -t <timeout_seconds> -e <expire_minutes> -C <process_string>

Options:
 -h, --help
    Print detailed help screen
 -V, --version
    Print version information
 --extra-opts=[section][@file]
    Read options from an ini file. See
    https://www.nagios-plugins.org/doc/extra-opts.html
    for usage and examples.
 -F, --filename=FILE
    Name of the log file to check
 -e, --expires=INTEGER
    Minutes aging after which logfile is considered stale
 -C, --command=STRING
    Substring to search for in process arguments
 -t, --timeout=INTEGER
    Timeout for the plugin in seconds
 -v, --verbose
    Show details for command-line debugging (Nagios may truncate output)

Examples:
 check_nagios -t 20 -e 5 -F /usr/local/nagios/var/status.log -C /usr/local/nagios/bin/nagios

Send email to [email protected] if you have questions regarding use
of this software. To submit patches or suggest improvements, send email to
[email protected]
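The help text describes two checks: the status log must not be older than the number of minutes given by the expires option, and a matching process must exist. The freshness part can be sketched as follows. This is not the plugin's source code. It is an illustration that assumes the file's modification time is a good enough measure of age, and `status_file_is_stale` is a hypothetical name.

```python
import os
import tempfile
import time


def status_file_is_stale(path, expire_minutes, now=None):
    """Return True if the file at `path` was last modified more than
    `expire_minutes` minutes ago (assumption: mtime reflects freshness)."""
    if now is None:
        now = time.time()
    age_seconds = now - os.path.getmtime(path)
    return age_seconds > expire_minutes * 60


# Demonstration with a temporary file whose modification time is set
# to ten minutes in the past.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    demo_path = handle.name
ten_minutes_ago = time.time() - 600
os.utime(demo_path, (ten_minutes_ago, ten_minutes_ago))

print(status_file_is_stale(demo_path, expire_minutes=5))   # older than 5 minutes
print(status_file_is_stale(demo_path, expire_minutes=20))  # newer than 20 minutes
os.remove(demo_path)
```

A status file that stops being updated is a sign that the monitoring engine has stalled even if its process is still listed, which is why the plugin combines the two checks.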

rajasegar · Post by **rajasegar** » Wed Jun 11, 2014 6:40 pm

Yes, I found everything I need. Thanks.
Please close this case.
