Nagios XI:FAQs

From Nagios Support Wiki

Revision as of 09:12, 30 March 2011 by Mguthrie (Talk | contribs) ("Event Data Is Stale")

Answers to Frequently Asked Questions (FAQs) regarding Nagios XI can be found here.


FAQs

What Are FAQs? Frequently Asked Questions, or "FAQs", are answers to questions that are frequently asked in some context.


Common Problems - Try These Solutions First

Follow these steps if you are encountering problems with Nagios XI. These actions resolve many commonly reported problems.

  • Clear your browser's cache to get the newest XI javascript code.
    Instructions on how to do it.
  • How To Reset Security Credentials (if performance graphs aren't displayed)
    Select the Reset Security Credentials option in the Admin section and click Update.
  • How To Reset File Permissions (if configuration changes are not taking effect)
    Instructions how.
  • Debugging Configuration Change Problems (if configuration changes are not taking effect)
    Write configuration file tool.


Supported Distributions

Nagios XI is currently supported on the following Linux distributions for both 32-bit and 64-bit installations:

  • CentOS 5 (Recommended)
  • RHEL 5

Although we plan to support RHEL 6 in the near future, we do not currently offer support for it.


Capabilities

Is Nagios XI capable of Distributed Monitoring?

Yes it is! There are multiple options for Distributed Monitoring with Nagios.

Nagios Fusion *New*

Nagios Core (the underlying monitoring engine) can be configured for distributed monitoring. For more information, read the Nagios Core documentation on distributed monitoring.

Distributed Monitoring with DNX


Is it possible to use SMS alerts for a custom SMS gateway?

Yes. Nagios XI sends SMS alerts via email. Although we don't currently have a solution that lets users define custom SMS gateways, you can work around this by defining a contact whose email address is the carrier's email-to-SMS gateway address for the phone. Email address examples are as follows:

 <phonenumber>@smsgateway.domain
 1235551234@messaging.sprintpcs.com (send SMS via sprint)
 1235551234@tmomail.net (send SMS via t-mobile)
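The address pattern above can be sketched as a small shell helper. This is only an illustration: sms_address is a hypothetical name, not something shipped with Nagios XI, and the gateway domain is whatever your carrier publishes.

```shell
# Build an email-to-SMS address from a phone number and a carrier gateway domain.
# sms_address is a hypothetical helper name, not part of Nagios XI.
sms_address() {
    # Keep digits only, so "123-555-1234" and "(123) 555 1234" both work.
    digits=$(printf '%s' "$1" | tr -cd '0-9')
    printf '%s@%s\n' "$digits" "$2"
}

sms_address "123-555-1234" "tmomail.net"
```

The resulting address is what you would enter as the contact's email address in Nagios XI.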

System Configuration Problems

Problems Using Nagios XI With Proxies

We do not officially support Nagios XI when you install and use proxy software that restricts traffic to or from the Nagios XI server. There are several reasons for this. First, Nagios XI requires external access for package installation and updates. Package installation and updates may not work when proxies are used. Additionally, the Nagios XI code makes several internal HTTP calls to the local Nagios XI server to import configuration data, apply configuration changes, process AJAX requests, etc. These functions may not work properly when you deploy a proxy, which would result in a non-functional Nagios XI installation.

Two things need to be configured for the XI installation to work with a proxy: the yum and wget configurations. Configure both before starting the installation process.

In /etc/yum.conf :

 proxy=http://someproxyserver:port/ # Shouldn't need to be quoted, remember the trailing slash
 proxy_username=myname  # The username you authenticate to your proxy with, if applicable
 proxy_password=mypass  # The password you provide to your proxy, if applicable

In /etc/wgetrc :

 http_proxy=http://myname:mypass@someproxyserver:port/ # All in one string this time
 no_proxy=localhost,127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16 # Hosts to exclude from proxying

Quoting is not needed (or helpful) in any of these, but if you have special characters in passwords (especially : or @) and are having problems, you probably need to escape them with backslashes.
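Before starting the installer, it may help to confirm that both files actually contain an uncommented proxy line. The following is a minimal sketch; has_setting is a hypothetical helper name, and it only reads the files.

```shell
# Report whether a setting is present (and not commented out) in a
# yum.conf-style or wgetrc-style file.
# has_setting is a hypothetical helper; it only reads the file.
has_setting() {
    # $1 = file, $2 = setting name (for example proxy or http_proxy)
    grep -Eq "^[[:space:]]*$2[[:space:]]*=" "$1"
}

for pair in "/etc/yum.conf:proxy" "/etc/wgetrc:http_proxy"; do
    file=${pair%%:*}
    key=${pair#*:}
    if [ -r "$file" ] && has_setting "$file" "$key"; then
        echo "$file: $key is set"
    else
        echo "$file: $key is missing"
    fi
done
```

This only checks that the lines exist. It does not verify that the proxy address or credentials are correct.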

Installation and Upgrade Problems

SourceGuardian Errors

After upgrading to 2009R1.2C, some users started getting an error about SourceGuardian. Add this line to your /etc/php.ini file:

 extension=ixed.5.1.lin

Once you make that change, restart Apache:

 service httpd restart

Resolving "DB Connect Error [nagiosxi]: Database connection failed"

This error has been traced to installing under GNOME: the PATH used to find the "service" command gets changed in a GNOME session. PATH needs to be set correctly so that the installation scripts starting with 3-dbservers will run correctly. You can test whether the path is set correctly by trying the following commands:

service httpd restart
service postgresql restart

The important thing is that your PATH includes the "sbin" directories. Normally it would look like this, although this isn't the only correct value possible:

/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
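The PATH check can also be scripted. The sketch below assumes that /sbin and /usr/sbin are the directories that matter on your system, which may not be true everywhere. path_has_sbin is a hypothetical helper name.

```shell
# Succeed only if every required sbin directory appears in a PATH string.
# path_has_sbin is a hypothetical helper; the directory list is an assumption.
path_has_sbin() {
    for dir in /sbin /usr/sbin; do
        case ":$1:" in
            *":$dir:"*) ;;
            *) echo "missing from PATH: $dir"; return 1 ;;
        esac
    done
    return 0
}

path_has_sbin "$PATH" || echo "add the sbin directories before running the installer"
```

Wrapping the string in colons lets the pattern match whole PATH entries, so /usr/local/sbin is not mistaken for /sbin.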
Resolving "NSP: Sorry Dave, I can't let you do that" Errors

Session protection was added in 2009R1.2C to prevent CSRF attacks. The code that does this caused some users to see this error. The problem was due to the user's browser caching older versions of the XI JavaScript code. To prevent this from happening, clear your browser's cache. This is typically done (in Firefox) by holding down the Shift key and clicking Reload. See other well-documented procedures on clearing the browser cache.

The other possible cause of this is that the XI server's time is out of sync with the web browser. Try the following:

 yum install ntp
 ntpdate time.nist.gov
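To judge whether the clocks are far enough apart to matter, you can compare Unix timestamps taken on the server and on the workstation. This is a rough sketch: clock_skew_ok is a hypothetical helper, and the 60-second default tolerance is an assumption, not a documented Nagios XI limit.

```shell
# Succeed if two Unix timestamps differ by no more than a tolerance in seconds.
# clock_skew_ok is a hypothetical helper; the 60-second default is an assumption.
clock_skew_ok() {
    # $1 = first timestamp, $2 = second timestamp, $3 = optional tolerance
    skew=$(( $1 - $2 ))
    if [ "$skew" -lt 0 ]; then
        skew=$(( -skew ))
    fi
    [ "$skew" -le "${3:-60}" ]
}

server_time=$(date +%s)   # run on the Nagios XI server
client_time=$(date +%s)   # in practice, the value read on the workstation
if clock_skew_ok "$server_time" "$client_time"; then
    echo "clocks agree"
else
    echo "clocks differ; sync with NTP"
fi
```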


If that still doesn't fix the error, then you may have to specify your timezone in your /etc/php.ini file. Newer releases of PHP require this setting for your server to reflect the correct system time and timezone. To change this setting, edit the /etc/php.ini file with the following line:

 date.timezone = Etc/GMT-13

Change the timezone to match your location. Valid zones are listed on the PHP Timezones page. After changing the setting, restart your Apache server:

 service httpd restart

"HTTP 500 Error"/"PHP Parse error - Unexpected $end"

For those doing manual installations: some of the tools embedded in Nagios XI use the PHP short tags feature, which is not necessarily enabled on all web servers by default. To fix this issue, locate your php.ini file (/etc/php.ini for CentOS installations) and verify that "short_open_tag" is set to "On". We intend to use full tags in future versions, but some components and addons may still use short tags, so we recommend leaving this setting "On".
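If you'd rather check the setting from the command line, a helper along these lines prints the last uncommented value of a php.ini directive. php_ini_get is a hypothetical name, and this is a simple text scan rather than a full php.ini parser.

```shell
# Print the last uncommented value of a php.ini directive.
# php_ini_get is a hypothetical helper; it does a plain text scan of the file.
# Usage: php_ini_get /etc/php.ini short_open_tag
php_ini_get() {
    # $1 = php.ini path, $2 = directive name
    awk -F= -v key="$2" '
        /^[ \t]*;/ { next }                       # skip comment lines
        {
            name = $1
            gsub(/[ \t]/, "", name)
            if (name == key) {
                value = $2
                sub(/;.*/, "", value)             # drop a trailing comment
                gsub(/^[ \t]+|[ \t]+$/, "", value)
                found = value                     # keep the last match
            }
        }
        END { if (found != "") print found }
    ' "$1"
}
```

If the command prints nothing, the directive is not set in that file and PHP falls back to its built-in default.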

Configuration Problems

Apply Configuration Fails: General Troubleshooting

If you receive an error while attempting to Apply Configuration stating that the configuration verification has failed, there is a syntax error or configuration conflict in the configuration that's been defined. You can isolate this issue by accessing the Core Config Manager->Configuration Snapshots page. You should see the most recent snapshot highlighted in red. View the text file from the snapshot to see which config file contained the error. You can then find that file in the associated tar.gz file and search for the problem based on the error message. The snapshot represents the information that is CURRENTLY in the CCM database, which Nagios attempted to save. You'll need to correct the issue through the Core Config Manager, then attempt to Apply Configuration again.

The Write Config Tool in the CCM is a manual tool for writing the DB information to the configuration files (it manually Applies Configuration). It's important to know that Nagios cannot start or restart with a bad configuration. The config verification must pass in order for Nagios to be able to restart successfully with the new configuration.
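When the verification output is long, it can help to pull out just the error lines. The sketch below assumes you have saved the verification text (for example, the text file from a snapshot) to a file of your choosing. show_config_errors is a hypothetical helper, and it relies on Nagios Core prefixing problems with "Error".

```shell
# List error lines, with line numbers, from a saved verification output file.
# show_config_errors is a hypothetical helper; it assumes problems are
# reported on lines that begin with "Error".
# Usage: show_config_errors /path/to/saved-verification-output.txt
show_config_errors() {
    grep -n '^Error' "$1"
}
```

Each error line normally names the config file and line involved, which tells you which object to correct in the Core Config Manager.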


Configuration Applies, but still get "Configuration File Is Out Of Date" Error

If your configuration is applying successfully and the changes are visible in the XI interface, but you're still seeing an error message in the CCM that says "Configuration File Is Out Of Date", then you may have to specify your timezone in your /etc/php.ini file. Newer releases of PHP require this setting for your server to reflect the correct system time and timezone. To change this setting, edit the /etc/php.ini file with the following line:

 date.timezone = Etc/GMT-13

Change the timezone to match your location. Valid zones are listed on the PHP Timezones page. After changing the setting, restart your Apache server:

 service httpd restart
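The php.ini edit can also be scripted. This is a minimal sketch, not an official tool: set_php_timezone is a hypothetical helper that replaces an existing date.timezone line (commented out or not) or appends one if none exists. Try it on a copy of php.ini first.

```shell
# Set date.timezone in a php.ini file, replacing an existing (possibly
# commented-out) entry or appending one if none exists.
# set_php_timezone is a hypothetical helper; test it on a copy of php.ini first.
# Usage: set_php_timezone /etc/php.ini America/Chicago
set_php_timezone() {
    ini_file=$1
    zone=$2
    tmp_file=$(mktemp) || return 1
    if grep -Eq '^[;[:space:]]*date\.timezone[[:space:]]*=' "$ini_file"; then
        sed -E "s|^[;[:space:]]*date\.timezone[[:space:]]*=.*|date.timezone = $zone|" \
            "$ini_file" > "$tmp_file"
    else
        cat "$ini_file" > "$tmp_file"
        printf 'date.timezone = %s\n' "$zone" >> "$tmp_file"
    fi
    # Overwrite in place so the original owner and mode are kept.
    cat "$tmp_file" > "$ini_file"
    rm -f "$tmp_file"
}
```

Apache still needs to be restarted afterwards for the new value to take effect.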
Apply Configuration Page Stalls Out, Never Completes

If you attempt to Apply Configuration and you're seeing the following output:

 * Configuration submitted for processing...
 * Waiting for configuration verification.................. 

and the configuration never applies, the page may be timing out. If you've recently updated XI, try restarting the server first. If that does not resolve the issue, try editing your PHP settings. Open the /etc/php.ini file in a text editor and increase the following values.

 ;;;;;;;;;;;;;;;;;;;
 ; Resource Limits ;
 ;;;;;;;;;;;;;;;;;;;
 max_execution_time = 30     ; Maximum execution time of each script, in  seconds
 max_input_time = 60     ; Maximum amount of time each script may spend parsing request data
 memory_limit = 128M      ; Maximum amount of memory a script may consume 


After this, run:

 service httpd restart
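To confirm that a time limit was really raised, you can compare the value in php.ini against the minimum you chose. In this sketch, php_limit_at_least is a hypothetical helper. It handles plain integer directives such as max_execution_time and max_input_time, not suffixed sizes such as 128M, and any threshold you pass is your own choice rather than a documented requirement.

```shell
# Succeed if a numeric php.ini directive is set to at least a minimum value.
# php_limit_at_least is a hypothetical helper; it handles plain integers such
# as max_execution_time and max_input_time, not suffixed sizes like 128M.
# Usage: php_limit_at_least /etc/php.ini max_execution_time 60
php_limit_at_least() {
    # $1 = php.ini path, $2 = directive name, $3 = minimum value
    current=$(sed -n -E "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*(-?[0-9]+).*/\1/p" "$1" | tail -n 1)
    [ -n "$current" ] && [ "$current" -ge "$3" ]
}
```

The helper fails when the directive is missing or commented out, so a non-zero exit status does not always mean the value is too low.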


Configuration Applies, No Changes Take Place

This is generally due to permissions issues with the configuration files. Use the Write Config Tool in the Core Config Manager to see if you can manually write the DB information to the config files. If the Write Config Tool returns error messages related to permissions, you can run the following script to correct the permission settings:

 /usr/local/nagiosxi/scripts/reset_config_perms

There is a known bug in XI 1.3E and F where this script was not automatically running when configurations were applied. If you're running a Nagios XI version earlier than 1.3g, we recommend updating to correct this issue.

Modifying The Contents Of /usr/local/nagios/etc

  • You can keep custom configuration files in the /usr/local/nagios/etc/static directory
  • Don't modify config files directly in /usr/local/nagios/etc, as they will be overwritten by the Core Config Manager

Unable To Delete Hosts

Hosts can only be deleted after all of their dependent services and associated relationships have been deleted. Make sure to delete any associated services or other objects before deleting the host.

Host Still Visible After Deletion: (Ghost Hosts)

If you have successfully deleted a host and all of its services from the Core Config Manager, but you're still seeing it in the status tables, then you most likely have multiple instances of Nagios running on your machine. To make sure all instances are stopped, run the following on the command line.

 killall nagios
 service nagios start
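To check for extra instances before or after restarting, you can count daemon entries in the process list. In this sketch, count_nagios_daemons is a hypothetical helper, and it assumes the daemon was started with the usual "nagios -d <path>/nagios.cfg" command line.

```shell
# Count Nagios daemon processes in "ps" output read from standard input.
# count_nagios_daemons is a hypothetical helper; it assumes the daemon was
# started with the usual "nagios -d <path>/nagios.cfg" command line.
# Usage: ps -eo args | count_nagios_daemons
count_nagios_daemons() {
    awk '/nagios -d .*nagios\.cfg/ { n++ } END { print n + 0 }'
}
```

Short-lived child processes can briefly share the daemon's command line, so a count that stays above one across several runs is the more meaningful sign.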


Host Still Visible In XI After Deletion From the CCM

Go to the Core Config Manager->Write Config Tool, and use that tool to manually write out the configuration data to file. Verify your configuration. If it verifies, go ahead and restart Nagios.

If the host and all of its services are completely deleted in the Core Config Manager, but the host's config file is still there after using the Write Config Tool, go ahead and delete the config file. The files will be located in the following directories.

/usr/local/nagios/etc/hosts
/usr/local/nagios/etc/services

On rare occasions the CCM will lose track of a file. We haven't nailed down what causes it, but it is usually related to deleting the host.
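To find leftover files for a deleted host, you can search both directories for a host_name line. This is a sketch under assumptions: find_host_configs is a hypothetical helper, it only matches lines where the host is the sole host_name value, and the name is treated as a regular expression (so a dot matches any character).

```shell
# List config files under hosts/ and services/ that still reference a host.
# find_host_configs is a hypothetical helper; it matches "host_name <name>"
# lines and treats the name as a regular expression.
# Usage: find_host_configs /usr/local/nagios/etc myhost
find_host_configs() {
    # $1 = Nagios etc directory, $2 = host name
    grep -rlE "^[[:space:]]*host_name[[:space:]]+$2[[:space:]]*$" \
        "$1/hosts" "$1/services" 2>/dev/null
}
```

Review each file it lists before deleting anything, then verify your configuration and restart Nagios.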

Network status map parent/child relationship not updating (v1.3)

Underneath the Parents box in the CCM, make sure the "standard" radio button is selected. If "null" is selected your parent host selection doesn't get written to disk. We're working on a method of fixing the CCM so this doesn't happen with several fields.


Core Config Manager Problems

GUI Issues

Most of these are related to IE's implementation of JavaScript. If possible, use a browser that more closely implements the ECMAScript Language Specification.

Configuration Changes

If you make changes to your configuration and they are not reflected in XI, it may be due to file permissions. Here are two options to try:

  • Reset File Permissions

Execute the following command to reset your configuration file permissions.

 /usr/local/nagiosxi/scripts/reset_config_perms

You can also check for permissions-related issues on the Admin->Check File Permissions page in the XI interface (v1.3g+).

Making A Mass Change In The CCM

Changing The Field Entry For A Large Amount Of Objects

Occasionally admins need to change a specific setting for a large number of services or hosts, and the change can't be made from a template. Although we highly recommend templating whenever possible, sometimes it's not possible to make the change there. Our unofficial solution is to write a SQL query that manually updates the DB fields you need to change. NOTE: Test your queries on a single test host/service first, and try this solution at your own risk; we are not responsible if you break something with this! Here's an example, posted by a user, of a change made to the check_interval for all 'Disk Monitor' services.

 mysql> use nagiosql;
 mysql> update tbl_service set check_interval=60 where service_description='Disk Monitor';
 mysql> select config_name, service_description, check_interval from tbl_service where service_description='Disk Monitor';

If the change you wanted was successful, Apply Configuration to write the changes to the config files.
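If you make this kind of change often, generating the statements from a script reduces typing mistakes. The sketch below only prints SQL for review and does not touch the database. build_interval_sql is a hypothetical helper, and the quote handling is a basic precaution rather than complete SQL escaping.

```shell
# Print a preview query and the matching update for one service description.
# build_interval_sql is a hypothetical helper; it only prints SQL text.
build_interval_sql() {
    # $1 = service description, $2 = new check_interval (whole number)
    case "$2" in
        ''|*[!0-9]*) echo "check_interval must be a whole number" >&2; return 1 ;;
    esac
    # Double any single quotes so the description stays one SQL string literal.
    desc=$(printf '%s' "$1" | sed "s/'/''/g")
    printf "select config_name, service_description, check_interval from tbl_service where service_description='%s';\n" "$desc"
    printf "update tbl_service set check_interval=%s where service_description='%s';\n" "$2" "$desc"
}

build_interval_sql "Disk Monitor" 60
```

Run the select first to confirm which rows will be affected, then run the update against the nagiosql database once you're satisfied.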

Using Scripts To Make Changes in the CCM

Some admins make use of internal scripts to update and maintain their monitoring environment. Although we're only able to offer limited support on a situation like this, a useful script to know about is:

 /usr/local/nagiosxi/scripts/reconfigure_nagios.sh  

This is the command-line version of "Apply Configuration" in the XI interface. It will write the CCM DB info to the config files and restart Nagios.

Performance Graph Problems

Performance Graphs Are Missing Or Not Displayed

This can happen for a variety of reasons, but there are several simple solutions that resolve this issue for most people:

  • Reset Security Credentials
    Select the Reset Security Credentials option in the Admin menu and click Update
  • Reset File Permissions

Execute the following command to reset your configuration file permissions.

 /usr/local/nagiosxi/scripts/reset_config_perms

You can also check for permissions-related issues on the Admin->Check File Permissions page in the XI interface (v1.3g+).

  • Make sure you have not removed or renamed the nagiosadmin user. This user is the nagios equivalent to 'root user' and should never be removed.
  • Some users reported that editing the following lines in their /usr/local/nagios/etc/nagios.cfg file fixed their graphing issues:
 service_perfdata_file_processing_command=process-service-perfdata-file-bulk
 host_perfdata_file_processing_command=process-host-perfdata-file-bulk

Change To

 service_perfdata_file_processing_command=process-service-perfdata-file-pnp-bulk
 host_perfdata_file_processing_command=process-host-perfdata-file-pnp-bulk
  • Make sure your password for Nagios XI only contains alpha-numeric characters. Some users have reported graphs disappearing from using special characters, creating a permissions issue.
  • Performance graphs are pulled via an internal proxy, so users with their Nagios server behind their own proxy or using strict SSL settings may experience problems viewing graphs. If you're using a proxy or SSL and having issues viewing graphs, post the problem to our support forums and mention your use of a proxy or SSL right away.
  • Further reading
    Forum Article.
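For the nagios.cfg item in the list above, you can check which processing commands are currently set before editing anything. This is a read-only sketch; both function names are hypothetical.

```shell
# Show which perfdata file processing commands a nagios.cfg file uses.
# Both helpers are hypothetical and only read the file.
# Usage: show_perfdata_commands /usr/local/nagios/etc/nagios.cfg
show_perfdata_commands() {
    grep -E '^(service|host)_perfdata_file_processing_command=' "$1"
}

# Succeed only if both directives point at the "-pnp-bulk" commands.
perfdata_uses_pnp() {
    count=$(grep -Ec '^(service|host)_perfdata_file_processing_command=process-(service|host)-perfdata-file-pnp-bulk[[:space:]]*$' "$1")
    [ "$count" -eq 2 ]
}
```

If perfdata_uses_pnp fails, compare the output of show_perfdata_commands with the "Change To" lines above.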

Notification Problems

Nagios Admin Account Notifications Not Controlled Through XI

  • The nagiosadmin user was set to use the generic_template contact template, which resulted in notifications not being controlled through the XI interface.
    This can be corrected by changing the user's contact template to xi_generic_template in the Core Config Manager. This bug was corrected in 2009R1.2 and only affects systems that had/have previous versions installed.

Email Notifications Are Not Going Out

This can happen for a variety of reasons:

  • The nagiosadmin user is set to use the generic_template contact template.
    This should be xi_generic_template, and can be modified by using the Core Config Manager. This bug was corrected in 2009R1.2 and only affects systems that had/have previous versions installed.
  • Outbound SMTP connections may be blocked by your border firewall
  • Unauthenticated SMTP relaying may be denied somewhere downstream - try switching email methods from Sendmail to SMTP in the admin section


XI Display Problems

Tables Displaying A Count, But No Results

A recent issue has been identified where characters outside of the ASCII table are being generated by some of the check plugins, which causes an issue with XI's XML generation. The result is a table with a returned count of services, but no actual table data. This issue can be verified by checking the following url:

 http://<serveraddress>/nagiosxi/backend/?cmd=getservicestatus

If this XML page returns an error, it should identify the line number of the issue, which can be found in the page source. Below is a code patch that will be included in the next update of XI. Paste this code as a replacement for the xmlentities() function on line 30 of the /usr/local/nagiosxi/html/includes/utilsx.inc.php file.

 function xmlentities($string){
        // Escape the five XML special characters as entities.
        $data=str_replace ( array ( '&', '"', "'", '<', '>' ),
         array ( '&amp;', '&quot;', '&apos;', '&lt;', '&gt;' ), $string );
        // Keep only runs of ASCII and valid 2- or 3-byte UTF-8 sequences.
        preg_match_all('/([\x09\x0a\x0d\x20-\x7e]'. // ASCII characters
        '|[\xc2-\xdf][\x80-\xbf]'. // 2-byte (except overly longs)
        '|\xe0[\xa0-\xbf][\x80-\xbf]'. // 3 byte (except overly longs)
        '|[\xe1-\xec\xee\xef][\x80-\xbf]{2}'. // 3 byte (except overly longs)
        '|\xed[\x80-\x9f][\x80-\xbf])+/', // 3 byte (except UTF-16 surrogates)
        $data, $clean_pieces );
        // Rejoin the valid runs, putting '?' where invalid bytes were dropped.
        $clean_output = join('?', $clean_pieces[0] );
        return $clean_output;
 }
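To find out which plugin is producing the problem characters, you can pipe its output through a check for bytes outside printable ASCII. This is a rough sketch; count_non_ascii_lines is a hypothetical helper, and it treats every non-ASCII byte as suspect, including valid UTF-8.

```shell
# Count lines that contain bytes outside printable ASCII and whitespace.
# count_non_ascii_lines is a hypothetical helper; it reads standard input.
# Usage: <plugin command> | count_non_ascii_lines
count_non_ascii_lines() {
    # The C locale makes grep examine single bytes instead of multibyte characters.
    LC_ALL=C grep -c '[^[:print:][:space:]]' || true
}
```

A result above zero for a plugin's output marks it as a likely source of the broken XML.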

Other Issues

Login Screen Keeps Redirecting To Itself

The web browser keeps redirecting to the login screen even after entering login credentials. This has been noticed in Internet Explorer.

Nagios XI uses cookies to save session state. These cookies are set to expire after 30 minutes. If the time on the Nagios XI server is incorrect, the cookies returned to the client's browser might appear to be expired due to the time difference between the client's computer and the Nagios XI server. Solution: Fix the time on the Nagios XI server to ensure it is correct.


Check Services Being Orphaned

Some users have encountered large numbers of warning messages that accumulate quickly that read as follows:

Warning: The check of service <Your Service> on host <Your Host> looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service..

This is most likely caused by multiple instances of Nagios running. To fix this, kill all instances of Nagios and then restart the process.

 killall -9 nagios

Then restart Nagios from the Admin menu of the web interface.

Related forum post can be read here.


XI Component/Addon Problems

Website Wizard Content Check Failure

Some users have reported website content checks being blocked by the "dotDefender" application. See the following forum thread for the solution. Website Wizard Content Check Failure


"Event Data Is Stale"

There is a known bug relating to event data in versions 2009R1.4B through 2011R1.1. The bug has been patched, and the fix will be available in releases later than the versions listed above. If you're experiencing this error, and/or the nagios service is taking an excessively long time to start, you may have a corrupted MySQL table that needs repair. We suggest taking the following steps.

Stop the following services

 service nagios stop
 service ndo2db stop
 service mysqld stop

Run our repair script for MySQL tables.

 /usr/local/nagiosxi/scripts/repairmysql.sh

Unzip and copy the following dbmaint file to /usr/local/nagiosxi/cron/. This will overwrite the previous version.

Run the following commands:

 service mysqld start
 rm -f /usr/local/nagiosxi/var/dbmaint.lock
 /usr/local/nagiosxi/cron/dbmaint.php

This script cleans and optimizes database tables. If you see any error output from this script (for example, if tables are still showing as crashed), contact our support team at our support forums. Otherwise, you should be able to restart services.

 service ndo2db start
 service nagios start

If problems persist, contact our support team.

