NRPE - General Troubleshooting Tips


Problem Description

This KB article provides a troubleshooting methodology for NRPE problems.

 

Assumed Knowledge

The following KB article contains an explanation of how NRPE works and may need to be referenced to completely understand the problem and solution that is provided here:

NRPE - Agent and Plugin Explained

 

Troubleshooting

When troubleshooting NRPE issues, work through the problem in a general order: start with the plugin itself, then move to NRPE, and finally check your argument usage. Following the steps below before contacting support may resolve your issue faster, as these are always the first steps a Nagios XI support representative will ask you to perform.

 

Test The Plugin Locally First

Log onto your remote server as root and run the plugin (replace <name of plugin> with your plugin):

/usr/local/nagios/libexec/<name of plugin>

 

If it does not work as expected, check the plugin's usage, which may offer hints as to why it is not working:

/usr/local/nagios/libexec/<name of plugin> -h

 

Many plugins require thresholds, usually warning (-w) and critical (-c), before they will work correctly. Once the plugin has been tested and is working locally on the remote host, create a command directive for it in the nrpe.cfg file or adjust your existing command. Take note of how you set up your arguments.
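When testing a plugin by hand, the exit code matters as much as the printed text, because Nagios derives the service state from it (0 is OK, 1 is WARNING, 2 is CRITICAL, 3 is UNKNOWN). The following is a minimal sketch of a wrapper that runs a plugin and decodes its exit code. The function names state_name and run_plugin are hypothetical helpers, not part of Nagios or NRPE.

```shell
# Map a plugin exit code to its Nagios state name
# (standard plugin convention: 0, 1, 2, 3).
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "INVALID ($1)" ;;
    esac
}

# Run a plugin with its arguments, let its output through, then print
# the decoded exit code. Hypothetical helper, shown for illustration only.
run_plugin() {
    rc=0
    "$@" || rc=$?
    echo "exit code $rc => $(state_name "$rc")"
    return "$rc"
}
```

For example: run_plugin /usr/local/nagios/libexec/<name of plugin> -w <warning> -c <critical>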

 

 

Verify That NRPE Is Working Locally And Open To Requests From The XI Server

On the remote host, run:

service xinetd status

 

Or for systems not using xinetd:

service nrpe status
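Whichever way NRPE is started, something should be listening on its TCP port (5666 by default). As a rough sketch, the hypothetical helper below reads the output of ss -tln or netstat -tln on standard input and succeeds if a listening socket uses the given port. It assumes the local address is in the fourth column, which is the case for both tools on typical Linux systems.

```shell
# Succeed if the listening-socket listing on stdin contains the given port.
# Input is expected to come from `ss -tln` or `netstat -tln`, where the
# local address:port is the fourth column. Defaults to NRPE's port 5666.
# port_is_listening is a hypothetical helper name.
port_is_listening() {
    port="${1:-5666}"
    awk -v p="$port" '$4 ~ (":" p "$") { found = 1 } END { exit !found }'
}
```

For example: ss -tln | port_is_listening 5666 && echo "NRPE port is listening"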

 

If NRPE is not running, follow the steps in this KB article:

NRPE - CHECK_NRPE: Error - Could Not Complete SSL Handshake

 

If NRPE is running, move on to testing the connection to the remote host from the XI server with check_nrpe. Log onto the Nagios XI server as root and run the following command, inserting the actual remote host IP address:

/usr/local/nagios/libexec/check_nrpe -H <remote host ip>

 

The command above should return the NRPE version of the remote host. If not, follow the steps in this KB article:

CHECK_NRPE: Error - Could not connect to xxx.xxx.xxx.xxx: Connection reset by peer

 

The command above uses the IP address of the remote host. If your Nagios XI host object uses a DNS record instead, use that DNS name in the command above, because it is possible that the DNS record is incorrect.

A similar mistake occurs when you use Core Configuration Manager to copy an existing host to provision a new host. The newly copied host still has the address of the host it was copied from (you forgot to change it). Your command line tests work OK because you are forced to type the address at the command line. A good troubleshooting technique is to make sure that the arguments you type when testing at the command line match the values in the Nagios XI Core Configuration Manager objects.
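One way to check for an incorrect DNS record is to compare what the name resolves to on the XI server with the IP address you expect. The sketch below assumes a system where the getent command is available (typical on Linux). The helper names resolve_first and address_matches are hypothetical.

```shell
# Print the first address the given name resolves to on this system.
resolve_first() {
    getent hosts "$1" | awk '{ print $1; exit }'
}

# Compare an expected IP address with what a DNS name resolves to.
# Returns 0 on a match, 1 otherwise. Hypothetical helper for illustration.
address_matches() {
    expected="$1"
    name="$2"
    actual=$(resolve_first "$name")
    if [ "$actual" = "$expected" ]; then
        echo "match: $name resolves to $actual"
    else
        echo "MISMATCH: $name resolves to '$actual', expected $expected"
        return 1
    fi
}
```

For example: address_matches <remote host ip> <dns name from the host object>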

 

 

Try The Full Command From The Command Line Interface On The XI Server

From the Nagios XI server command line interface, run the following command:

/usr/local/nagios/libexec/check_nrpe -H <remote host ip> -c <command and arguments>

 

You will need to replace the remote host IP address and match your command and arguments to your command directives in your remote host nrpe.cfg.

If you do not get the expected output, check the plugin usage again to make sure your syntax is correct. As mentioned in the last step, a good troubleshooting technique is to make sure that the arguments you type when testing at the command line match the values in the Nagios XI objects. Please refer to the following KB article for more information on this:

NRPE - Command ’[Your Plugin]’ Not Defined
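To confirm that the command name you pass with -c really exists on the remote host, you can look for its command directive in nrpe.cfg. The sketch below is one possible way to do that. The helper name find_nrpe_command is hypothetical, the default path /usr/local/nagios/etc/nrpe.cfg is an assumption about your install, and files pulled in through include directives are not searched.

```shell
# Print the active (uncommented) command[<name>]= directive from an
# nrpe.cfg-style file; return non-zero if none is found.
# Arguments: <command name> [path to nrpe.cfg]
# The default path is an assumption; adjust it to your installation.
find_nrpe_command() {
    name="$1"
    cfg="${2:-/usr/local/nagios/etc/nrpe.cfg}"
    awk -v key="command[$name]=" '
        { line = $0; sub(/^[ \t]+/, "", line) }      # ignore leading whitespace
        index(line, key) == 1 { print line; found = 1 }  # commented lines start with "#"
        END { exit !found }
    ' "$cfg"
}
```

For example, find_nrpe_command check_users prints the directive if it is defined. If the directive contains macros such as $ARG1$, arguments are passed with the -a option of check_nrpe, and the agent only accepts them when dont_blame_nrpe=1 is set in nrpe.cfg.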

 

 

Set Up The Service Check In XI

Create a new service for the check by navigating within the Nagios XI web interface to Configure > Core Config Manager > Monitoring > Services > Add New.

Specify the Config Name and Description for the check.

Use check_nrpe in the Check command drop-down.

Next, set up the command arguments under Command view.

$ARG1$ is the remote command to be sent to the remote host through NRPE.  This must match the command directive in the nrpe.cfg.

$ARG2$ is used for extra command arguments. Again, if you have defined any in the remote host's nrpe.cfg this must match them.

The check needs to be applied to a host, so click the Manage Hosts button. Select a host from the list and click Add Selected. You should see the host appear in the right-hand pane under Assigned. Now click Close.

Click the Check Settings tab. At minimum, we need to set up check intervals, attempts, and a period.

Last, click the Alert Settings tab and set the Notification period to "xi_timeperiod_24x7", or to the time period of your choice. This specifies the time period for notifications (emails, SMS, etc.). Click Manage Contacts and add a contact to the check if you want.

Finally, click Save and Apply Configuration.

Alternatively, you can run the NRPE or Linux Server configuration wizards, which do all this for you.

Now when you navigate to Service Detail you will see your service check listed. It may take a minute for the service to change from pending to a STATE. From this page you can verify that your plugin is executing as expected.
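If the service in XI behaves differently from your command line test, it can help to expand $ARG1$ and $ARG2$ by hand and compare the result with what you typed earlier. The sketch below assumes that the check_nrpe command in your XI installation is defined as $USER1$/check_nrpe -H $HOSTADDRESS$ -t 30 -c $ARG1$ $ARG2$ and that $USER1$ is /usr/local/nagios/libexec. Verify both in Core Config Manager before relying on it. The helper name expand_check_nrpe is hypothetical.

```shell
# Print the command line that results from substituting the host address,
# $ARG1$ and $ARG2$ into the assumed check_nrpe command definition.
# Arguments: <host address> <ARG1> [ARG2]
# The command template and the libexec path are assumptions.
expand_check_nrpe() {
    host_address="$1"
    arg1="$2"
    arg2="$3"
    # ${arg2:+ $arg2} appends ARG2 with a leading space only when it is set.
    printf '%s\n' "/usr/local/nagios/libexec/check_nrpe -H $host_address -t 30 -c $arg1${arg2:+ $arg2}"
}
```

For example, expand_check_nrpe <remote host ip> check_users '-a 5 10' prints the full command, which you can then run on the XI server to reproduce what the service check does.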

 

 

Final Thoughts

For any support-related questions, please visit the Nagios Support Forums at:

http://support.nagios.com/forum/



Article ID: 629
Created On: Mon, Jul 17, 2017 at 4:00 AM
Last Updated On: Mon, Jul 17, 2017 at 4:00 AM
Authored by: tlea

Online URL: https://support.nagios.com/kb/article/nrpe-general-troubleshooting-tips-629.html