NRPE intermittent connection but able to provide output

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
s.wiki
Posts: 82
Joined: Sat Mar 04, 2017 11:02 am

NRPE intermittent connection but able to provide output

Post by s.wiki »

Hi,

I am seeing intermittent connection problems on the Nagios XI server for one of our Linux servers. The issue started after that system was updated.

I tried installing the latest NRPE 3.2.1 from source, but the same issue persists.

From localhost, the checks return output.

Code: Select all

[root@server network-scripts]# /usr/local/nagios/libexec/check_nrpe -H localhost
NRPE v3.2.1
[root@server network-scripts]# /usr/local/nagios/libexec/check_cpu_stats2.sh
CPU STATISTICS OK : user=0.51%, system=0.00%, iowait=0.00%, nice=0.00%, steal=0.00%, cpu_usage=0.51% |TotalCpuUsage=0.51%;80;90;0; CpuUser=0.51%;0;0;0; CpuSystem=0.00%;0;0;0; CpuIowait=0.00%;0;0;0; CpuNice=0.00%;0;0;0; CpuSteal=0.00%;0;0;0;
When I run the plugins configured under Core Config Manager > Services from the command line, as shown below, they also return output.

Code: Select all

[nagios@imsvm ~]$ /usr/local/nagios/libexec/check_nrpe -H 10.90.10.115 -t 30 -c check_disk -a 15% 10% /
DISK OK - free space: / 41 GB (90% inode=99%);| /=4GB;41;44;0;49

[nagios@imsvm ~]$ /usr/local/nagios/libexec/check_nrpe -H 10.90.10.115 -t 30 -c check_cpu2 -a 85 90
CPU STATISTICS OK : user=0.50%, system=0.50%, iowait=0.00%, nice=0.00%, steal=0.00%, cpu_usage=1.00% |TotalCpuUsage=1.00%;85;90;0; CpuUser=0.50%;0;0;0; CpuSystem=0.50%;0;0;0; CpuIowait=0.00%;0;0;0; CpuNice=0.00%;0;0;0; CpuSteal=0.00%;0;0;0;
But Nagios XI itself is getting invalid data, including for check_icmp, as shown in the attached picture.
Capture.PNG
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: NRPE intermittent connection but able to provide output

Post by benjaminsmith »

Hello @s.wiki,

Given that the ping check is returning 100% packet loss, it certainly looks like a network connection issue.

You might try increasing the timeout to -t 60, as it might be taking longer than 30 seconds to return the results.

Make sure the Nagios XI global timeout is set to 60 seconds for both host and service check commands. You'll also want to change the connection_timeout on the NRPE client to 60 as well.

We have instructions for this procedure available here:

NRPE - CHECK_NRPE: Socket Timeout After n Seconds

After making configuration changes, be sure to restart both nrpe and nagios. Let me know if you're able to get results again from Nagios XI.
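Before changing the timeouts, it can help to confirm what the NRPE client is currently using. The following is a minimal sketch, not an official Nagios tool: the function name `nrpe_setting` is hypothetical, and the default path assumes a source install under /usr/local/nagios. It prints the value of a directive such as command_timeout or connection_timeout, ignoring commented-out lines.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a key=value directive
# from an NRPE config file. Lines starting with '#' do not match.
# If the key appears more than once, the last occurrence is printed.
nrpe_setting() {
    key="$1"
    # Assumed default path (source install); pass another path as $2.
    cfg="${2:-/usr/local/nagios/etc/nrpe.cfg}"
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$cfg" | tail -n 1
}

# Example usage on the remote host:
#   nrpe_setting command_timeout
#   nrpe_setting connection_timeout
```

If either value is lower than the -t value passed to check_nrpe, the client side may give up before the server side does.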
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Be sure to check out our Knowledgebase for helpful articles and solutions!
s.wiki
Posts: 82
Joined: Sat Mar 04, 2017 11:02 am

Re: NRPE intermittent connection but able to provide output

Post by s.wiki »

Hi @benjaminsmith,

I increased the timeout to 30 on both Nagios and the agent. We cannot simply increase it to -t 60, because I would have to reconfigure all agents together (correct me if I am wrong).

I am also able to ping the host from Host > Ping, and it works fine.
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: NRPE intermittent connection but able to provide output

Post by benjaminsmith »

Hello @s.wiki,

On the remote host running NRPE, run the following command and post the output:

Code: Select all

cat /usr/local/nagios/etc/nrpe.cfg | grep command_timeout=
cat /usr/local/nagios/etc/nrpe.cfg | grep connection_timeout=
Could you also PM your system profile for us to review? Thanks.

To send us your system profile:

Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" Menu
Click the "Download Profile" button
Save the profile.zip file and share in a private message or upload it to the post/ticket.
s.wiki
Posts: 82
Joined: Sat Mar 04, 2017 11:02 am

Re: NRPE intermittent connection but able to provide output

Post by s.wiki »

Hi @benjaminsmith,

Below is the output from the remote host. I have also sent you the profile.zip by private message.

Code: Select all

[root@server ~]# cat /usr/local/nagios/etc/nrpe.cfg | grep command_timeout=
command_timeout=300
[root@server ~]# cat /usr/local/nagios/etc/nrpe.cfg | grep connection_timeout=
connection_timeout=300
[root@server ~]#
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: NRPE intermittent connection but able to provide output

Post by benjaminsmith »

Hello @s.wiki,

Thanks for sending over the system profile. I noticed you have a few other hosts using check_nrpe. Are those hosts timing out as well, or is it just this host (IP address ending in 115)?

If you're only experiencing the issue on that host and it's intermittent yet working from the command line, please test one of those services by adding the -t 60 argument to the check command.
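To see how long a check really takes before editing the check command in Nagios XI, one option is to time it by hand from the Nagios XI server as the nagios user. This is only a sketch: the wrapper name `timed_check` is hypothetical, and the host and command in the usage comment are the ones posted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical wrapper: run any command, let its output through,
# then report its exit status and the elapsed wall-clock seconds.
timed_check() {
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    echo "exit=$status elapsed=$((end - start))s"
    return "$status"
}

# Example usage (values taken from the earlier post):
#   timed_check /usr/local/nagios/libexec/check_nrpe -H 10.90.10.115 -t 60 -c check_cpu2 -a 85 90
```

Nagios plugins exit with 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN, so an elapsed time close to the -t value together with a non-zero exit status would point to a timeout rather than a bad plugin result.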

Also, which version of NRPE are you running on the remote host?

Thanks.