Some NRPE checks are in UNKNOWN state after fresh install

dlukinski
Posts: 1130
Joined: Tue Oct 06, 2015 9:42 am

Re: Some NRPE checks are in UNKNOWN state after fresh instal

Post by dlukinski »

rkennedy wrote:
[root@fikc-nagxidev01 ~]# /usr/local/nagios/libexec/check_nrpe -H 10.x.x.102 -c check_cpu_stats -a '-w 85 -c 95'
NRPE: Unable to read output
[root@fikc-nagxidev01 ~]# /usr/local/nagios/libexec/check_nrpe -H 10.x.x.95 -c check_cpu_stats -a '-w 85 -c 95'
NRPE: Unable to read output
[root@fikc-nagxidev01 ~]# /usr/local/nagios/libexec/check_nrpe -H 10.x.x.121 -c check_cpu_stats -a '-w 85 -c 95'
NRPE: Unable to read output
[root@fikc-nagxidev01 ~]# /usr/local/nagios/libexec/check_nrpe -H 10.x.x.102 -c check_mem -a '-w 20 -c 10'
NRPE: Unable to read output
[root@fikc-nagxidev01 ~]# /usr/local/nagios/libexec/check_nrpe -H 10.x.x.95 -c check_mem -a '-w 20 -c 10'
NRPE: Unable to read output
[root@fikc-nagxidev01 ~]# /usr/local/nagios/libexec/check_nrpe -H 10.x.x.121 -c check_mem -a '-w 20 -c 10'
NRPE: Unable to read output
[root@fikc-nagxidev01 ~]# /usr/local/nagios/libexec/check_nrpe -H 10.x.x.102 -c check_open_files -a '-w 30 -c 50'
NRPE: Unable to read output
[root@fikc-nagxidev01 ~]# /usr/local/nagios/libexec/check_nrpe -H 10.x.x.95 -c check_open_files -a '-w 30 -c 50'
NRPE: Unable to read output
[root@fikc-nagxidev01 ~]# /usr/local/nagios/libexec/check_nrpe -H 10.x.x.121 -c check_open_files -a '-w 30 -c 50'
NRPE: Unable to read output
[root@fikc-nagxidev01 ~]#
You're using a different host in every single one, can we please stick to troubleshooting one machine at a time to avoid any confusion?
On the RHEL and SUSE machine, can you run ls -l /usr/local/nagios/libexec/? I suspect the permissions aren't right, which is preventing NRPE from running your script.
I didn't see the permissions verified, can you please run the command from above?

What is the result now of you running /usr/local/nagios/libexec/check_cpu_stats.sh -w 85 -c 95 with the fixed bash script?

Also - I don't see an entry for check_cpu_stats in your NRPE configuration. You'll need one similar to this -

command[check_cpu_stats]=/usr/local/nagios/libexec/check_cpu_stats.sh -w 85 -c 95
Then restart xinetd to update your configuration - service xinetd restart

Now, from the XI machine try running a check against the machine you just modified (tcmigra1). /usr/local/nagios/libexec/check_nrpe -H tcmigra1 -c check_cpu_stats. What is the result?
1. The reason there are 3 different hosts is to show that in all cases we get the same trouble with the NRPE agent and XI Wizard provided by Nagios.

2. Files listed:

[root@fihp-alfdev04 ~]# ls -l /usr/local/nagios/libexec/
total 6928	
-rwxr-xr-x 1 root   root   201149 Feb 13 14:42 check_apt	
-rwx------ 1 root   root     6897 Feb 13 14:42 check_asterisk.pl	
-rwx------ 1 root   root     1978 Feb 13 14:42 check_asterisk_sip_peers.sh	
-rwxr-xr-x 1 root   root     2242 Feb 13 14:42 check_breeze	
-rwxr-xr-x 1 root   root   197474 Feb 13 14:42 check_by_ssh	
lrwxrwxrwx 1 root   root        9 Feb 13 14:42 check_clamd -> check_tcp	
-rwxr-xr-x 1 root   root   151117 Feb 13 14:42 check_cluster	
-rwx------ 1 root   root     7312 Mar  8 17:43 check_cpu_stats.sh	
-rwx------ 1 root   root     5355 Feb 13 14:42 check_cpu_stats.sh.old	
-r-sr-xr-x 1 root   root   188614 Feb 13 14:42 check_dhcp	
-rwxr-xr-x 1 root   root   192412 Feb 13 14:42 check_dig	
-rwxr-xr-x 1 root   root   207820 Feb 13 14:42 check_disk	
-rwxr-xr-x 1 root   root     9289 Feb 13 14:42 check_disk_smb	
-rwxr-xr-x 1 root   root   207226 Feb 13 14:42 check_dns	
-rwxr-xr-x 1 root   root    93404 Feb 13 14:42 check_dummy	
-rwxr-xr-x 1 root   root     3349 Feb 13 14:42 check_file_age	
-rwxr-xr-x 1 root   root     6315 Feb 13 14:42 check_flexlm	
lrwxrwxrwx 1 root   root        9 Feb 13 14:42 check_ftp -> check_tcp	
-rwxr-xr-x 1 root   root   364831 Feb 13 14:42 check_http	
-r-sr-xr-x 1 root   root   193126 Feb 13 14:42 check_icmp	
-rwxr-xr-x 1 root   root   158955 Feb 13 14:42 check_ide_smart	
-rwxr-xr-x 1 root   root    15123 Feb 13 14:42 check_ifoperstatus	
-rwxr-xr-x 1 root   root    12600 Feb 13 14:42 check_ifstatus	
lrwxrwxrwx 1 root   root        9 Feb 13 14:42 check_imap -> check_tcp	
-rws------ 1 root   nagios    748 Feb 13 14:42 check_init_service	
-rwxr-xr-x 1 root   root     6887 Feb 13 14:42 check_ircd	
lrwxrwxrwx 1 root   root        9 Feb 13 14:42 check_jabber -> check_tcp	
-rwxr-xr-x 1 root   root   171310 Feb 13 14:42 check_ldap	
lrwxrwxrwx 1 root   root       10 Feb 13 14:42 check_ldaps -> check_ldap	
-rwxr-xr-x 1 root   root   184887 Feb 13 14:42 check_load	
-rwxr-xr-x 1 root   root     5989 Feb 13 14:42 check_log	
-rwxr-xr-x 1 root   root    21480 Feb 13 14:42 check_mailq	
-rwxr-xr-x 1 root   root   157405 Feb 13 14:42 check_mrtg	
-rwxr-xr-x 1 root   root   158058 Feb 13 14:42 check_mrtgtraf	
-rwxr-xr-x 1 root   root   175433 Feb 13 14:42 check_nagios	
-rwx------ 1 root   root    25602 Feb 13 14:42 check_netstat.pl	
lrwxrwxrwx 1 root   root        9 Feb 13 14:42 check_nntp -> check_tcp	
lrwxrwxrwx 1 root   root        9 Feb 13 14:42 check_nntps -> check_tcp	
-rwxrwxr-x 1 nagios nagios  69790 Feb 13 14:42 check_nrpe	
-rwxr-xr-x 1 root   root   188462 Feb 13 14:42 check_nt	
-rwxr-xr-x 1 root   root   193361 Feb 13 14:42 check_ntp	
-rwxr-xr-x 1 root   root   184970 Feb 13 14:42 check_ntp_peer	
-rwxr-xr-x 1 root   root   184083 Feb 13 14:42 check_ntp_time	
-rwxr-xr-x 1 root   root   211503 Feb 13 14:42 check_nwstat	
-rwx------ 1 root   root     3259 Feb 13 14:42 check_open_files.pl	
-rwxr-xr-x 1 root   root     8779 Feb 13 14:42 check_oracle	
-rwxr-xr-x 1 root   root   172313 Feb 13 14:42 check_overcr	
-rwxr-xr-x 1 root   root   213097 Feb 13 14:42 check_ping	
lrwxrwxrwx 1 root   root        9 Feb 13 14:42 check_pop -> check_tcp	
-rwxr-xr-x 1 root   root   200744 Feb 13 14:42 check_procs	
-rwxr-xr-x 1 root   root   170235 Feb 13 14:42 check_real	
-rwxr-xr-x 1 root   root     9581 Feb 13 14:42 check_rpc	
-rwxr-xr-x 1 root   root     1453 Feb 13 14:42 check_sensors	
-rwx------ 1 root   root     2174 Feb 13 14:42 check_services	
lrwxrwxrwx 1 root   root        9 Feb 13 14:42 check_simap -> check_tcp	
-rwx------ 1 root   root     7599 Feb 13 14:42 check_sip	
-rwxr-xr-x 1 root   root   254037 Feb 13 14:42 check_smtp	
lrwxrwxrwx 1 root   root        9 Feb 13 14:42 check_spop -> check_tcp	
-rwxr-xr-x 1 root   root   170183 Feb 13 14:42 check_ssh	
lrwxrwxrwx 1 root   root        9 Feb 13 14:42 check_ssmtp -> check_tcp	
-rwxr-xr-x 1 root   root   156705 Feb 13 14:42 check_swap	
-rwxr-xr-x 1 root   root   230311 Feb 13 14:42 check_tcp	
-rwxr-xr-x 1 root   root   173229 Feb 13 14:42 check_time	
lrwxrwxrwx 1 root   root        9 Feb 13 14:42 check_udp -> check_tcp	
-rwxr-xr-x 1 root   root   179469 Feb 13 14:42 check_ups	
-rwxr-xr-x 1 root   root   151595 Feb 13 14:42 check_uptime	
-rwxr-xr-x 1 root   root   150241 Feb 13 14:42 check_users	
-rwxr-xr-x 1 root   root     2936 Feb 13 14:42 check_wave	
-rwx------ 1 root   root      710 Feb 13 14:42 check_yum	
-rwx------ 1 root   root     3060 Feb 13 14:42 custom_check_mem	
-rwx------ 1 root   root      915 Feb 13 14:42 custom_check_procs	
-rwx------ 1 root   root     4176 Feb 13 14:42 nagisk.pl	
-rwxr-xr-x 1 root   root   142062 Feb 13 14:42 negate	
-rwx------ 1 root   root    63579 Feb 13 14:42 send_nsca	
-rwxr-xr-x 1 root   root   148408 Feb 13 14:42 urlize	
-rwxr-xr-x 1 root   root     1913 Feb 13 14:42 utils.pm	
-rwxr-xr-x 1 root   root     2791 Feb 13 14:42 utils.sh	
[root@fihp-alfdev04 ~]# uname -a	
Linux fihp-alfdev04 2.6.18-406.el5 #1 SMP Fri May 1 10:37:57 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux	
3. What is the result now of you running /usr/local/nagios/libexec/check_cpu_stats.sh -w 85 -c 95 with the fixed bash script?
- Not sure what you mean; the script works when run locally.

4. Also - I don't see an entry for check_cpu_stats in your NRPE configuration. You'll need one similar to this -

command[check_cpu_stats]=/usr/local/nagios/libexec/check_cpu_stats.sh -w 85 -c 95
- It is present in common.cfg as command[check_cpu_stats]=/usr/local/nagios/libexec/check_cpu_stats.sh $ARG1$

If this is the stock NRPE agent from Nagios assets, which works out of the box on CentOS, why would this line suddenly be missing?
check_mem is there as well, as command[check_mem]=/usr/local/nagios/libexec/custom_check_mem -n $ARG1$
Last edited by tmcdonald on Tue Mar 08, 2016 2:25 pm, edited 1 time in total.
Reason: Please use [code][/code] tags around long output
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Some NRPE checks are in UNKNOWN state after fresh instal

Post by lmiltchev »

[root@fikc-nagxidev01 ~]# /usr/local/nagios/libexec/check_nrpe -H 10.x.x.102 -c check_cpu_stats -a '-w 85 -c 95'
NRPE: Unable to read output
Let's step back for a minute, and try troubleshooting one of the boxes. Run the following commands and show the output:

On the 10.x.x.102 (remote box)

ip addr | grep global | grep -m 1 'inet' | awk '/inet[^6]/{print substr($2,0)}' | sed 's|/.*||'
grep check_cpu_stats /usr/local/nagios/etc/nrpe/common.cfg
find / -name nrpe
ps axuw | grep nrpe
netstat -at | grep nrpe
grep only_from /etc/xinetd.d/nrpe
grep "dont_blame_nrpe=" /usr/local/nagios/etc/nrpe.cfg
/usr/local/nagios/libexec/check_cpu_stats.sh -w 85 -c 90
su nagios
/usr/local/nagios/libexec/check_cpu_stats.sh -w 85 -c 90
On the Nagios XI server

ip addr | grep global | grep -m 1 'inet' | awk '/inet[^6]/{print substr($2,0)}' | sed 's|/.*||'
nmap 10.x.x.102 -p 5666
/usr/local/nagios/libexec/check_nrpe -H 10.x.x.102
/usr/local/nagios/libexec/check_nrpe -H 10.x.x.102 -c check_cpu_stats -a '-w 85 -c 95'
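
When comparing the root and nagios runs of the plugin, the exit code is as informative as the text output. Below is a minimal sketch that runs a plugin and prints its exit code with the matching state name under the standard plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). `state_name` and `run_plugin` are hypothetical helper names, not part of NRPE or the Nagios plugins.

```shell
#!/bin/sh
# Map a plugin exit code to its state name (standard plugin convention).
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "OUT_OF_RANGE" ;;
    esac
}

# Run the given command line, let its own output through,
# then print the exit code and state on a separate line.
# run_plugin is a hypothetical helper for illustration only.
run_plugin() {
    "$@"
    rc=$?
    echo "exit=$rc state=$(state_name "$rc")"
}
```

For example, run `run_plugin /usr/local/nagios/libexec/check_cpu_stats.sh -w 85 -c 90` as root, then the same line from a nagios shell. A code outside 0-3, such as the shell's 126 for a file that cannot be executed, would point at permissions rather than at the check itself.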

Note: You may need to change the permissions on the failing plugins (on the remote box), i.e.

chmod 755 /usr/local/nagios/libexec/check_cpu_stats.sh
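
To find every affected plugin rather than fixing them one at a time, a sketch like the following lists regular files in a directory that are missing read or execute permission for "other" users. It assumes a `find` that supports `-maxdepth` (GNU and BSD versions do), and it only looks at the "other" bits, so it is a rough filter if the nagios user happens to be in a file's group. `list_restricted` is a hypothetical name.

```shell
#!/bin/sh
# list_restricted DIR
# Print regular files directly inside DIR whose mode lacks
# read+execute for "other" (i.e. not both o+r and o+x are set).
# Hypothetical helper for illustration only.
list_restricted() {
    find "$1" -maxdepth 1 -type f ! -perm -005 | sort
}
```

Running `list_restricted /usr/local/nagios/libexec` on the remote box should print the files that a non-root, non-group user cannot run; in a listing like the one posted earlier, that would include the `-rwx------` entries.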
Be sure to check out our Knowledgebase for helpful articles and solutions!
dlukinski
Posts: 1130
Joined: Tue Oct 06, 2015 9:42 am

Re: Some NRPE checks are in UNKNOWN state after fresh instal

Post by dlukinski »

Worked in all 3 cases after applying chmod 755 against 3 scripts in question (CPU, Memory, Open Files)
- I wonder why the scripts did not behave the same way as on other default installs
- The good thing is that we have a workaround
- No other changes were made besides this one
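
For anyone repeating this, a quick way to confirm the result on several hosts from the XI server is a loop around check_nrpe. This is a sketch only: `verify_hosts` and the `CHECK_NRPE` override variable are hypothetical names, and the host addresses have to be supplied by you.

```shell
#!/bin/sh
# Path to check_nrpe on the XI server; can be overridden for testing.
CHECK_NRPE=${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}

# verify_hosts COMMAND ARGS HOST...
# Run the same NRPE command against each host and prefix each
# result line with the host it came from.
verify_hosts() {
    cmd=$1
    args=$2
    shift 2
    for host in "$@"; do
        printf '%s: ' "$host"
        "$CHECK_NRPE" -H "$host" -c "$cmd" -a "$args"
    done
}
```

Usage would look like `verify_hosts check_cpu_stats '-w 85 -c 95' host1 host2 host3`, with the three remote boxes in place of the placeholder host names.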
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Some NRPE checks are in UNKNOWN state after fresh instal

Post by lmiltchev »

Worked in all 3 cases after applying chmod 755 against 3 scripts in question (CPU, Memory, Open Files)
I am glad I could help! :)
- I wonder why the scripts did not behave the same way as on other default installs
It is hard to say. I have installed the Linux agent (NRPE + Nagios plugins) on many different systems, and have never seen this particular issue. Let us know if you run into the same problem in the future.

Is it safe to lock this thread, and mark it as "resolved"? Thank you!
dlukinski
Posts: 1130
Joined: Tue Oct 06, 2015 9:42 am

Re: Some NRPE checks are in UNKNOWN state after fresh instal

Post by dlukinski »

Yes, please close.
Locked