passive service check & invalid hostname

feralsboy · Post by **feralsboy** » Tue Oct 01, 2013 6:52 pm

Hello,

Something odd is going on, and I'm hoping to get a little help.

I'm trying to run some passive checks on some Linux boxes and I'm getting errors that don't make sense.

error: PASSIVE SERVICE CHECK: $host;xmpp-bosh;2;TCP CRITICAL - Invalid hostname, address or socket: -p

things tried:

ping $host works.

main configuration file contains: cfg_file=/etc/nagios/$configdir/$newconfig.cfg

grep $host cfg_file gives:

host_name $host

$host:/usr/local/nagios_checks/$script
prints to the screen:

xmpp-server2server Port Status on 5269 is = TCP OK - 0.000 second response time on port 5269|time=0.000081s;;;0.000000;10.000000.
1 data packet(s) sent to host successfully.

on nagios server:

[1380671118] PASSIVE SERVICE CHECK: $host;xmpp-server2server;2;TCP CRITICAL - Invalid hostname, address or socket: -p

log file on server shows:
[1380671164] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;$host;xmpp-server2server;0;TCP OK - 0.000 second response time on port 5269|time=0.000081s;;;0.000000;10.000000

so ... it looks like it's working, but the nagios server is telling me that I have an invalid hostname.

running: Nagios Core 3.4.4
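
The "Invalid hostname, address or socket" text is what check_tcp prints when it rejects the value given to -H. One way "-p" can end up as that value, assuming the check script builds the host from a variable that is sometimes empty (for example under cron, where the environment differs from an interactive shell), is an unquoted expansion. The sketch below uses a hypothetical stand-in function, fake_check, rather than the real plugin, purely to show how the arguments shift:

```shell
#!/bin/bash
# fake_check is a hypothetical stand-in for a plugin; it only reports
# which word would be taken as the value of -H.
fake_check() {
    if [ "$1" = "-H" ]; then
        echo "host argument: ${2:-<none>}"
    fi
}

target=""                         # assumption: a lookup that returned nothing

# Unquoted: the empty variable vanishes, so -H swallows the next word.
fake_check -H $target -p 5269     # prints "host argument: -p"

# Quoted: the empty string is kept as its own argument.
fake_check -H "$target" -p 5269   # prints "host argument: <none>"
```

If that is what is happening, comparing the variables the script sees under cron with those in an interactive shell would be a reasonable next step.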

scottwilkerson · Post by **scottwilkerson** » Wed Oct 02, 2013 9:20 am

It looks like you are passing the literal string $host to the command instead of the actual host name.

feralsboy · Post by **feralsboy** » Wed Oct 02, 2013 11:22 am

Hi Scott,

No ... I don't think that's it.

Here's some output:

we have a helper script: /usr/local/nagios_checks/nagios_common
contents of nagios_common:
#!/bin/bash
me=`hostname`
monitors=(10.64.1.6)
let "sleeptime=$RANDOM % 30"
function send_health (){
for monitor in ${monitors[@]}
do
sleep $sleeptime && echo $me,$svc,$status,$check | /usr/sbin/send_nsca -H $monitor -to 5 -d ',' -c /etc/nagios/send_nsca.cfg
done
}

plugin script:
cat /usr/local/nagios_checks/check_home-partition.sh
#!/bin/bash
. /usr/local/nagios_checks/nagios_common
svc='home-partition'
check=`/usr/lib64/nagios/plugins/check_disk -w 10% -c 5% -p /home`
status=$?
send_health

so when I run the helper script from the command line, I get:
+ . /usr/local/nagios_checks/nagios_common
+++ hostname
++ me=$host #changed for the public
++ monitors=(10.64.1.6)
++ let 'sleeptime=16361 % 30'
+ svc=home-partition
++ /usr/lib64/nagios/plugins/check_disk -w 10% -c 5% -p /home
+ check='DISK OK - free space: / 129424 MB (50% inode=99%);| /=129131MB;245153;258773;0;272393'
+ status=0
+ send_health
+ for monitor in '${monitors[@]}'
+ sleep 11
+ echo $host,home-partition,0,DISK OK - free space: / 129424 MB '(50%' 'inode=99%);|' '/=129131MB;245153;258773;0;272393'
+ /usr/sbin/send_nsca -H 10.64.1.6 -to 5 -d , -c /etc/nagios/send_nsca.cfg
1 data packet(s) sent to host successfully.

/etc/nagios/send_nsca.cfg
contains a password and encryption method.
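
The trace above shows the unquoted $check being split into separate words before echo sees it. echo joins them back with single spaces, so it usually works, but glob characters or runs of whitespace in plugin output would be altered. A minimal sketch of nagios_common with quoting added follows. The format_result helper and the SEND_NSCA override are assumptions introduced here for illustration and are not part of the original script:

```shell
#!/bin/bash
# Sketch of nagios_common with quoting added; the flow is otherwise unchanged.
me=$(hostname)
monitors=(10.64.1.6)
sleeptime=$(( RANDOM % 30 ))
# SEND_NSCA override is an assumption (handy for testing), not in the original.
send_nsca_bin=${SEND_NSCA:-/usr/sbin/send_nsca}

# Build one comma-delimited line: host, service, return code, plugin output.
format_result() {
    printf '%s,%s,%s,%s\n' "$1" "$2" "$3" "$4"
}

# Expects the caller to have set svc, status and check, as the plugin scripts do.
send_health() {
    local monitor
    for monitor in "${monitors[@]}"; do
        sleep "$sleeptime"
        format_result "$me" "$svc" "$status" "$check" |
            "$send_nsca_bin" -H "$monitor" -to 5 -d ',' -c /etc/nagios/send_nsca.cfg
    done
}
```

The plugin scripts would source this file and call send_health exactly as before.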

scottwilkerson · Post by **scottwilkerson** » Wed Oct 02, 2013 2:20 pm

Ok, I see, you just manually put $host for privacy.

Do you have a host object definition for $host setup in Nagios?

feralsboy · Post by **feralsboy** » Wed Oct 02, 2013 7:53 pm

Ok,

I've made a little headway, but that only makes things stranger.
btw, headway means that I added another config/definition file
w/ the hostname in it.

Here's some strangeness:

we have two nagios servers. let's say their IPs are 192.168.1.5 & 192.168.1.6

from a box w/ IP 192.168.10.1
- telnet 192.168.1.5 5667 works
- telnet 192.168.1.6 5667 fails w/ error: no route to host.

I thought it might be a problem w/ the nsca.cfg file ...
I used md5sum on both the nsca.cfg and the send_nsca.cfg files.

the nsca.cfg files are different. One has debugging set and one doesn't.

the send_nsca.cfg file from 192.168.1.5

md5sum /etc/nagios/send_nsca.cfg
b915f10720ce5034a0c2ef99ec44b743 /etc/nagios/send_nsca.cfg

the send_nsca.cfg file from a host I'm trying to monitor:
b915f10720ce5034a0c2ef99ec44b743 /etc/nagios/send_nsca.cfg

so ... they're the same file. Which should mean that they're using the
same password to reach the non-functional nagios machine. Which
would indicate that it's not a password issue.

I'm guessing that nsca is giving the "no route to host" error message.

slansing · Post by **slansing** » Thu Oct 03, 2013 10:04 am

Well, if you can't reach 192.168.1.6 on port 5667 via telnet, nmap, etc., there seems to be an issue with networking equipment between your core server and the remote host, or something is blocking on the remote host's end. Is port 5667 blocked? Are you running AV software?
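
The telnet test can be scripted so it is easy to repeat from each monitored box. This is only a sketch: check_port is a hypothetical helper name, and it assumes bash (for /dev/tcp) and the coreutils timeout command are available. Note that on many Linux systems a firewall rule that rejects with icmp-host-prohibited is also reported to the client as "No route to host", so that message does not necessarily point at routing.

```shell
#!/bin/bash
# check_port HOST PORT: succeed if a TCP connection opens within 5 seconds.
# (hypothetical helper; relies on bash's /dev/tcp and coreutils timeout)
check_port() {
    timeout 5 bash -c 'exec 3<>/dev/tcp/$1/$2' _ "$1" "$2" 2>/dev/null
}

# Test every Nagios server given on the command line against the NSCA port.
for monitor in "$@"; do
    if check_port "$monitor" 5667; then
        echo "$monitor:5667 reachable"
    else
        echo "$monitor:5667 NOT reachable"
    fi
done
```

Saved under any name and run with the two server addresses as arguments (192.168.1.5 and 192.168.1.6 in the example above), it prints one line per server.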
