
passive service check & invalid hostname

Posted: Tue Oct 01, 2013 6:52 pm
by feralsboy
Hello,

Something odd is going on, and I'm hoping to get a little help.

I'm trying to run some passive checks on some Linux boxes and getting errors that don't make sense.

error: PASSIVE SERVICE CHECK: $host;xmpp-bosh;2;TCP CRITICAL - Invalid hostname, address or socket: -p

things tried:

ping $host works.

main configuration file contains: cfg_file=/etc/nagios/$configdir/$newconfig.cfg

grep $host cfg_file gives:

host_name $host


$host:/usr/local/nagios_checks/$script
prints to the screen:

xmpp-server2server Port Status on 5269 is = TCP OK - 0.000 second response time on port 5269|time=0.000081s;;;0.000000;10.000000.
1 data packet(s) sent to host successfully.

on nagios server:

[1380671118] PASSIVE SERVICE CHECK: $host;xmpp-server2server;2;TCP CRITICAL - Invalid hostname, address or socket: -p

log file on server shows: [1380671164] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;$host;xmpp-server2server;0;TCP OK - 0.000 second response time on port 5269|time=0.000081s;;;0.000000;10.000000
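The two server-side entries above carry different timestamps (1380671118 and 1380671164, 46 seconds apart) and different return codes, so they may be two separate submissions rather than one result being altered on arrival. A minimal sketch for pulling those fields out of such lines follows. It assumes the `[epoch] LABEL: fields` layout seen in this thread; `parse_passive_result` and the host name `web01` are made up for illustration.

```bash
#!/bin/bash
# Print "timestamp host service return_code" for a Nagios log line of either
# the PASSIVE SERVICE CHECK or the EXTERNAL COMMAND form shown in this thread.
parse_passive_result() {
  local line=$1 stamp rest host svc code output
  stamp=${line#\[}                 # drop the leading "["
  stamp=${stamp%%\]*}              # keep everything before the first "]"
  rest=${line#*: }                 # drop "[epoch] LABEL: "
  rest=${rest#PROCESS_SERVICE_CHECK_RESULT;}   # only present on external commands
  # The last variable receives the remainder, so ";" inside perfdata is harmless.
  IFS=';' read -r host svc code output <<< "$rest"
  printf '%s %s %s %s\n' "$stamp" "$host" "$svc" "$code"
}

parse_passive_result '[1380671118] PASSIVE SERVICE CHECK: web01;xmpp-server2server;2;TCP CRITICAL - Invalid hostname, address or socket: -p'
parse_passive_result '[1380671164] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;web01;xmpp-server2server;0;TCP OK - 0.000 second response time on port 5269|time=0.000081s;;;0.000000;10.000000'
```

Running every matching line of nagios.log through a function like this makes it easier to see whether the CRITICAL and OK results alternate, which would hint at two senders reporting under the same host and service name.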

so ... it looks like it's working, but the nagios server is telling me that I have an invalid hostname.
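One possible reading of "Invalid hostname, address or socket: -p" is that some plugin call received an empty, unquoted host variable, so the next word on the command line (`-p`) was taken as the host. That is only a guess, since the xmpp check script itself is not shown here. The sketch below illustrates the shell behaviour; `show_host_arg` is a hypothetical stand-in for a plugin's option parsing, not check_tcp itself.

```bash
#!/bin/bash
# Hypothetical stand-in for a plugin: report whatever word follows -H.
show_host_arg() {
  while [ "$#" -gt 0 ]; do
    if [ "$1" = "-H" ]; then
      printf 'host=[%s]\n' "${2-}"
      return 0
    fi
    shift
  done
  return 1
}

target=""   # an unset or empty host variable

# Unquoted: the empty value vanishes, so -p slides into the host position.
unquoted=$(show_host_arg -H $target -p 5269)

# Quoted: the empty string stays in place as its own argument.
quoted=$(show_host_arg -H "$target" -p 5269)

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

If this is what is happening, the place to look would be wherever the host or address variable is set for the xmpp checks, on the client or in any other script that submits results for the same service.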


running: Nagios Core 3.4.4

Re: passive service check & invalid hostname

Posted: Wed Oct 02, 2013 9:20 am
by scottwilkerson
It looks like you are passing the literal string $host to the command instead of the actual host name.

Re: passive service check & invalid hostname

Posted: Wed Oct 02, 2013 11:22 am
by feralsboy
Hi Scott,

No ... I don't think that's it.

Here's some output:

we have a helper script: /usr/local/nagios_checks/nagios_common
contents of nagios_common:
#!/bin/bash
me=`hostname`
monitors=(10.64.1.6)
let "sleeptime=$RANDOM % 30"
function send_health (){
for monitor in ${monitors[@]}
do
sleep $sleeptime && echo $me,$svc,$status,$check | /usr/sbin/send_nsca -H $monitor -to 5 -d ',' -c /etc/nagios/send_nsca.cfg
done
}

plugin script:
cat /usr/local/nagios_checks/check_home-partition.sh
#!/bin/bash
. /usr/local/nagios_checks/nagios_common
svc='home-partition'
check=`/usr/lib64/nagios/plugins/check_disk -w 10% -c 5% -p /home`
status=$?
send_health

so when I run the plugin script (which sources the helper) from the command line with shell tracing on, I get:
+ . /usr/local/nagios_checks/nagios_common
+++ hostname
++ me=$host #changed for the public
++ monitors=(10.64.1.6)
++ let 'sleeptime=16361 % 30'
+ svc=home-partition
++ /usr/lib64/nagios/plugins/check_disk -w 10% -c 5% -p /home
+ check='DISK OK - free space: / 129424 MB (50% inode=99%);| /=129131MB;245153;258773;0;272393'
+ status=0
+ send_health
+ for monitor in '${monitors[@]}'
+ sleep 11
+ echo $host,home-partition,0,DISK OK - free space: / 129424 MB '(50%' 'inode=99%);|' '/=129131MB;245153;258773;0;272393'
+ /usr/sbin/send_nsca -H 10.64.1.6 -to 5 -d , -c /etc/nagios/send_nsca.cfg
1 data packet(s) sent to host successfully.

/etc/nagios/send_nsca.cfg
contains a password and encryption method.
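In the trace above, the unquoted `echo` hands the plugin output to the shell as separate words, so runs of whitespace collapse and any glob characters could expand. A minimal sketch of the same helper with quoting added is below. It keeps the names from nagios_common; `nsca_line` and the `DRY_RUN` switch are additions made up here so the line can be inspected without contacting a monitor.

```bash
#!/bin/bash
# Sketch of nagios_common with quoting; not a drop-in replacement.
me=$(hostname)
monitors=(10.64.1.6)
sleeptime=$(( RANDOM % 30 ))

# Build one send_nsca input line: host,service,return code,plugin output.
nsca_line() {
  printf '%s,%s,%s,%s\n' "$1" "$2" "$3" "$4"
}

send_health() {
  local monitor
  for monitor in "${monitors[@]}"; do
    if [ -n "${DRY_RUN:-}" ]; then
      # DRY_RUN is a hypothetical switch: print the line instead of sending it.
      nsca_line "$me" "$svc" "$status" "$check"
    else
      sleep "$sleeptime"
      nsca_line "$me" "$svc" "$status" "$check" |
        /usr/sbin/send_nsca -H "$monitor" -to 5 -d ',' -c /etc/nagios/send_nsca.cfg
    fi
  done
}
```

With `DRY_RUN=1` set before calling `send_health`, the exact line that would be piped to send_nsca is printed, which is a quick way to confirm that the host and service fields are what the server expects.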

Re: passive service check & invalid hostname

Posted: Wed Oct 02, 2013 2:20 pm
by scottwilkerson
Ok, I see, you just manually put $host for privacy.

Do you have a host object definition for $host set up in Nagios?
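A plain grep for the host name also matches `host_name` lines inside service definitions, so it does not show that a host object exists. A rough sketch that only counts `host_name` inside a `define host` block is below. `find_host_definition`, the directory argument, and `web01` are hypothetical, and the parsing assumes one directive per line with the closing brace on its own line.

```bash
#!/bin/bash
# Print each *.cfg file in a directory that defines the given host
# inside a "define host" block (service definitions are ignored).
find_host_definition() {
  local host=$1 cfgdir=$2 f status=1
  for f in "$cfgdir"/*.cfg; do
    [ -e "$f" ] || continue
    if awk -v h="$host" '
         /^[ \t]*define[ \t]+host[ \t]*[{]/ { in_host = 1; next }
         /[}]/                              { in_host = 0 }
         in_host && $1 == "host_name" && $2 == h { found = 1 }
         END { exit !found }
       ' "$f"; then
      echo "$f"
      status=0
    fi
  done
  return "$status"
}
```

For example, `find_host_definition web01 /etc/nagios/conf.d` (path assumed) prints nothing and returns non-zero when only services mention the host.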

Re: passive service check & invalid hostname

Posted: Wed Oct 02, 2013 7:53 pm
by feralsboy
Ok,

I've made a little headway, but that only makes things stranger.
btw, headway means that I added another config/definition file
w/ the hostname in it.

Here's some strangeness:

we have two nagios servers. let's say the IPs are 192.168.1.5 & 192.168.1.6

from a box w/ IP 192.168.10.1
- telnet 192.168.1.5 5667 works
- telnet 192.168.1.6 5667 fails w/ error: no route to host.

I thought it might be a problem w/ the nsca.cfg file ...
I used md5sum on both the nsca.cfg and the send_nsca.cfg files.

the nsca.cfg files are different. one has debugging set and one doesn't.

the send_nsca.cfg file from 192.168.1.5

md5sum /etc/nagios/send_nsca.cfg
b915f10720ce5034a0c2ef99ec44b743 /etc/nagios/send_nsca.cfg

the send_nsca.cfg file from a host I'm trying to monitor:
b915f10720ce5034a0c2ef99ec44b743 /etc/nagios/send_nsca.cfg

so ... they're the same file. Which should mean that they're using the
same password to reach the non-functional nagios machine. Which
would indicate that it's not a password issue.
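Matching checksums on two copies of send_nsca.cfg show that the senders agree with each other. What also has to agree, as far as I know, is each sender's send_nsca.cfg with the receiving server's nsca.cfg: the `password` values, and `encryption_method` on the client against `decryption_method` on the server. A minimal sketch of that comparison is below; `cfg_value` and `nsca_settings_match` are hypothetical helpers, and they assume plain `key=value` lines with no leading whitespace.

```bash
#!/bin/bash
# Print the value of key=value from a config file (last match wins).
cfg_value() {
  local key=$1 file=$2
  awk -F= -v k="$key" '
    $1 == k { v = substr($0, length(k) + 2) }   # keep any "=" inside the value
    END     { print v }
  ' "$file"
}

# Succeed only if the client's send_nsca.cfg agrees with the server's nsca.cfg.
nsca_settings_match() {
  local client=$1 server=$2
  [ "$(cfg_value password "$client")" = "$(cfg_value password "$server")" ] &&
  [ "$(cfg_value encryption_method "$client")" = "$(cfg_value decryption_method "$server")" ]
}
```

Usage would look like `nsca_settings_match /etc/nagios/send_nsca.cfg ./nsca.cfg.from-server && echo match`, where the second path is a copy fetched from the Nagios server.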

I'm guessing that nsca is giving the "no route to host" error message.

Re: passive service check & invalid hostname

Posted: Thu Oct 03, 2013 10:04 am
by slansing
Well, if you can't reach 192.168.1.6 on port 5667 via telnet, nmap, etc., there seems to be an issue with the networking equipment between your core server and the remote host, or something blocking on the remote host's end. Is port 5667 blocked? Are you running AV software?
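"No route to host" is reported by the connecting machine's network stack before any NSCA traffic is exchanged, and a firewall rule that rejects with an ICMP host-prohibited message can produce it even when routing is fine. A rough sketch for probing the port without telnet and labelling the failure is below. `probe_port` and `explain_connect_error` are hypothetical names, the cause labels are heuristics, and the probe assumes a bash built with `/dev/tcp` support plus the coreutils `timeout` command.

```bash
#!/bin/bash
# Map a connection error message to a likely cause (heuristic, not exhaustive).
explain_connect_error() {
  local msg
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *"no route to host"*)   echo "rejected or unroutable: check firewalls and routing" ;;
    *"connection refused"*) echo "host reachable, but nothing is listening on the port" ;;
    *"timed out"*)          echo "no answer: packets are probably being dropped" ;;
    *)                      echo "unclassified error" ;;
  esac
}

# Attempt a TCP connection and report the outcome.
probe_port() {
  local host=$1 port=$2 err rc
  err=$(timeout 5 bash -c "exec 3<>/dev/tcp/$host/$port" 2>&1)
  rc=$?
  if [ "$rc" -eq 0 ]; then
    echo "$host:$port is reachable"
    return 0
  fi
  [ "$rc" -eq 124 ] && err="timed out"   # timeout exits 124 when the limit is hit
  echo "$host:$port failed: $(explain_connect_error "$err")"
  return 1
}
```

Running `probe_port 192.168.1.5 5667` and `probe_port 192.168.1.6 5667` from the 192.168.10.1 box would give a side-by-side view of the two servers.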