I am setting up Nagios monitoring on a remote host for TCP ports 5044, 9200 and 5601.
Containers for Logstash, Elasticsearch and Kibana are listening on these ports respectively.
The problem is that whenever I set up the checks in the hosts.cfg file under /usr/local/nagios/etc/servers, the Nagios UI does not show "OK".
Instead it shows the message below. Why does it show this message? Please point out my mistake or suggest possible steps.
(No output on stdout) stderr: Could not resolve hostname 13.233.122.181 -p 9200: Name or service not known
However, if I run the command manually I get the response below, which seems correct since all containers are running on their respective ports.
[root@nagios-core libexec]# pwd
/usr/local/nagios/libexec
[root@nagios-core libexec]# ./check_tcp -H 13.233.122.181 -p 9200 && ./check_tcp -H 13.233.122.181 -p 5601
TCP OK - 0.001 second response time on 13.233.122.181 port 9200|time=0.001045s;;;0.000000;10.000000
TCP OK - 0.001 second response time on 13.233.122.181 port 5601|time=0.000808s;;;0.000000;10.000000
That seems odd.
Can you post the service definition as well as the command defined?
You may also want to run what's defined in the command definition and post that.
BTW, nagios usually runs commands as the nagios user, not root. This isn't likely to be an issue here, but could impact your testing in the future.
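One way to act on the suggestion above is to expand the macros in the command definition by hand and compare the result with what was typed at the prompt. The following is a minimal sketch, not the poster's configuration: the command_line template is hypothetical (the actual definition was not posted), `expand_macros` is a helper made up for illustration, and $USER1$ is assumed to point at /usr/local/nagios/libexec as in the prompt shown earlier.

```shell
#!/bin/sh
# Expand Nagios-style macros in a command_line template by hand.
# expand_macros is a hypothetical helper, not part of Nagios.
#   $1 = command_line template
#   $2 = value for $HOSTADDRESS$
#   $3 = value for $ARG1$
expand_macros() {
    printf '%s\n' "$1" | sed \
        -e 's|\$USER1\$|/usr/local/nagios/libexec|g' \
        -e 's|\$HOSTADDRESS\$|'"$2"'|g' \
        -e 's|\$ARG1\$|'"$3"'|g'
}

# Hypothetical template; replace it with the command_line from your own
# command definition. Single quotes keep the shell from touching the macros.
template='$USER1$/check_tcp -H $HOSTADDRESS$ $ARG1$'

expand_macros "$template" "15.207.248.93" "-p 9200"
# prints: /usr/local/nagios/libexec/check_tcp -H 15.207.248.93 -p 9200
```

The printed line can then be run as the nagios user, for example with `sudo -u nagios` in front of it, assuming sudo is available on the Nagios server.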
And the output of the following is what?
/usr/local/nagios/libexec/check_tcp -H 15.207.248.93 -p 5044
Your test points to 13.233.122.181, but the check is running against the address above, so the test is meaningless.
Please understand that the EC2 instance was turned off, and when it is turned on again it is assigned a new IP. Hence 15.207.248.93 is the new IP address. I have replaced the IP address everywhere, including nrpe.cfg and the .cfg file under the servers directory.
[root@nagios-core ~]# /usr/local/nagios/libexec/check_tcp -H 15.207.248.93 -p 9200 && /usr/local/nagios/libexec/check_tcp -H 15.207.248.93 -p 5601 && /usr/local/nagios/libexec/check_tcp -H 15.207.248.93 -p 5666
TCP OK - 0.001 second response time on 15.207.248.93 port 9200|time=0.001498s;;;0.000000;10.000000
TCP OK - 0.001 second response time on 15.207.248.93 port 5601|time=0.001028s;;;0.000000;10.000000
TCP OK - 0.001 second response time on 15.207.248.93 port 5666|time=0.000913s;;;0.000000;10.000000
[root@nagios-core ~]#
It is not resolved. I am still getting the message below in the UI. Please help. As I showed in my earlier reply, this works correctly from the command line, but the UI shows an error.
(No output on stdout) stderr: Could not resolve hostname 15.207.248.93 -p 9200: Name or service not known
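Note that the hostname quoted in the error includes the text "-p 9200". That suggests the plugin may be receiving the address and the port option as a single argument, which would explain why the same check works when typed by hand. The sketch below only illustrates that shell behaviour under stated assumptions: `show_args` is a stand-in made up for this example, not check_tcp, and the ARG1 value is hypothetical because the service and command definitions were not posted.

```shell
#!/bin/sh
# Stand-in for a plugin: reports how many arguments it received and what
# followed -H. This is NOT check_tcp; it only shows how arguments are split.
show_args() {
    echo "argc=$# host_arg=[$2]"
}

HOSTADDRESS="15.207.248.93"   # address from the thread
ARG1="-p 9200"                # hypothetical service argument

# Address and port option quoted together: the plugin sees one "hostname".
show_args -H "$HOSTADDRESS $ARG1"
# prints: argc=2 host_arg=[15.207.248.93 -p 9200]

# Address quoted on its own, port option left to split into two words.
show_args -H "$HOSTADDRESS" $ARG1
# prints: argc=4 host_arg=[15.207.248.93]
```

If the first form matches what the command definition or the host's address field produces, checking the quoting around the macros there would be a reasonable next step.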