
Nagios check_tcp not showing correct status on UI

Posted: Sun Jun 15, 2025 9:14 am
by pnikhade
Hi All,

I am trying to set up Nagios monitoring on a remote host for TCP ports 5044, 9200 and 5601.
Containers for Logstash, Elasticsearch and Kibana are listening on these ports, respectively.

The problem is that when I configure monitoring in the /usr/local/nagios/etc/servers hosts.cfg file, the Nagios UI does not show "OK".

Instead, the message below is shown. Why does it show this message? Please point out my mistake or suggest possible next steps.
(No output on stdout) stderr: Could not resolve hostname 13.233.122.181 -p 9200: Name or service not known
However, if I run the command manually I get the response below, which seems correct, as all containers are running on their respective ports.

Code:

[root@nagios-core libexec]# pwd
/usr/local/nagios/libexec
[root@nagios-core libexec]# ./check_tcp -H 13.233.122.181 -p 9200 && ./check_tcp -H 13.233.122.181 -p 5601
TCP OK - 0.001 second response time on 13.233.122.181 port 9200|time=0.001045s;;;0.000000;10.000000
TCP OK - 0.001 second response time on 13.233.122.181 port 5601|time=0.000808s;;;0.000000;10.000000

Re: Nagios check_tcp not showing correct status on UI

Posted: Sun Jun 15, 2025 9:32 pm
by kg2857
That seems odd.
Can you post the service definition as well as the command defined?
You may also want to run what's defined in the command definition and post that.
BTW, nagios usually runs commands as the nagios user, not root. This isn't likely to be an issue here, but could impact your testing in the future.

Re: Nagios check_tcp not showing correct status on UI

Posted: Sun Jun 15, 2025 10:11 pm
by pnikhade
Please see below.

Service definition on the Nagios server:

Path - /usr/local/nagios/etc/servers/ELK-stack.cfg

Code:

define host {
        use                             linux-server
        host_name                       ELK-Stack
        alias                           My client server
        address                         15.207.248.93
        max_check_attempts              20
        check_period                    24x7
        notification_interval           1
        notification_period             24x7
}

define service {
        use                             generic-service
        host_name                       ELK-Stack
        service_description             CPU Load
        check_interval                  2
        retry_interval                  1
        check_command                   check_nrpe!check_load -a '-w .15,.10,.05 -c .30,.25,.20'
}

define service {
        use                             generic-service
        host_name                       ELK-Stack
        service_description             NPRE service
        check_interval                  1
        retry_interval                  1
        check_command                   check_nrpe!check_kibana '-H 15.207.248.93 -p 5666'
}

define service {
        use                             generic-service
        host_name                       ELK-Stack
        service_description             Elasticsearch service
        check_interval                  1
        retry_interval                  1
        check_command                   check_nrpe!check_elasticsearch '-H 15.207.248.93 -p 9200'
}

define service {
        use                             generic-service
        host_name                       ELK-Stack
        service_description             Kibana service
        check_interval                  1
        retry_interval                  1
        check_command                   check_nrpe!check_kibana '-H 15.207.248.93 -p 5061'
}

define service {
        use                             generic-service
        host_name                       ELK-Stack
        service_description             Logstash service
        check_interval                  1
        retry_interval                  1
        check_command                   check_nrpe!check_logstash '-H 15.207.248.93 -p 5044'
}

Commands defined in the client's nrpe.cfg file:

Path - /usr/local/nagios/etc/nrpe.cfg

Code:

command[check_logstash]=/usr/local/nagios/libexec/check_tcp $ARG1$
command[check_kibana]=/usr/local/nagios/libexec/check_tcp $ARG1$
command[check_disk]=/usr/local/nagios/libexec/check_disk $ARG1$
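
As an aid to reading these definitions, the sketch below shows how one of the check_command values would expand on the Nagios server. It is only an illustration. The thread does not show the server-side check_nrpe command definition, so the common form `$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$` is assumed here, the plugin path is omitted, and `expand_check_nrpe` is a hypothetical helper that is not part of Nagios or NRPE.

```shell
# Assumed server-side definition (NOT shown in the thread):
#   command_line  $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
# Hypothetical helper: expands a check_command value the way that
# definition would, so the resulting command line can be inspected.
expand_check_nrpe() {
  check_command="$1"   # check_command value from the service definition
  host_address="$2"    # "address" value from the host definition
  sep='!'              # Nagios separates the command name and its arguments with "!"
  arg1="${check_command#*"$sep"}"   # everything after the first "!"
  arg1="${arg1%%"$sep"*}"           # stop at a second "!", if there is one
  printf '%s\n' "check_nrpe -H $host_address -c $arg1"
}

# Value taken from the "Logstash service" definition above.
cmd=$(printf '%s!%s' 'check_nrpe' "check_logstash '-H 15.207.248.93 -p 5044'")
expand_check_nrpe "$cmd" 15.207.248.93
```

Under that assumption, the quoted string follows the remote command name as one extra argument. Unlike the CPU Load service, the TCP services in this post do not put -a in front of the quoted string.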

Re: Nagios check_tcp not showing correct status on UI

Posted: Sun Jun 15, 2025 10:30 pm
by kg2857
And what is the output of the following?
/usr/local/nagios/libexec/check_tcp -H 15.207.248.93 -p 5044
Your test points to 13.233.122.181, but the check runs against the address above, so the test is meaningless.

Re: Nagios check_tcp not showing correct status on UI

Posted: Sun Jun 15, 2025 10:56 pm
by pnikhade
Please understand that the EC2 instance was turned off, and it is assigned a new IP address when it is turned on again. Hence 15.207.248.93 is the new IP address. I have replaced the address everywhere, including nrpe.cfg and the .cfg file under the servers directory.

Code:

[root@nagios-core ~]# /usr/local/nagios/libexec/check_tcp -H 15.207.248.93 -p 9200 && /usr/local/nagios/libexec/check_tcp -H 15.207.248.93 -p 5601 && /usr/local/nagios/libexec/check_tcp -H 15.207.248.93 -p 5666
TCP OK - 0.001 second response time on 15.207.248.93 port 9200|time=0.001498s;;;0.000000;10.000000
TCP OK - 0.001 second response time on 15.207.248.93 port 5601|time=0.001028s;;;0.000000;10.000000
TCP OK - 0.001 second response time on 15.207.248.93 port 5666|time=0.000913s;;;0.000000;10.000000
[root@nagios-core ~]#
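
One way to double-check that no configuration file still carries the old address is a recursive search. This is a minimal sketch: `find_address` is a hypothetical helper, and the path and address in the commented example are the ones quoted in this thread.

```shell
# Hypothetical helper: list every line under a directory that still
# mentions a given address (fixed-string match, with file name and line number).
find_address() {
  grep -rnF -- "$1" "$2"
}

# Example, to be run on the Nagios server and again on the NRPE client,
# since nrpe.cfg lives on the client:
# find_address 13.233.122.181 /usr/local/nagios/etc
```

No output, together with a non-zero exit status from grep, means the address was not found under that directory.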

Re: Nagios check_tcp not showing correct status on UI

Posted: Sun Jun 15, 2025 11:09 pm
by kg2857
You aren't testing as the service is defined.
Glad the issue is resolved.

Re: Nagios check_tcp not showing correct status on UI

Posted: Sun Jun 15, 2025 11:22 pm
by pnikhade
It is not resolved. I am still getting the message below on the UI. Please help. I just showed in my earlier reply that this works correctly from the command line, but the UI shows an error.

(No output on stdout) stderr: Could not resolve hostname 15.207.248.93 -p 9200: Name or service not known

Re: Nagios check_tcp not showing correct status on UI

Posted: Sun Jun 15, 2025 11:40 pm
by kg2857
All I can think of is to remove the single quotes in the service since the system is trying to resolve '15.207.248.93 -p 9200' as a hostname.
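
The effect described here can be reproduced without Nagios. The sketch below is an illustration only: `parse_host` is a hypothetical shell function built on the shell's getopts, on the assumption that check_tcp handles -H and -p in the usual getopt style. It is not the plugin's actual parser.

```shell
# Hypothetical stand-in for a plugin's option parsing (assumption:
# -H and -p are parsed getopt-style, each taking one argument).
parse_host() {
  OPTIND=1   # reset so the function can be called more than once
  host=""
  port=""
  while getopts "H:p:" opt "$@"; do
    case "$opt" in
      H) host="$OPTARG" ;;
      p) port="$OPTARG" ;;
    esac
  done
  printf 'host=[%s] port=[%s]\n' "$host" "$port"
}

# Four separate words: -H and -p each receive their own value.
parse_host -H 15.207.248.93 -p 9200

# One quoted word: the rest of the word after -H becomes the host value.
parse_host '-H 15.207.248.93 -p 9200'
```

In the second call the whole quoted string is a single word, so everything after -H, including the leading space and the "-p 9200" part, ends up as the host value and no port is parsed.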

Re: Nagios check_tcp not showing correct status on UI

Posted: Sun Jun 15, 2025 11:56 pm
by pnikhade
The single quotes are basically for passing arguments with the "-a" flag, so I am not sure whether the syntax is incorrect.

Anyway, I tried that as well, and the UI shows the following:
Usage:

Re: Nagios check_tcp not showing correct status on UI

Posted: Mon Jun 16, 2025 12:11 am
by kg2857
Try removing check_nrpe from the service.

command[check_logstash]=/usr/local/nagios/libexec/check_tcp $ARG1$

define service {
        use                             generic-service
        host_name                       ELK-Stack
        service_description             Logstash service
        check_interval                  1
        retry_interval                  1
        check_command                   check_logstash '-H 15.207.248.93 -p 5044'
}