Pb check_init_service only on nagiosxi interface
Posted: Thu Dec 09, 2021 9:01 am
by nagiostpm
Hello,
I have just migrated Nagios XI from 5.6.12 to 5.8.7.
Since then I have had a problem with one specific check_nrpe command (only check_init_service, on all hosts; all the other check_nrpe checks are OK).
This is the command in the Nagios XI interface:
check_nrpe!check_init_service!-a 'postgresql'
And the result:
CHECK_NRPE: Error - Could not connect to 10.0.11.71: Connection reset by peer
In fact, the check does not launch at all, and it shows an old date:
Last Check: 12/01/2021 14:58:38
Next Check: 12/01/2021 15:03:37
There is no problem in the Nagios Core interface (the date is correct):
Status Information: ● postgresql.service - PostgreSQL RDBMS
Loaded: loaded (/lib/systemd/system/postgresql.service; enabled; vendor preset: enabled)
Active: active (exited) since Thu 2020-06-11 09:46:52 CEST; 1 years 5 months ago
Main PID: 899 (code=exited, status=0/SUCCESS)
Tasks: 0 (limit: 4915)
Memory: 0B
CGroup: /system.slice/postgresql.service
Last Check Time: 12-09-2021 14:53:38
For example, on the same host this other check_nrpe check is OK in the Nagios XI interface:
check_nrpe!check_disk!-a '-w 10% -c 5% -p /'
DISK OK - free space: / 10547 MB (65,55% inode=83%):
Last Check: 12/09/2021 14:58:38
Next Check: 12/09/2021 15:03:38
Thanks for your help.
Re: Pb check_init_service only on nagiosxi interface
Posted: Thu Dec 09, 2021 4:09 pm
by pbroste
Hello
@nagiostpm
Thanks for reaching out,
Depending on how you have the NRPE service configured, please verify that it is running:
Code: Select all
systemctl <start,stop,status> nrpe
or
Code: Select all
systemctl <start,stop,status> xinetd
We also want to verify that you can run the plugin standalone:
Code: Select all
/usr/local/nagios/libexec/check_init_service postgresql
Then run the check through check_nrpe from the command line:
Code: Select all
/usr/local/nagios/libexec/check_nrpe -H localhost -t 30 -c check_init_service -a postgresql
or
Code: Select all
/usr/local/nagios/libexec/check_nrpe -n -H localhost -t 30 -c check_init_service -a postgresql
Since we are receiving the message "Could not connect to...", let's also test the connection and make sure that something is listening on port 5666.
Thanks,
Perry
Re: Pb check_init_service only on nagiosxi interface
Posted: Fri Dec 10, 2021 2:54 am
by nagiostpm
Hi,
It is an xinetd configuration:
Code: Select all
service nrpe
{
    socket_type = stream
    port = 5666
    wait = no
    user = nagios
    group = nagios
    server = /usr/local/nagios/bin/nrpe
    server_args = -c /usr/local/nagios/etc/nrpe.cfg --inetd
    only_from = <server nagios IP>
    disable = no
    log_on_success =
}
The standalone command is OK:
Code: Select all
/usr/local/nagios/libexec/check_init_service postgresql
postgresql.service - PostgreSQL RDBMS
Loaded: loaded (/lib/systemd/system/postgresql.service; enabled; vendor preset: enabled)
Active: active (exited) since Fri 2021-09-03 10:37:44 CEST; 3 months 6 days ago
Process: 9368 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
Main PID: 9368 (code=exited, status=0/SUCCESS)
The result from the command line via check_nrpe is OK:
Code: Select all
./check_nrpe -H S-MUT-PGSQL13 -t 30 -c check_init_service -a postgresql
postgresql.service - PostgreSQL RDBMS
Loaded: loaded (/lib/systemd/system/postgresql.service; enabled; vendor preset: enabled)
Active: active (exited) since Fri 2021-09-03 10:37:44 CEST; 3 months 6 days ago
Process: 9368 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
Main PID: 9368 (code=exited, status=0/SUCCESS)
The same command run from the Nagios XI server is OK too.
The result is also OK in the Nagios Core interface. Only the Nagios XI interface is not updated; it stays in the old state with a nonsensical check date.
Connection test on port 5666:
Code: Select all
curl -v telnet://localhost:5666
* Expire in 0 ms for 6 (transfer 0x5641eb282fb0)
* Expire in 1 ms for 1 (transfer 0x5641eb282fb0)
* Expire in 0 ms for 1 (transfer 0x5641eb282fb0)
* Expire in 1 ms for 1 (transfer 0x5641eb282fb0)
* Expire in 0 ms for 1 (transfer 0x5641eb282fb0)
* Expire in 0 ms for 1 (transfer 0x5641eb282fb0)
* Expire in 1 ms for 1 (transfer 0x5641eb282fb0)
* Expire in 0 ms for 1 (transfer 0x5641eb282fb0)
* Expire in 0 ms for 1 (transfer 0x5641eb282fb0)
* Expire in 1 ms for 1 (transfer 0x5641eb282fb0)
* Expire in 0 ms for 1 (transfer 0x5641eb282fb0)
* Expire in 0 ms for 1 (transfer 0x5641eb282fb0)
* Expire in 1 ms for 1 (transfer 0x5641eb282fb0)
* Expire in 0 ms for 1 (transfer 0x5641eb282fb0)
* Expire in 0 ms for 1 (transfer 0x5641eb282fb0)
* Expire in 1 ms for 1 (transfer 0x5641eb282fb0)
* Expire in 0 ms for 1 (transfer 0x5641eb282fb0)
* Expire in 0 ms for 1 (transfer 0x5641eb282fb0)
* Expire in 1 ms for 1 (transfer 0x5641eb282fb0)
* Expire in 0 ms for 1 (transfer 0x5641eb282fb0)
* Expire in 0 ms for 1 (transfer 0x5641eb282fb0)
* Expire in 0 ms for 1 (transfer 0x5641eb282fb0)
* Trying ::1...
* TCP_NODELAY set
* Expire in 150000 ms for 3 (transfer 0x5641eb282fb0)
* Expire in 200 ms for 4 (transfer 0x5641eb282fb0)
* Connected to localhost (::1) port 5666 (#0)
* Closing connection 0
Code: Select all
sudo ss -tunlp | grep -E '5666'
tcp LISTEN 0 64 *:5666 *:* users:(("xinetd",pid=23854,fd=5))
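The listener check above can also be scripted. The following is only a sketch: the helper name has_tcp_listener is made up for this example, and it assumes the column layout of the `ss -tunl` output shown above (Netid, State, Recv-Q, Send-Q, Local Address:Port).

```shell
#!/bin/bash
# Hypothetical helper (not part of Nagios or NRPE): reads `ss -tunl`-style
# output on stdin and succeeds if a TCP socket is listening on the given port.
has_tcp_listener() {
    awk -v port="$1" '
        # Field 1 is the Netid, field 2 the state, field 5 the local address:port
        $1 == "tcp" && $2 == "LISTEN" && $5 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Example usage on the monitored host:
# ss -tunl | has_tcp_listener 5666 && echo "a TCP listener exists on port 5666"
```

This only confirms that some process (xinetd here) holds the port; it does not prove that NRPE itself answers correctly.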
Thanks
Re: Pb check_init_service only on nagiosxi interface
Posted: Fri Dec 10, 2021 4:09 pm
by pbroste
Hello
@nagiostpm
Thanks for following up. From the telnet test, it appears that the connection is being made over IPv6; we want xinetd to listen and accept connections over IPv4 instead.
Add the line flags = IPv4:
# default: off
# description: NRPE (Nagios Remote Plugin Executor)
service nrpe
{
    disable = no
    per_source = 25
    socket_type = stream
    flags = IPv4
    port = 5666
    wait = no
    user = nagios
    group = nagios
    server = /usr/local/nagios/bin/nrpe
    server_args = -c /usr/local/nagios/etc/nrpe.cfg --inetd
    only_from = 127.0.0.1 192.xxx.xxx.0/24
    log_on_success =
}
Restart the xinetd service and re-run the telnet command to verify.
Example results from my test VM:
* TCP_NODELAY set
* connect to ::1 port 5666 failed: Connection refused
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 5666 (#0)
Thanks,
Perry
Re: Pb check_init_service only on nagiosxi interface
Posted: Mon Dec 13, 2021 2:35 am
by nagiostpm
Hello,
Same problem.
The check in the Nagios XI interface is still stuck on December 1:
Last Check: 12/01/2021 14:58:38
Next Check: 12/01/2021 15:03:37
It is still fine in the Nagios Core interface (I have uploaded screenshots).
All the other NRPE checks are OK. This problem occurs only with check_nrpe!check_init_service.
thanks
Re: Pb check_init_service only on nagiosxi interface
Posted: Mon Dec 13, 2021 4:00 am
by nagiostpm
After many tests, it seems the Nagios XI interface does not work when the command output contains special or accented characters.
This result shows an error in the Nagios XI interface:
./check_nrpe -H TARTARE -n -2 -c check_log_neeva_import_absence
Fin de la proc▒dure leChaine presente
The same output without the accent is OK in the Nagios XI interface:
./check_nrpe -H TARTARE -n -2 -c check_log_neeva_import_absence
Fin de la procedure leChaine presente
Is there perhaps an option to allow special characters?
Re: Pb check_init_service only on nagiosxi interface
Posted: Mon Dec 13, 2021 4:49 pm
by pbroste
Hello
@nagiostpm
Thanks for digging into this and providing the details on the issue with special characters. To answer your question, there is no character-set option to select in NRPE. The NRPE agent still receives security updates, but the NCPA agent will replace it in the future.
I see from your example that you are using the 'check_log_neeva_import_absence' plugin. Is this a Bash script, and would it be possible to rewrite it so that it produces output without special characters?
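If the plugin itself cannot easily be changed, one possible approach is a small wrapper that strips non-ASCII bytes from the output before NRPE returns it. This is only a sketch: the function names strip_non_ascii and run_ascii are made up for this example, and it has not been verified against Nagios XI 5.8.

```shell
#!/bin/bash
# Hypothetical helper: keep only tab, newline, and printable ASCII bytes.
strip_non_ascii() {
    LC_ALL=C tr -cd '\11\12\40-\176'
}

# Hypothetical wrapper: run a plugin, sanitize its output, and preserve
# the plugin's exit code so Nagios still sees OK/WARNING/CRITICAL/UNKNOWN.
run_ascii() {
    local output status
    output=$("$@" 2>&1)
    status=$?
    printf '%s\n' "$output" | strip_non_ascii
    return "$status"
}

# Example usage (the plugin path and argument are illustrative):
# run_ascii /usr/local/nagios/libexec/check_init_service httpd
```

Note that this drops accented characters rather than transliterating them, so "procédure" would come out as "procdure".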
Thanks,
Perry
Re: Pb check_init_service only on nagiosxi interface
Posted: Tue Dec 14, 2021 2:04 am
by nagiostpm
Hi,
In fact, the real problem is not our scripts, because I have already removed all the special characters from their output.
The problem is the check_init_service plugin included with Nagios XI, because its output contains special characters. Here is an example:
./check_init_service httpd
Code: Select all
Redirecting to /bin/systemctl status httpd.service
● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
Active: active (running) since jeu. 2021-12-09 15:05:51 CET; 3 days ago
Docs: man:httpd(8)
man:apachectl(8)
Process: 24601 ExecStop=/bin/kill -WINCH ${MAINPID} (code=exited, status=0/SUCCESS)
Process: 17201 ExecReload=/usr/sbin/httpd $OPTIONS -k graceful (code=exited, status=0/SUCCESS)
Main PID: 24622 (httpd)
Status: "Total requests: 0; Current requests/sec: 0; Current traffic: 0 B/sec"
Tasks: 11
CGroup: /system.slice/httpd.service
The first character of this check's output is a special character (●). In fact, the plugin simply runs systemctl status.
This problem does not exist in my old Nagios XI installation, which is still in production (5.6.12 on CentOS 6). The new one, which has the problem, is 5.8.7 on RHEL 8.5. I updated the NRPE client to the latest version and the issue is the same.
One solution could be to create a new check_init_service2 that just returns a result like "Service status Ok" instead of the full output of systemctl status.
Another is to use check_procs instead of check_init_service. But it does not seem normal to me that check_init_service no longer works in Nagios XI 5.8.
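A minimal sketch of such a check_init_service2 could look like the following. It assumes systemd hosts and relies on systemctl is-active, which prints a one-word state; the function name report_state and the message wording are only illustrative, and it has not been tested against Nagios XI 5.8.

```shell
#!/bin/bash
# Sketch of a hypothetical check_init_service2 plugin (assumes systemd).

# Map a systemd state string to a one-line ASCII message and a Nagios exit code.
report_state() {
    local service="$1" state="$2"
    case "$state" in
        active)
            echo "OK - $service is active"
            return 0 ;;
        "")
            echo "UNKNOWN - could not determine the state of $service"
            return 3 ;;
        *)
            echo "CRITICAL - $service is $state"
            return 2 ;;
    esac
}

# Run the check only when a service name is given on the command line.
if [ -n "${1:-}" ]; then
    state=$(systemctl is-active "$1" 2>/dev/null)
    report_state "$1" "$state"
    exit $?
fi
```

Because the output is a single ASCII line, it avoids the "●" that systemctl status prints first.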
thanks
Re: Pb check_init_service only on nagiosxi interface
Posted: Tue Dec 14, 2021 3:02 pm
by pbroste
Hello
@nagiostpm
Thanks for following up. Here is another option: a Bash service-check script that you can add under Commands to create your service check.
Code: Select all
#!/bin/bash
# Nagios Plugin Bash Script - check_service.sh
# This script checks if a program is running

# Check for missing parameters
if [[ -z "$1" ]]
then
    echo "Missing parameters! Syntax: ./check_service.sh service_name"
    exit 3
fi

# Name used in the status messages
SERVICE="$1"

# pgrep matches process names only, so this script's own command line
# (which contains the service name) is not counted as a match
if pgrep -- "$SERVICE" > /dev/null
then
    echo "OK, $SERVICE service is running"
    exit 0
else
    echo "CRITICAL, $SERVICE service is not running"
    exit 2
fi
Please let us know how this works for you,
Perry
Re: Pb check_init_service only on nagiosxi interface
Posted: Wed Dec 15, 2021 1:42 am
by nagiostpm
hello,
It's ok with this script. Problem resolved.
thanks