nrpe - 3.2.1 service file issues

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
jenglish
Posts: 59
Joined: Sat Jun 09, 2018 3:51 pm
Location: Boyers, PA

nrpe - 3.2.1 service file issues

Post by jenglish »

Hello,

We recently updated our nrpe agents to 3.2.1 and it caused some issues with starting/restarting the service. We were able to fix this by using the old service file configurations from 3.2.0. Is this a bug? Please advise.

nrpe 3.2.0 service file:

Code: Select all

[Unit]
Description=Nagios Remote Program Executor
Documentation=http://www.nagios.org/documentation
Conflicts=nrpe.socket
Requires=network.target

[Install]
WantedBy=multi-user.target

[Service]
Type=forking
User=nrpe
Group=nrpe
EnvironmentFile=/etc/sysconfig/nrpe
ExecStart=/usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -d $NRPE_SSL_OPT

nrpe 3.2.1 service file:

Code: Select all

[Unit]
Description=Nagios Remote Program Executor
Documentation=http://www.nagios.org/documentation
Conflicts=nrpe.socket
Requires=network-online.target
After=var-run.mount nss-lookup.target network.target local-fs.target time-sync.target
[email protected] xdm.service

[Install]
WantedBy=multi-user.target

[Service]
Type=forking
User=nrpe
Group=nrpe
EnvironmentFile=/etc/sysconfig/nrpe
ExecStart=/usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -d $NRPE_SSL_OPT
ExecReload=/bin/kill -HUP $MAINPID
ExecStopPost=/bin/rm -f /var/run/nrpe/nrpe.pid
PIDFile=/var/run/nrpe/nrpe.pid

Log file snippet from nrpe 3.2.1:

Code: Select all

Oct 16 07:15:48 boy-oraem01.opm.gov systemd[1]: Starting Nagios Remote Program Executor...
Oct 16 07:15:48 boy-oraem01.opm.gov systemd[1]: PID file /var/run/nrpe/nrpe.pid not readable (yet?) after start.
Oct 16 07:15:48 boy-oraem01.opm.gov nrpe[64219]: Starting up daemon
Oct 16 07:15:48 boy-oraem01.opm.gov systemd[1]: nrpe.service never wrote its PID file. Failing.
Oct 16 07:15:48 boy-oraem01.opm.gov systemd[1]: Failed to start Nagios Remote Program Executor.
Oct 16 07:15:48 boy-oraem01.opm.gov systemd[1]: Unit nrpe.service entered failed state.
Oct 16 07:15:48 boy-oraem01.opm.gov systemd[1]: nrpe.service failed.

Job for nrpe.service failed because a configured resource limit was exceeded. See "systemctl status nrpe.service" and "journalctl -xe" for details.

OS version:

Code: Select all

Red Hat Enterprise Linux Server release 7.5 (Maipo)
3.10.0-862.6.3.el7.x86_64

Thank you!
Jordan
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: nrpe - 3.2.1 service file issues

Post by lmiltchev »

"We recently updated our nrpe agents to 3.2.1 and it caused some issues with starting/restarting the service."

Can you describe in detail the steps you took to upgrade NRPE? Which document, guide, or tutorial did you follow?

Can you show the output of the following commands (when nrpe fails to start)?

Code: Select all

systemctl status nrpe.service
journalctl -xe

Be sure to check out our Knowledgebase for helpful articles and solutions!

Re: nrpe - 3.2.1 service file issues

Post by jenglish »

@lmiltchev

We updated nrpe with yum from the EPEL repository, e.g. "yum update nrpe".

Code: Select all

(PRO-BOY|jenglish@boy-oraem01 ~)$ systemctl status nrpe
● nrpe.service - Nagios Remote Program Executor
   Loaded: loaded (/usr/lib/systemd/system/nrpe.service; enabled; vendor preset: disabled)
   Active: failed (Result: resources) since Tue 2018-10-16 10:45:48 EDT; 21min ago
     Docs: http://www.nagios.org/documentation
  Process: 54990 ExecStart=/usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -d $NRPE_SSL_OPT (code=exited, status=0/SUCCESS)
 Main PID: 37390 (code=exited, status=0/SUCCESS)

Oct 16 10:45:48 boy-oraem01.opm.gov systemd[1]: Starting Nagios Remote Program Executor...
Oct 16 10:45:48 boy-oraem01.opm.gov nrpe[54990]: Added command[check_users]=/usr/lib64/nagios/...2$
Oct 16 10:45:48 boy-oraem01.opm.gov nrpe[54990]: Added command[check_load]=/usr/lib64/nagios/p...2$
Oct 16 10:45:48 boy-oraem01.opm.gov nrpe[54990]: Added command[check_disk]=/usr/lib64/nagios/p...3$
Oct 16 10:45:48 boy-oraem01.opm.gov systemd[1]: PID file /var/run/nrpe/nrpe.pid not readable ...rt.
Oct 16 10:45:48 boy-oraem01.opm.gov nrpe[54995]: Starting up daemon
Oct 16 10:45:48 boy-oraem01.opm.gov systemd[1]: nrpe.service never wrote its PID file. Failing.
Oct 16 10:45:48 boy-oraem01.opm.gov systemd[1]: Failed to start Nagios Remote Program Executor.
Oct 16 10:45:48 boy-oraem01.opm.gov systemd[1]: Unit nrpe.service entered failed state.
Oct 16 10:45:48 boy-oraem01.opm.gov systemd[1]: nrpe.service failed.
Hint: Some lines were ellipsized, use -l to show in full.

Checking the logs since 11:00, restarting the service (which fails), then checking the logs again:

Code: Select all

(PRO-BOY|jenglish@boy-oraem01 ~)$ journalctl -u nrpe -S 11:00
-- No entries --
(PRO-BOY|jenglish@boy-oraem01 ~)$ sudo systemctl restart nrpe
Job for nrpe.service failed because a configured resource limit was exceeded. See "systemctl status nrpe.service" and "journalctl -xe" for details.
(PRO-BOY|jenglish@boy-oraem01 ~)$ journalctl -u nrpe -S 11:00
-- Logs begin at Sat 2018-10-13 04:00:00 EDT, end at Tue 2018-10-16 11:08:33 EDT. --
Oct 16 11:08:33 boy-oraem01.opm.gov systemd[1]: Starting Nagios Remote Program Executor...
Oct 16 11:08:33 boy-oraem01.opm.gov nrpe[123230]: Added command[check_users]=/usr/lib64/nagios/plug
Oct 16 11:08:33 boy-oraem01.opm.gov nrpe[123230]: Added command[check_load]=/usr/lib64/nagios/plugi
Oct 16 11:08:33 boy-oraem01.opm.gov nrpe[123230]: Added command[check_disk]=/usr/lib64/nagios/plugi
Oct 16 11:08:33 boy-oraem01.opm.gov nrpe[123230]: Added command[check_temp]=/usr/lib64/nagios/plugi
Oct 16 11:08:33 boy-oraem01.opm.gov nrpe[123230]: Added command[check_procs]=/usr/lib64/nagios/plug
Oct 16 11:08:33 boy-oraem01.opm.gov nrpe[123230]: Added command[check_lock_age]=/usr/lib64/nagios/p
Oct 16 11:08:33 boy-oraem01.opm.gov nrpe[123230]: Added command[check_ntp_time]=/usr/lib64/nagios/p
Oct 16 11:08:33 boy-oraem01.opm.gov nrpe[123230]: Added command[check_file_age]=sudo /usr/lib64/nag
Oct 16 11:08:33 boy-oraem01.opm.gov nrpe[123230]: Added command[check_init]=/usr/lib64/nagios/plugi
Oct 16 11:08:33 boy-oraem01.opm.gov nrpe[123230]: Added command[check_swap]=/usr/lib64/nagios/plugi
Oct 16 11:08:33 boy-oraem01.opm.gov nrpe[123230]: Added command[check_generic]=/usr/lib64/nagios/pl
Oct 16 11:08:33 boy-oraem01.opm.gov nrpe[123230]: Added command[check_tcp]=/usr/lib64/nagios/plugin
Oct 16 11:08:33 boy-oraem01.opm.gov systemd[1]: PID file /var/run/nrpe/nrpe.pid not readable (yet?)
Oct 16 11:08:33 boy-oraem01.opm.gov systemd[1]: nrpe.service never wrote its PID file. Failing.
Oct 16 11:08:33 boy-oraem01.opm.gov systemd[1]: Failed to start Nagios Remote Program Executor.
Oct 16 11:08:33 boy-oraem01.opm.gov systemd[1]: Unit nrpe.service entered failed state.
Oct 16 11:08:33 boy-oraem01.opm.gov systemd[1]: nrpe.service failed.


Re: nrpe - 3.2.1 service file issues

Post by lmiltchev »

"We were able to fix this by using the old service file configurations from 3.2.0. Is this a bug? Please advise."

It's possible that this is a bug. Our developers will be looking into this. Please report the issue here:

https://bugzilla.redhat.com/

"Oct 16 10:45:48 boy-oraem01.opm.gov systemd[1]: nrpe.service never wrote its PID file. Failing."

Where is nrpe.pid located on your system? Is it in the "/var/run/nrpe" directory?

Re: nrpe - 3.2.1 service file issues

Post by jenglish »

No PID created:

Code: Select all

(PRO-BOY|jenglish@boy-oraem01 ~)$ sudo find / -name nrpe.pid
(PRO-BOY|jenglish@boy-oraem01 ~)$

I've never entered a bug before. Which would be most applicable? RedHat or Other?

Re: nrpe - 3.2.1 service file issues

Post by lmiltchev »

"No PID created:"

You were not able to start NRPE with the new config, which is why the PID file was not created. Try switching to the old config so that you can start NRPE successfully, then look for the PID file again. We need to see the location and permissions of the directory where the PID file is located.

"I've never entered a bug before. Which would be most applicable? RedHat or Other?"

You need to select:

Product: Fedora EPEL
Version: epel7
Hardware: x86_64 Linux

Re: nrpe - 3.2.1 service file issues

Post by jenglish »

No PID file here either:

Code: Select all

(PRO-BOY|jenglish@boy-adams2 ~)$ uname -r ; cat /etc/redhat-release ; sudo rpm -qa | grep nrpe
3.10.0-862.6.3.el7.x86_64
Red Hat Enterprise Linux Server release 7.5 (Maipo)
nagios-plugins-nrpe-3.2.1-6.el7.x86_64
nrpe-3.2.1-6.el7.x86_64
(PRO-BOY|jenglish@boy-adams2 ~)$ sudo systemctl cat nrpe
# /usr/lib/systemd/system/nrpe.service
[Unit]
Description=Nagios Remote Program Executor
Documentation=http://www.nagios.org/documentation
Conflicts=nrpe.socket
Requires=network.target

[Install]
WantedBy=multi-user.target

[Service]
Type=forking
User=nrpe
Group=nrpe
EnvironmentFile=/etc/sysconfig/nrpe
ExecStart=/usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -d $NRPE_SSL_OPT
(PRO-BOY|jenglish@boy-adams2 ~)$ sudo systemctl restart nrpe
(PRO-BOY|jenglish@boy-adams2 ~)$ sudo find / -name nrpe.pid
(PRO-BOY|jenglish@boy-adams2 ~)$
(PRO-BOY|jenglish@boy-adams2 ~)$ sudo systemctl is-active nrpe
active
(PRO-BOY|jenglish@boy-adams2 ~)$ ps aux | grep nrpe
nrpe       1227  0.0  0.0  44884  1440 ?        Ss   12:28   0:00 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -d
jenglish   1310  0.0  0.0 112704   980 pts/0    S+   12:29   0:00 grep --color=auto nrpe

Re: nrpe - 3.2.1 service file issues

Post by lmiltchev »

Do you see any NRPE-related errors in /var/log/messages?

Can you show the output of the following command?

Code: Select all

ls -lad /var/run/nrpe/

Re: nrpe - 3.2.1 service file issues

Post by jenglish »

I see more info in journald, but here is the output from messages:

Code: Select all

(PRO-BOY|jenglish@boy-oraem01 ~)$ sudo grep -i nrpe /var/log/messages | grep 12:
Oct 16 12:15:51 boy-oraem01 systemd: PID file /var/run/nrpe/nrpe.pid not readable (yet?) after start.
Oct 16 12:15:51 boy-oraem01 systemd: nrpe.service never wrote its PID file. Failing.
Oct 16 12:15:51 boy-oraem01 systemd: Unit nrpe.service entered failed state.
Oct 16 12:15:51 boy-oraem01 systemd: nrpe.service failed.
Oct 16 12:45:48 boy-oraem01 systemd: PID file /var/run/nrpe/nrpe.pid not readable (yet?) after start.
Oct 16 12:45:48 boy-oraem01 systemd: nrpe.service never wrote its PID file. Failing.
Oct 16 12:45:48 boy-oraem01 systemd: Unit nrpe.service entered failed state.
Oct 16 12:45:48 boy-oraem01 systemd: nrpe.service failed.

Output of the ls command:

Code: Select all

(PRO-BOY|jenglish@boy-oraem01 ~)$ ls -lad /var/run/nrpe/
drwxrwxr-x. 2 nrpe nrpe 40 Jul 24 18:37 /var/run/nrpe/

Re: nrpe - 3.2.1 service file issues

Post by lmiltchev »

I believe I know what happened. Upgrading NRPE usually leaves the existing nrpe.cfg untouched, so if nrpe.cfg specifies a different path for nrpe.pid than the new service file expects, starting NRPE fails.

Can you double-check what you have in the nrpe.cfg file:

Code: Select all

pid_file=

and make sure it matches the paths in the service file:

Code: Select all

ExecStopPost=/bin/rm -f /var/run/nrpe/nrpe.pid
PIDFile=/var/run/nrpe/nrpe.pid

Once these are identical and you reload the config, NRPE should start fine (with the "new" config).
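
The comparison described above could be scripted roughly as follows. This is only a sketch: the function names are made up for illustration, and taking the last uncommented "pid_file=" or "PIDFile=" line is an assumption about how duplicate entries are treated, not something confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this sketch) that compare the
# pid_file setting in nrpe.cfg with the PIDFile setting in the service file.

# Print the value of the last uncommented "pid_file=" line in an nrpe.cfg.
# Assumption: if the setting appears more than once, the last one is used.
nrpe_cfg_pid_file() {
    sed -n 's/^[[:space:]]*pid_file[[:space:]]*=[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p' "$1" | tail -n 1
}

# Print the value of the last "PIDFile=" line in a systemd service file.
unit_pid_file() {
    sed -n 's/^[[:space:]]*PIDFile[[:space:]]*=[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p' "$1" | tail -n 1
}

# Show both values and return 0 only if pid_file is set and equals PIDFile.
# Arguments: $1 = path to nrpe.cfg, $2 = path to the service file.
check_pid_paths() {
    cfg_pid=$(nrpe_cfg_pid_file "$1")
    unit_pid=$(unit_pid_file "$2")
    echo "nrpe.cfg pid_file: ${cfg_pid:-<not set>}"
    echo "unit PIDFile:      ${unit_pid:-<not set>}"
    [ -n "$cfg_pid" ] && [ "$cfg_pid" = "$unit_pid" ]
}
```

With the paths shown earlier in this thread, the call would be "check_pid_paths /etc/nagios/nrpe.cfg /usr/lib/systemd/system/nrpe.service". If the service file itself is edited, systemd needs a "systemctl daemon-reload" before the next start attempt.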

Let us know if this resolved your issue.