Nagios 4.2.2 + Plugins 2.1.3 + NRPE v3 on CentOS 7

Posted: Mon Nov 14, 2016 7:55 am
by wice22
Hi, I'm trying to install Nagios 4.2.2 + Plugins 2.1.3 + NRPE v3.

# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core) Minimal Fresh install
# sestatus
SELinux status: disabled

Following your instructions here: https://support.nagios.com/kb/article.p ... ategory=58
and
here: https://support.nagios.com/kb/article.php?id=515

Distributions:
Nagios: https://github.com/NagiosEnterprises/na ... 2.2.tar.gz
Nagios Plugins: https://github.com/nagios-plugins/nagio ... 1.3.tar.gz
NRPE: https://github.com/NagiosEnterprises/nr ... 3.0.tar.gz

After the installation is completed I'm testing with:
netstat -at | grep nrpe
/usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Everything is working OK: no errors detected, tests are OK, the web interface works fine, monitors are OK.

BUT after a hard reboot of the server I'm getting the following:

WEB Interface: Error: Could not read object configuration data!

# systemctl restart nagios.service

Job for nagios.service failed because the control process exited with error code. See "systemctl status nagios.service" and "journalctl -xe" for details.
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Nagios Core 4.2.2
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 10-24-2016
License: GPL

Website: https://www.nagios.org
Reading configuration data...
Error in configuration file '/usr/local/nagios/etc/nagios.cfg' - Line 452 (Check result path '/usr/local/nagios/var/spool/checkresults' is not a valid directory)
   Error processing main config file!
Not sure why that should happen, but OK.

I decided to create the directory /usr/local/nagios/var/spool/checkresults, since it does not exist:

# install -d -m 744 -o nagios -g nagios /usr/local/nagios/var/spool/checkresults

# ls -ld /usr/local/nagios/var/spool/checkresults

drwxr--r-- 2 nagios nagios 6 Nov 14 14:52 /usr/local/nagios/var/spool/checkresults
# systemctl restart nrpe.service
# systemctl restart nagios.service
# systemctl restart httpd

Now a strange thing is happening: Nagios kind of starts, or at least does not show any issues when restarting:

# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Nagios Core 4.2.2
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 10-24-2016
License: GPL

Website: https://www.nagios.org
Reading configuration data...
   Read main config file okay...
   Read object config files okay...

Running pre-flight check on configuration data...

Checking objects...
        Checked 8 services.
        Checked 1 hosts.
        Checked 1 host groups.
        Checked 0 service groups.
        Checked 1 contacts.
        Checked 1 contact groups.
        Checked 24 commands.
        Checked 5 time periods.
        Checked 0 host escalations.
        Checked 0 service escalations.
Checking for circular paths...
        Checked 1 hosts
        Checked 0 service dependencies
        Checked 0 host dependencies
        Checked 5 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
and

# systemctl status nagios

● nagios.service - LSB: Starts and stops the Nagios monitoring server
   Loaded: loaded (/etc/rc.d/init.d/nagios)
   Active: active (exited) since Mon 2016-11-14 14:02:47 GMT; 24min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 2533 ExecStop=/etc/rc.d/init.d/nagios stop (code=exited, status=0/SUCCESS)
  Process: 2538 ExecStart=/etc/rc.d/init.d/nagios start (code=exited, status=0/SUCCESS)

Nov 14 14:02:47 localhost.localdomain systemd[1]: Starting LSB: Starts and stops the Nagios monitoring server...
Nov 14 14:02:47 localhost.localdomain nagios[2538]: Starting nagios: done.
Nov 14 14:02:47 localhost.localdomain systemd[1]: Started LSB: Starts and stops the Nagios monitoring server.
and

# /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1

NRPE vnrpe-3.0

BUT
# journalctl -xe
Nov 14 14:02:47 localhost.localdomain nagios[2558]: Nagios 4.2.2 starting... (PID=2558)
Nov 14 14:02:47 localhost.localdomain nagios[2558]: Local time is Mon Nov 14 14:02:47 GMT 2016
Nov 14 14:02:47 localhost.localdomain nagios[2558]: LOG VERSION: 2.0
Nov 14 14:02:47 localhost.localdomain nagios[2558]: qh: Failed to init socket '/usr/local/nagios/var/rw/nagios.qh'. bind() failed: No such file or directory
Nov 14 14:02:47 localhost.localdomain nagios[2558]: Error: Failed to initialize query handler. Aborting

and the web interface is obviously down too:
WEB Interface: Error: Could not read object configuration data!

Could you please clarify what the issue is, since I'm following the installation guide and using the latest available distribution?
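
Both errors above point at paths under /usr/local/nagios/var: the check result directory is reported as missing, and bind() returning "No such file or directory" for nagios.qh suggests that its parent directory, /usr/local/nagios/var/rw, is missing as well. A minimal sketch for checking this after a reboot follows; the function name check_dirs is illustrative and not part of Nagios.

```shell
#!/bin/sh
# check_dirs: for each path given, report whether it exists as a directory.
# Prints "ok <ls -ld output>" for existing directories, "missing <path>" otherwise.
check_dirs() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            echo "ok $(ls -ld "$dir")"
        else
            echo "missing $dir"
        fi
    done
}

# Example call with the two paths named in the errors (run as root or nagios):
# check_dirs /usr/local/nagios/var/spool/checkresults /usr/local/nagios/var/rw
```

A "missing" line for either path right after a reboot would match the errors shown in the journal.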

Re: Nagios 4.2.2 + Plugins 2.1.3 + NRPE v3 on CentOS 7

Posted: Mon Nov 14, 2016 11:18 am
by dwhitfield
Could you post the output of history, scrubbing any credentials?

Re: Nagios 4.2.2 + Plugins 2.1.3 + NRPE v3 on CentOS 7

Posted: Mon Nov 14, 2016 12:15 pm
by rkennedy
To add to what @dwhitfield mentioned - what guides did you follow for the installs?

Re: Nagios 4.2.2 + Plugins 2.1.3 + NRPE v3 on CentOS 7

Posted: Mon Nov 14, 2016 5:48 pm
by wice22
rkennedy wrote:To add to what @dwhitfield mentioned - what guides did you follow for the installs?
As I mentioned in the problem description, I followed:

For core installation I followed instructions here : https://support.nagios.com/kb/article.p ... ategory=58
and
For NRPE 3 from here: https://support.nagios.com/kb/article.php?id=515

dwhitfield wrote:Could you post the output of history, scrubbing any credentials?
I reinstalled once again, to make sure I'm not making some minor mistake somewhere. Here is the result before the restart:

# /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
NRPE vnrpe-3.0

# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Nagios Core 4.2.2
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 10-24-2016
License: GPL

Website: https://www.nagios.org
Reading configuration data...

   Read main config file okay...
   Read object config files okay...

Running pre-flight check on configuration data...

Checking objects...
        Checked 8 services.
        Checked 1 hosts.
        Checked 1 host groups.
        Checked 0 service groups.
        Checked 1 contacts.
        Checked 1 contact groups.
        Checked 24 commands.
        Checked 5 time periods.
        Checked 0 host escalations.
        Checked 0 service escalations.
Checking for circular paths...
        Checked 1 hosts
        Checked 0 service dependencies
        Checked 0 host dependencies
        Checked 5 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
journalctl -xe

-- The start-up result is done.
Nov 14 23:34:45 localhost.localdomain nagios[30692]: Nagios 4.2.2 starting... (PID=30692)
Nov 14 23:34:45 localhost.localdomain nagios[30692]: Local time is Mon Nov 14 23:34:45 GMT 2016
Nov 14 23:34:45 localhost.localdomain nagios[30692]: LOG VERSION: 2.0
Nov 14 23:34:45 localhost.localdomain nagios[30692]: qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
Nov 14 23:34:45 localhost.localdomain nagios[30692]: qh: core query handler registered
Nov 14 23:34:45 localhost.localdomain nagios[30692]: nerd: Channel hostchecks registered successfully
Nov 14 23:34:45 localhost.localdomain nagios[30692]: nerd: Channel servicechecks registered successfully
Nov 14 23:34:45 localhost.localdomain nagios[30692]: nerd: Channel opathchecks registered successfully
Nov 14 23:34:45 localhost.localdomain nagios[30692]: nerd: Fully initialized and ready to rock!
Nov 14 23:34:45 localhost.localdomain nagios[30692]: wproc: Successfully registered manager as @wproc with query handler
Nov 14 23:34:45 localhost.localdomain nagios[30692]: wproc: Registry request: name=Core Worker 30694;pid=30694
Nov 14 23:34:45 localhost.localdomain nagios[30692]: wproc: Registry request: name=Core Worker 30695;pid=30695
Nov 14 23:34:45 localhost.localdomain nagios[30692]: wproc: Registry request: name=Core Worker 30696;pid=30696
Nov 14 23:34:45 localhost.localdomain nagios[30692]: wproc: Registry request: name=Core Worker 30697;pid=30697
Nov 14 23:34:45 localhost.localdomain nagios[30692]: Successfully launched command file worker with pid 30698
And the web interface works fine.

And after a hard reboot:

Web Interface : Whoops!
Error: Could not read object configuration data!


NRPE seems to be working OK

# /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
NRPE vnrpe-3.0
Nagios Down:

# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Nagios Core 4.2.2
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 10-24-2016
License: GPL

Website: https://www.nagios.org
Reading configuration data...
Error in configuration file '/usr/local/nagios/etc/nagios.cfg' - Line 452 (Check result path '/usr/local/nagios/var/spool/checkresults' is not a valid directory)
   Error processing main config file!
journalctl -xe

-- Unit nagios.service has begun starting up.
Nov 14 23:42:39 localhost.localdomain nagios[2311]: Starting nagios:
Nov 14 23:42:39 localhost.localdomain nagios[2311]: Nagios Core 4.2.2
Nov 14 23:42:39 localhost.localdomain nagios[2311]: Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Nov 14 23:42:39 localhost.localdomain nagios[2311]: Copyright (c) 1999-2009 Ethan Galstad
Nov 14 23:42:39 localhost.localdomain nagios[2311]: Last Modified: 10-24-2016
Nov 14 23:42:39 localhost.localdomain nagios[2311]: License: GPL
Nov 14 23:42:39 localhost.localdomain nagios[2311]: Website: https://www.nagios.org
Nov 14 23:42:39 localhost.localdomain nagios[2311]: Reading configuration data...
Nov 14 23:42:39 localhost.localdomain nagios[2311]: Error in configuration file '/usr/local/nagios/etc/nagios.cfg' - Line 452 (Check result path '/u
Nov 14 23:42:39 localhost.localdomain nagios[2311]: Error processing main config file!
Nov 14 23:42:39 localhost.localdomain polkitd[893]: Unregistered Authentication Agent for unix-process:2306:31530 (system bus name :1.14, object pat
Nov 14 23:42:39 localhost.localdomain systemd[1]: nagios.service: control process exited, code=exited status=8
Nov 14 23:42:39 localhost.localdomain systemd[1]: Failed to start LSB: Starts and stops the Nagios monitoring server.
-- Subject: Unit nagios.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit nagios.service has failed.
--
-- The result is failed.
Nov 14 23:42:39 localhost.localdomain systemd[1]: Unit nagios.service entered failed state.
Nov 14 23:42:39 localhost.localdomain systemd[1]: nagios.service failed.
Creating missing Directory

install -d -m 744 -o nagios -g nagios /usr/local/nagios/var/spool/checkresults
Restarting Nagios

# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Nagios Core 4.2.2
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 10-24-2016
License: GPL

Website: https://www.nagios.org
Reading configuration data...
   Read main config file okay...
   Read object config files okay...

Running pre-flight check on configuration data...

Checking objects...
        Checked 8 services.
        Checked 1 hosts.
        Checked 1 host groups.
        Checked 0 service groups.
        Checked 1 contacts.
        Checked 1 contact groups.
        Checked 24 commands.
        Checked 5 time periods.
        Checked 0 host escalations.
        Checked 0 service escalations.
Checking for circular paths...
        Checked 1 hosts
        Checked 0 service dependencies
        Checked 0 host dependencies
        Checked 5 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
But the web interface is still down and

journalctl -xe

-- The start-up result is done.
Nov 14 23:48:32 localhost.localdomain nagios[2445]: Nagios 4.2.2 starting... (PID=2445)
Nov 14 23:48:32 localhost.localdomain nagios[2445]: Local time is Mon Nov 14 23:48:32 GMT 2016
Nov 14 23:48:32 localhost.localdomain nagios[2445]: LOG VERSION: 2.0
Nov 14 23:48:32 localhost.localdomain nagios[2445]: qh: Failed to init socket '/usr/local/nagios/var/rw/nagios.qh'. bind() failed: No such file or d
Nov 14 23:48:32 localhost.localdomain nagios[2445]: Error: Failed to initialize query handler. Aborting
History file attached too.

Regards

Re: Nagios 4.2.2 + Plugins 2.1.3 + NRPE v3 on CentOS 7

Posted: Mon Nov 14, 2016 7:02 pm
by wice22
OK, I noticed an even stranger thing :)

I managed to fix this with a Nagios policy like this:

echo "
module nagios-socket 1.0;
require {
type nagios_t;
type nagios_log_t;
class sock_file { write create unlink };
class unix_stream_socket connectto;
}
allow nagios_t nagios_log_t:sock_file { write create unlink };
allow nagios_t self:unix_stream_socket connectto;
" > /rpms/nagios-socket.te

# yum install policycoreutils-python -y
# cd /rpms;checkmodule -M -m -o nagios-socket.mod nagios-socket.te
# semodule_package -o nagios-socket.pp -m nagios-socket.mod 
# semodule -i nagios-socket.pp
# install -d -m 744 -o nagios -g nagios /usr/local/nagios/var/rw/
# systemctl restart nagios && systemctl status nagios

But :) after the hard reboot, the directories
/usr/local/nagios/var/rw/
and
/usr/local/nagios/var/spool/checkresults
were wiped out again :)

Anyway, the script fixing this (the dirty way) is attached, but I think a product like Nagios shouldn't have such a bug. It seems it wasn't tested on CentOS 7 at all...

File attached
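
The attached script itself is not reproduced in this thread. As a rough illustration only, a workaround of this kind might look like the sketch below, assuming it does nothing more than recreate the two directories with the same install command used earlier. The function name recreate_runtime_dirs and the variables NAGIOS_USER, NAGIOS_GROUP and NAGIOS_DIR_MODE are hypothetical.

```shell
#!/bin/sh
# Hypothetical workaround sketch; NOT the script attached to the post.
# Recreates the two runtime directories under a Nagios prefix if they are missing.
# NAGIOS_USER, NAGIOS_GROUP and NAGIOS_DIR_MODE are illustrative variable names.
recreate_runtime_dirs() {
    prefix=${1:-/usr/local/nagios}
    owner=${NAGIOS_USER:-nagios}
    group=${NAGIOS_GROUP:-nagios}
    mode=${NAGIOS_DIR_MODE:-744}   # 744 mirrors the install command used in this thread
    for dir in "$prefix/var/spool/checkresults" "$prefix/var/rw"; do
        if [ ! -d "$dir" ]; then
            install -d -m "$mode" -o "$owner" -g "$group" "$dir" || return 1
            echo "created $dir"
        fi
    done
}

# Example (as root, before starting the service):
# recreate_runtime_dirs && systemctl restart nagios
```

This only treats the symptom; it does not explain why the directories disappear at boot.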

Cheers

Re: Nagios 4.2.2 + Plugins 2.1.3 + NRPE v3 on CentOS 7

Posted: Tue Nov 15, 2016 10:24 am
by dwhitfield
Thanks for posting your fix! It's super-valuable since everyone has a slightly different environment. I can tell you that I have 4 CentOS7 VMs currently, but we can't catch everything.

Is it ok for us to go ahead and lock this thread?

Re: Nagios 4.2.2 + Plugins 2.1.3 + NRPE v3 on CentOS 7

Posted: Tue Nov 15, 2016 11:30 am
by wice22
Yes, sure,

but a nice built-in fix from you guys would be appreciated.

cheers

Re: Nagios 4.2.2 + Plugins 2.1.3 + NRPE v3 on CentOS 7

Posted: Tue Nov 15, 2016 1:33 pm
by avandemore
Is /usr/local/nagios/var/* a symlink to tmpfs or something? This would be a non-standard installation and unsupported.
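
One way to answer this question on the affected host is to resolve the path and look at the filesystem type behind it. A minimal sketch, assuming GNU coreutils (readlink -f, stat -f) as shipped with CentOS 7; show_backing_fs is an illustrative name:

```shell
#!/bin/sh
# show_backing_fs: print the resolved path and the filesystem type behind a directory.
# Assumes GNU coreutils: readlink -f resolves symlinks, stat -f -c %T prints the fs type.
show_backing_fs() {
    path=${1:-/usr/local/nagios/var}
    real=$(readlink -f "$path") || return 1
    fstype=$(stat -f -c %T "$real") || return 1
    echo "$path -> $real ($fstype)"
}

# Example:
# show_backing_fs /usr/local/nagios/var
```

A type of tmpfs, or a resolved path outside /usr/local/nagios, would mean the contents do not necessarily survive a reboot.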

Re: Nagios 4.2.2 + Plugins 2.1.3 + NRPE v3 on CentOS 7

Posted: Tue Nov 15, 2016 4:58 pm
by Box293
I suspect this is related to this issue:
https://github.com/NagiosEnterprises/na ... issues/153

Upgrade NRPE v3 to the latest version (3.0.1).
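
The linked issue is not quoted in this thread, so the following is only one possible way to investigate. Assuming, without confirmation here, that the directories are being removed at boot by a systemd-tmpfiles rule, a sketch like this lists any rule that mentions the Nagios prefix. The function name find_tmpfiles_rules is illustrative; the three default directories are the standard systemd-tmpfiles configuration locations.

```shell
#!/bin/sh
# find_tmpfiles_rules: list tmpfiles.d rules that mention a given path.
# $1 = path to look for; remaining args = directories to search
# (defaults to the standard systemd-tmpfiles configuration directories).
find_tmpfiles_rules() {
    needle=${1:-/usr/local/nagios}
    [ "$#" -gt 0 ] && shift
    [ "$#" -gt 0 ] || set -- /etc/tmpfiles.d /run/tmpfiles.d /usr/lib/tmpfiles.d
    for confdir in "$@"; do
        [ -d "$confdir" ] || continue
        # -F: match the path as a fixed string, -H: always print the file name
        grep -FH -- "$needle" "$confdir"/*.conf 2>/dev/null
    done
    return 0
}

# Example:
# find_tmpfiles_rules /usr/local/nagios
```

Any matching line names the file that defines the rule, which can then be compared before and after the suggested upgrade.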

Re: Nagios 4.2.2 + Plugins 2.1.3 + NRPE v3 on CentOS 7

Posted: Thu Nov 17, 2016 9:55 am
by wice22
avandemore wrote:Is /usr/local/nagios/var/* a symlink to tmpfs or something? This would be a non-standard installation and unsupported.
Official instructions here : https://support.nagios.com/kb/article.p ... ategory=58
and
Official instructions here : https://support.nagios.com/kb/article.php?id=515

Which of these official manuals leads to a "non-standard installation"? With all due respect, did you even go through the issue?

Did you test it before giving an answer?
Did you see the attached history?
Did you check the links to the manuals I followed?
Anything at all???

Box293 wrote:I suspect this is related to this issue:
https://github.com/NagiosEnterprises/na ... issues/153

Upgrade NRPE v3 to the latest version (3.0.1).
The instructions from Nagios clearly state to use 3.0 (and I expect it to work!), otherwise they may declare it a "non-standard" installation, as Mr "avandemore" tried to do without even going through the issue!
I would suggest you lads test your own product before putting it on the market!

Regards