Monitoring Engine stops working
-
tongchenkuo
- Posts: 7
- Joined: Mon Sep 19, 2016 11:41 am
Monitoring Engine stops working
We had a VM host server problem that caused the root filesystem (/) to become read-only.
After a cold restart, the Monitoring Engine stopped working.
--------------------------------------------------------------------------------------------------------------------
OS: CentOS Linux release 7.2.1511 (3.10.0-327.28.2.el7.x86_64)
Nagios XI 5.3.3 manual install
Gnome installed, no proxy, no SSL
All other components (Performance Grapher, Database Backend, etc.) are all green lights
--------------------------------------------------------------------------------------------------------------------
- Ran /usr/local/nagiosxi/scripts/repair_databases.sh; it completed.
- Tried to upgrade to 5.4.3; both the manual and the automatic update failed because the Monitoring Engine is not working.
- systemctl restart nagios
Job for nagios.service failed because a configured resource limit was exceeded. See "systemctl status nagios.service" and "journalctl -xe" for details.
- systemctl status nagios
nagios.service - LSB: Starts and stops the Nagios monitoring server
Loaded: loaded (/etc/rc.d/init.d/nagios; bad; vendor preset: disabled)
Active: failed (Result: resources) since Tue 2017-03-28 12:55:58 EDT; 52s ago
Docs: man:systemd-sysv-generator(8)
Process: 62162 ExecStart=/etc/rc.d/init.d/nagios start (code=exited, status=0/SUCCESS)
Mar 28 12:55:56 appprd01nagios.corp.unifirst.com nagios[62192]: wproc: Registry request: name=Core Worker 62208;pid=62208
Mar 28 12:55:56 appprd01nagios.corp.unifirst.com nagios[62192]: wproc: Registry request: name=Core Worker 62211;pid=62211
Mar 28 12:55:56 appprd01nagios.corp.unifirst.com nagios[62192]: wproc: Registry request: name=Core Worker 62212;pid=62212
Mar 28 12:55:56 appprd01nagios.corp.unifirst.com nagios[62192]: wproc: Registry request: name=Core Worker 62213;pid=62213
Mar 28 12:55:56 appprd01nagios.corp.unifirst.com nagios[62162]: Starting nagios: done.
Mar 28 12:55:56 appprd01nagios.corp.unifirst.com systemd[1]: PID 62192 read from file /usr/local/nagios/var/nagios.lock does not exist or i...ombie.
Mar 28 12:55:58 appprd01nagios.corp.unifirst.com systemd[1]: nagios.service never wrote its PID file. Failing.
Mar 28 12:55:58 appprd01nagios.corp.unifirst.com systemd[1]: Failed to start LSB: Starts and stops the Nagios monitoring server.
Mar 28 12:55:58 appprd01nagios.corp.unifirst.com systemd[1]: Unit nagios.service entered failed state.
Mar 28 12:55:58 appprd01nagios.corp.unifirst.com systemd[1]: nagios.service failed.
- /usr/local/nagios/var/nagios.log
[1490720156] ndomod: NDOMOD 2.0.0 (02-28-2014) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
[1490720156] ndomod: I've been compiled with support for revision 402 of the internal Nagios object structures, but the Nagios daemon is currently using revision 403. I'm going to unload so I don't cause any problems...
[1490720156] Error: Function nebmodule_init() in module '/usr/local/nagios/bin/ndomod.o' returned an error. Module will be unloaded.
[1490720156] Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
[1490720156] Error: Failed to load module '/usr/local/nagios/bin/ndomod.o'.
[1490720156] Error: Module loading failed. Aborting.
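The ndomod message in the log above names two revision numbers, and they are the quickest clue that the broker module and the Nagios daemon were built against different versions. As a minimal sketch (the function name `ndomod_revisions` is made up for illustration, and it assumes the message wording is exactly as shown in the log above), the two numbers can be pulled out of nagios.log like this:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Nagios: print the revision that ndomod was
# compiled for and the revision the running daemon reports, one line per
# mismatch message found in the given log file.
ndomod_revisions() {
    local log_file="$1"
    # Assumes the message wording seen in the log excerpt above.
    sed -n 's/.*ndomod: .*support for revision \([0-9][0-9]*\) of .* currently using revision \([0-9][0-9]*\)\..*/compiled=\1 running=\2/p' "$log_file"
}

# Example call, using the log path from the post above:
# ndomod_revisions /usr/local/nagios/var/nagios.log
```

If the two numbers differ, ndomod refuses to load and Nagios aborts at startup, which matches the "Module loading failed. Aborting." line.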
-
avandemore
- Posts: 1597
- Joined: Tue Sep 27, 2016 4:57 pm
Re: Monitoring Engine stops working
For some reason it looks like you have a mismatch in ndo vs core. Can you try upgrading to the latest?
https://assets.nagios.com/downloads/nag ... nstall.pdf
https://assets.nagios.com/downloads/nag ... nstall.pdf
Previous Nagios employee
-
tongchenkuo
- Posts: 7
- Joined: Mon Sep 19, 2016 11:41 am
Re: Monitoring Engine stops working
I have tried to upgrade many times but still get this error message:
make[1]: Leaving directory `/tmp/nagiosxi/subcomponents/nagioscore/nagios-4.2.4'
Warning: nagios.service changed on disk. Run 'systemctl daemon-reload' to reload units.
Job for nagios.service failed because a configured resource limit was exceeded. See "systemctl status nagios.service" and "journalctl -xe" for details.
There is no error message in the upgrade.log file; the last few lines in the log file are:
*** Main program, CGIs and HTML files installed ***
You can continue with installing Nagios as follows (type 'make'
without any arguments for a list of all possible options):
make install-init
- This installs the init script in /etc/rc.d/init.d
make install-commandmode
- This installs and configures permissions on the
directory for holding the external command file
make install-config
- This installs sample config files in /usr/local/nagios/etc
make[1]: Leaving directory `/tmp/nagiosxi/subcomponents/nagioscore/nagios-4.2.4'
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Monitoring Engine stops working
I've not seen this before, but let's try forcing the upgrade of ndoutils to get that fixed first:
cd /tmp/nagiosxi/subcomponents/ndoutils
./upgrade
Then let's try starting nagios.
If it starts, we can try continuing the upgrade:
cd /tmp/nagiosxi
./upgrade
-
tongchenkuo
- Posts: 7
- Joined: Mon Sep 19, 2016 11:41 am
Re: Monitoring Engine stops working
The ndoutils upgrade failed:
cd /tmp/nagiosxi/subcomponents/ndoutils
./upgrade
*** Configuration summary for ndoutils 2.1.2 11-14-2016 ***:
General Options:
-------------------------
NDO2DB user: nagios
NDO2DB group: nagios
NDO2DB tcp port: 5668
Review the options above for accuracy. If they look
okay, type 'make all' to compile the NDO utilities,
or type 'make' to get a list of make options.
cd ./src && make
make[1]: Entering directory `/tmp/nagiosxi/subcomponents/ndoutils/ndoutils-2.1.2/src'
gcc -fPIC -fPIC -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -c -o io.o io.c
gcc -fPIC -fPIC -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -c -o utils.o utils.c
gcc -fPIC -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -o file2sock file2sock.c io.o utils.o -lsystemd -lm -lnsl
gcc -fPIC -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -o log2ndo log2ndo.c io.o utils.o -lsystemd -lm -lnsl
make ndo2db-2x
make[2]: Entering directory `/tmp/nagiosxi/subcomponents/ndoutils/ndoutils-2.1.2/src'
gcc -fPIC -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -c -o db.o db.c
gcc -fPIC -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -D BUILD_NAGIOS_2X -c -o dbhandlers-2x.o dbhandlers.c
gcc -fPIC -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -D BUILD_NAGIOS_2X -o ndo2db-2x queue.c ndo2db.c dbhandlers-2x.o io.o utils.o db.o -lsystemd -lnsl -L/usr/lib64/mysql -lmysqlclient -lpthread -lz -lm -lssl -lcrypto -ldl -lm
ndo2db.c:44:31: fatal error: systemd/sd_daemon.h: No such file or directory
#include <systemd/sd_daemon.h>
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Monitoring Engine stops working
Please run the following:
yum install systemd-devel -y
Then try the above again.
-
jfrickson
Re: Monitoring Engine stops working
Check your ndo2db.service and nagios.service files -- probably in a directory something like /usr/lib/systemd/system/. There may be an entry in there that says either ProtectSystem=yes or ProtectSystem=full. If there is, either delete the line or set it to ProtectSystem=no. Systemd recently enabled those options and caused problems for quite a few systems.
I don't know why you're getting the fatal error: systemd/sd_daemon.h: No such file or directory error. You could try commenting it out and see if that works.
EDIT: Better yet, #undef HAVE_SYSTEMD in include/config.h
EDIT2: Or take Scott's suggestion
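For anyone who wants to check the ProtectSystem suggestion quickly, here is a minimal sketch. The function name `protectsystem_lines` is hypothetical, and the unit file locations in the example are only the guess given above; on a system where nagios is started from an LSB init script, as in the original post, there may be no nagios.service file on disk at all, so missing files are silently skipped.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print file:line:content for every ProtectSystem=
# setting found in the given unit files. Prints nothing if there is none.
protectsystem_lines() {
    grep -HnE '^[[:space:]]*ProtectSystem[[:space:]]*=' "$@" 2>/dev/null
}

# Example call (paths are an assumption; adjust to where your units live):
# protectsystem_lines /usr/lib/systemd/system/nagios.service \
#                     /usr/lib/systemd/system/ndo2db.service
```

Any line it prints with a value of yes or full is a candidate for the change described above.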
-
tongchenkuo
- Posts: 7
- Joined: Mon Sep 19, 2016 11:41 am
Re: Monitoring Engine stops working
The problem is solved.
It looks like there is a typo in the 5.4.3 upgrade program.
Following Scott's instructions,
cd /tmp/nagiosxi/subcomponents/ndoutils
./upgrade
I got the error: ndo2db.c:44:31: fatal error: systemd/sd_daemon.h: No such file or directory
Then I found that systemd/sd_daemon.h should be systemd/sd-daemon.h.
So what I did was:
1. cp /usr/include/systemd/sd-daemon.h /usr/include/systemd/sd_daemon.h
2. cd /tmp/nagiosxi/subcomponents/ndoutils
3. ./upgrade
The Monitoring Engine then started.
I continued the 5.4.3 upgrade without problems:
cd /tmp/nagiosxi
./upgrade
Thanks everyone
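Before copying the header as described above, it may help to confirm which spelling is actually on disk. This is only a sketch: the function name `sd_daemon_header_status` is invented for illustration, and the default directory is the one used in the cp command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether each spelling of the header exists in
# the given include directory (default: the directory from the cp command).
sd_daemon_header_status() {
    local include_dir="${1:-/usr/include/systemd}"
    local name
    for name in sd-daemon.h sd_daemon.h; do
        if [ -e "$include_dir/$name" ]; then
            echo "$name: present"
        else
            echo "$name: missing"
        fi
    done
}

# Example call:
# sd_daemon_header_status /usr/include/systemd
```

If sd-daemon.h is reported missing as well, the copy has nothing to copy from, and installing systemd-devel as suggested earlier in the thread would be the first step.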
Re: Monitoring Engine stops working
Glad to hear : ) I assume we can close the thread at this point?
-
tongchenkuo
- Posts: 7
- Joined: Mon Sep 19, 2016 11:41 am
Re: Monitoring Engine stops working
Yes, this thread can be closed. Thanks,