Monitoring Engine stops working
Posted: Tue Mar 28, 2017 12:10 pm
We had a problem on the VM host server that caused the / filesystem to become read-only.
After a cold restart, the Monitoring Engine stopped working.
--------------------------------------------------------------------------------------------------------------------
OS: CentOS Linux release 7.2.1511 (3.10.0-327.28.2.el7.x86_64)
Nagios XI 5.3.3 manual install
GNOME installed, no proxy, no SSL
All other components (Performance Grapher, Database Backend, etc.) show green lights
--------------------------------------------------------------------------------------------------------------------
- Ran /usr/local/nagiosxi/scripts/repair_databases.sh; it completed.
- Tried to upgrade to 5.4.3; both the manual and the automatic update failed because the Monitoring Engine is not running.
- systemctl restart nagios
Job for nagios.service failed because a configured resource limit was exceeded. See "systemctl status nagios.service" and "journalctl -xe" for details.
- systemctl status nagios
nagios.service - LSB: Starts and stops the Nagios monitoring server
Loaded: loaded (/etc/rc.d/init.d/nagios; bad; vendor preset: disabled)
Active: failed (Result: resources) since Tue 2017-03-28 12:55:58 EDT; 52s ago
Docs: man:systemd-sysv-generator(8)
Process: 62162 ExecStart=/etc/rc.d/init.d/nagios start (code=exited, status=0/SUCCESS)
Mar 28 12:55:56 appprd01nagios.corp.unifirst.com nagios[62192]: wproc: Registry request: name=Core Worker 62208;pid=62208
Mar 28 12:55:56 appprd01nagios.corp.unifirst.com nagios[62192]: wproc: Registry request: name=Core Worker 62211;pid=62211
Mar 28 12:55:56 appprd01nagios.corp.unifirst.com nagios[62192]: wproc: Registry request: name=Core Worker 62212;pid=62212
Mar 28 12:55:56 appprd01nagios.corp.unifirst.com nagios[62192]: wproc: Registry request: name=Core Worker 62213;pid=62213
Mar 28 12:55:56 appprd01nagios.corp.unifirst.com nagios[62162]: Starting nagios: done.
Mar 28 12:55:56 appprd01nagios.corp.unifirst.com systemd[1]: PID 62192 read from file /usr/local/nagios/var/nagios.lock does not exist or i...ombie.
Mar 28 12:55:58 appprd01nagios.corp.unifirst.com systemd[1]: nagios.service never wrote its PID file. Failing.
Mar 28 12:55:58 appprd01nagios.corp.unifirst.com systemd[1]: Failed to start LSB: Starts and stops the Nagios monitoring server.
Mar 28 12:55:58 appprd01nagios.corp.unifirst.com systemd[1]: Unit nagios.service entered failed state.
Mar 28 12:55:58 appprd01nagios.corp.unifirst.com systemd[1]: nagios.service failed.
- /usr/local/nagios/var/nagios.log
[1490720156] ndomod: NDOMOD 2.0.0 (02-28-2014) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
[1490720156] ndomod: I've been compiled with support for revision 402 of the internal Nagios object structures, but the Nagios daemon is currently using revision 403. I'm going to unload so I don't cause any problems...
[1490720156] Error: Function nebmodule_init() in module '/usr/local/nagios/bin/ndomod.o' returned an error. Module will be unloaded.
[1490720156] Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
[1490720156] Error: Failed to load module '/usr/local/nagios/bin/ndomod.o'.
[1490720156] Error: Module loading failed. Aborting.
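The ndomod line in the log above appears to be the key message: the module reports it was built for object-structure revision 402 while the running daemon uses 403. As a rough sketch only (the function name and the regular expression are my own, written against the exact wording of that one log line, and are not part of Nagios), the two revision numbers and the timestamp can be pulled out of nagios.log like this:

```python
import re
from datetime import datetime, timezone

# Hypothetical pattern, matched against the ndomod message quoted in this post.
# It captures the epoch timestamp and the two revision numbers.
REVISION_RE = re.compile(
    r"\[(\d+)\] ndomod: .*compiled with support for revision (\d+) "
    r".*currently using revision (\d+)"
)

def find_revision_mismatch(log_text):
    """Return (utc_time, module_rev, daemon_rev) for the first matching line, or None."""
    for line in log_text.splitlines():
        m = REVISION_RE.search(line)
        if m:
            when = datetime.fromtimestamp(int(m.group(1)), tz=timezone.utc)
            return when, int(m.group(2)), int(m.group(3))
    return None

# The line below is copied from the nagios.log excerpt above.
sample = (
    "[1490720156] ndomod: I've been compiled with support for revision 402 "
    "of the internal Nagios object structures, but the Nagios daemon is "
    "currently using revision 403. I'm going to unload so I don't cause any problems..."
)
result = find_revision_mismatch(sample)
if result is not None:
    when, module_rev, daemon_rev = result
    print(when.isoformat(), "ndomod built for", module_rev, "daemon uses", daemon_rev)
```

To run it against the real file, the contents of /usr/local/nagios/var/nagios.log would be read and passed in as log_text. The timestamp is converted to UTC, so it will be offset from the EDT times that systemd prints.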