Error starting Nagios after upgrade

rahul334481
Posts: 5
Joined: Mon Oct 23, 2017 1:56 pm

Error starting Nagios after upgrade

Post by rahul334481 »

Please find the journalctl output below:

Code:

Oct 24 10:29:17 nagios-host systemd[1]: nagios.service start operation timed out. Terminating.
Oct 24 10:29:17 nagios-host nagios[1180]: Caught SIGTERM, shutting down...
Oct 24 10:29:17 nagios-host nagios[1057]: Caught SIGTERM, shutting down...
Oct 24 10:29:17 nagios-host nagios[1057]: wproc: Socket to worker Core Worker 1065 broken, removing
Oct 24 10:29:17 nagios-host nagios[1057]: wproc: 'Core Worker 1066' seems to be choked. ret = -1; bufsize = 126: written = 0; errno = 32 (Broken pipe) 
Oct 24 10:29:17 nagios-host nagios[1057]: wproc: 'Core Worker 1068' seems to be choked. ret = -1; bufsize = 126: written = 0; errno = 32 (Broken pipe) 
Oct 24 10:29:17 nagios-host nagios[1057]: wproc: Socket to worker Core Worker 1066 broken, removing
Oct 24 10:29:17 nagios-host nagios[1057]: wproc: 'Core Worker 1067' seems to be choked. ret = -1; bufsize = 124: written = 0; errno = 32 (Broken pipe) 
Oct 24 10:29:17 nagios-host nagios[1057]: wproc: 'Core Worker 1068' seems to be choked. ret = -1; bufsize = 138: written = 0; errno = 32 (Broken pipe) 
Oct 24 10:29:17 nagios-host nagios[1057]: wproc: Socket to worker Core Worker 1067 broken, removing
Oct 24 10:29:17 nagios-host nagios[1057]: wproc: 'Core Worker 1068' seems to be choked. ret = -1; bufsize = 126: written = 0; errno = 32 (Broken pipe) 
Oct 24 10:29:17 nagios-host nagios[1057]: wproc: 'Core Worker 1068' seems to be choked. ret = -1; bufsize = 126: written = 0; errno = 32 (Broken pipe) 
Oct 24 10:29:17 nagios-host nagios[1057]: wproc: Socket to worker Core Worker 1068 broken, removing
Oct 24 10:29:17 nagios-host nagios[1057]: Successfully shutdown... (PID=1057)
Oct 24 10:29:17 nagios-host nagios[1057]: Event broker module 'NERD' deinitialized successfully. 
Oct 24 10:29:17 nagios-host systemd[1]: Failed to start Nagios Network Monitoring.
-- Subject: Unit nagios.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit nagios.service has failed.
-- 
-- The result is failed.
Oct 24 10:29:17 nagios-host systemd[1]: Unit nagios.service entered failed state. 
Oct 24 10:29:17 nagios-host systemd[1]: nagios.service failed.

Version: Nagios Core 4.3.2

I am unable to start Nagios after upgrading from 4.1 to 4.3.
dwasswa

Re: Error starting Nagios after upgrade

Post by dwasswa »

Hi

Let's first verify your Nagios configuration by running the following command and posting the output:

Code:

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Let's also check the object cache file by running the command below:

Code:

cat /usr/local/nagios/var/objects.cache

The object_cache_file directive specifies the file in which a cached copy of the object definitions is stored. The cache file is (re)created every time Nagios is (re)started and is used by the CGIs. It is intended to speed up config file caching in the CGIs and to allow you to edit the source object config files while Nagios is running without affecting the output displayed in the CGIs.
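
On installs that do not use the /usr/local/nagios prefix, the cache file may live somewhere else. A minimal sketch for looking up where a given install keeps it, assuming a POSIX shell; `cfg_value` is a hypothetical helper name and the default config path in it is an assumption:

```shell
# Print the value(s) of one directive from a Nagios main config file.
# Usage: cfg_value DIRECTIVE [CONFIG_FILE]
cfg_value() {
  directive=$1
  # Assumed default path; source installs often use
  # /usr/local/nagios/etc/nagios.cfg instead.
  cfg=${2:-/etc/nagios/nagios.cfg}
  # Match "directive=value" at the start of a line (commented-out lines
  # begin with "#" and therefore do not match) and print only the value.
  sed -n "s/^[[:space:]]*${directive}[[:space:]]*=[[:space:]]*//p" "$cfg"
}
```

For example, `cat "$(cfg_value object_cache_file /etc/nagios/nagios.cfg)"` prints the cache file wherever it is configured, and `cfg_value log_file` locates nagios.log the same way.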

Please post output of those commands.
rahul334481

Re: Error starting Nagios after upgrade

Post by rahul334481 »

Hi Dwasswa:

Please find the output below:

Code:

[root@netmon1 nagios]# /usr/sbin/nagios -v /etc/nagios/nagios.cfg

Nagios Core 4.3.2
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 2017-05-09
License: GPL

Website: https://www.nagios.org
Reading configuration data...
Warning: use_embedded_perl_implicitly is deprecated and will be removed.
Warning: enable_embedded_perl is deprecated and will be removed.
Warning: p1_file is deprecated and will be removed.
Warning: sleep_time is deprecated and will be removed.
Warning: external_command_buffer_slots is deprecated and will be removed. All commands are always processed upon arrival
Warning: command_check_interval is deprecated and will be removed. Commands are always handled on arrival
   Read main config file okay...
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The retry_check_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The retry_check_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The retry_check_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The retry_check_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The retry_check_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The retry_check_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The retry_check_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
WARNING: The retry_check_interval attribute is deprecated and will be removed in future versions. Please use retry_interval instead.
WARNING: The normal_check_interval attribute is deprecated and will be removed in future versions. Please use check_interval instead.
Warning: Duplicate definition found for service 'ping' on host 'rp-t-d-web2' (config file '/etc/nagios/lulu/all/services/basic-windows.cfg', starting on line 48)
Warning: Duplicate definition found for service 'ping' on host 'rp-t-d-sql1' (config file '/etc/nagios/lulu/all/services/basic-windows.cfg', starting on line 48)
Warning: Duplicate definition found for service 'ping' on host 'rp-t-d-admin1' (config file '/etc/nagios/lulu/all/services/basic-windows.cfg', starting on line 48)
Warning: Duplicate definition found for service 'ping' on host 'winconvert2.dur-test' (config file '/etc/nagios/lulu/all/services/basic-windows.cfg', starting on line 48)
Warning: Duplicate definition found for service 'ping' on host 'winconvert1.dur-test' (config file '/etc/nagios/lulu/all/services/basic-windows.cfg', starting on line 48)
Warning: Duplicate definition found for service 'ping' on host 'pitstop2.test.dur' (config file '/etc/nagios/lulu/all/services/basic-windows.cfg', starting on line 48)
Warning: Duplicate definition found for service 'ping' on host 'pitstop1.test.dur' (config file '/etc/nagios/lulu/all/services/basic-windows.cfg', starting on line 48)
Warning: Duplicate definition found for service 'ping' on host 'rp-t-d-web1' (config file '/etc/nagios/replayphotos/all/services/web.cfg', starting on line 35)
   Read object config files okay...

Running pre-flight check on configuration data...

Checking objects...
	Checked 745 services.
	Checked 118 hosts.
	Checked 59 host groups.
	Checked 23 service groups.
	Checked 23 contacts.
	Checked 13 contact groups.
	Checked 151 commands.
	Checked 6 time periods.
	Checked 0 host escalations.
	Checked 3 service escalations.
Checking for circular paths...
	Checked 118 hosts
	Checked 0 service dependencies
	Checked 0 host dependencies
	Checked 6 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
objects.cache attached
Attachment: nagios-ojects.cache.txt (811.87 KiB)
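
The verification output above repeats the same deprecation warnings many times, which makes the distinct ones hard to spot. A small sketch for collapsing them into counts, assuming a POSIX shell; `summarize_warnings` is a hypothetical helper name:

```shell
# Read `nagios -v` output on stdin and print each distinct warning line
# once, prefixed by how often it occurred, most frequent first.
summarize_warnings() {
  # -i matches both the "Warning:" and "WARNING:" spellings.
  grep -i '^warning:' | sort | uniq -c | sort -rn
}
```

With the paths used in this post it could be run as `/usr/sbin/nagios -v /etc/nagios/nagios.cfg | summarize_warnings`.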
dwasswa

Re: Error starting Nagios after upgrade

Post by dwasswa »

Now restart Nagios by running this command:

Code:

service nagios restart

Then post the last lines of the log:

Code:

tail /usr/local/nagios/var/nagios.log
rahul334481

Re: Error starting Nagios after upgrade

Post by rahul334481 »

This doesn't work for some reason:

Code:

[root@netmon1 nagios]# service nagios restart
Redirecting to /bin/systemctl restart  nagios.service
Job for nagios.service failed because a timeout was exceeded. See "systemctl status nagios.service" and "journalctl -xe" for details.

Output of journalctl -xe:

Code:

Oct 24 15:45:01 nagios-host systemd[1]: nagios.service start operation timed out. Terminating.
Oct 24 15:45:01 nagios-host nagios[4185]: Caught SIGTERM, shutting down...
Oct 24 15:45:01 nagios-host nagios[4190]: Caught SIGTERM, shutting down...
Oct 24 15:45:01 nagios-host nagios[4185]: Successfully shutdown... (PID=4185)
Oct 24 15:45:01 nagios-host nagios[4185]: Event broker module 'NERD' deinitialized successfully. 
Oct 24 15:45:02 nagios-host systemd[1]: Failed to start Nagios Network Monitoring.
-- Subject: Unit nagios.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit nagios.service has failed.
-- 
-- The result is failed.
Oct 24 15:45:02 nagios-host systemd[1]: Unit nagios.service entered failed state. 
Oct 24 15:45:02 nagios-host systemd[1]: nagios.service failed.
Oct 24 15:45:02 nagios-host polkitd[629]: Unregistered Authentication Agent for unix-process:4166:379468 (system bus name :1.32, object path /org/freedesktop/Po
rahul334481

Re: Error starting Nagios after upgrade

Post by rahul334481 »

Last few lines in nagios.log:

Code:

[1508874301] Caught SIGTERM, shutting down...
[1508874301] Caught SIGTERM, shutting down...
[1508874301] Successfully shutdown... (PID=4185)
[1508874301] Event broker module 'NERD' deinitialized successfully.
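
The bracketed numbers in nagios.log are Unix epoch seconds, which makes them awkward to line up with the journal timestamps. A small sketch for converting them, assuming GNU date (for `-d @SECONDS`); `epoch_to_utc` is a hypothetical helper name:

```shell
# Rewrite the leading [epoch] stamp of each line on stdin as a UTC date.
# Lines without such a stamp pass through unchanged.
epoch_to_utc() {
  while IFS= read -r line; do
    case $line in
      \[[0-9]*\]\ *)
        ts=${line#\[}; ts=${ts%%\]*}   # text between "[" and the first "]"
        rest=${line#*\] }              # text after the first "] "
        case $ts in
          *[!0-9]*) printf '%s\n' "$line" ;;   # not purely numeric: leave as is
          *) printf '[%s] %s\n' "$(date -u -d "@$ts" '+%Y-%m-%d %H:%M:%S')" "$rest" ;;
        esac ;;
      *) printf '%s\n' "$line" ;;
    esac
  done
}
```

It could be used as `tail /usr/local/nagios/var/nagios.log | epoch_to_utc`, adjusting the path to the install. Note that the result is in UTC, while the journal lines are printed in the host's local time.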
dwasswa

Re: Error starting Nagios after upgrade

Post by dwasswa »

It's possible that the cron jobs for Nagios are not running.

Run the command:

Code:

service crond restart

Then check its status:

Code:

service crond status
rahul334481

Re: Error starting Nagios after upgrade

Post by rahul334481 »

Crond is up and running. I restarted it and tried starting Nagios again, but no luck:

Code:

[root@netmon1 nagios]# service crond restart
Redirecting to /bin/systemctl restart  crond.service

[root@netmon1 nagios]# service crond status
Redirecting to /bin/systemctl status  crond.service
● crond.service - Command Scheduler
   Loaded: loaded (/usr/lib/systemd/system/crond.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2017-10-24 16:15:31 EDT; 4s ago
 Main PID: 4763 (crond)
   CGroup: /system.slice/crond.service
           └─4763 /usr/sbin/crond -n

Oct 24 16:15:31 nagios-host systemd[1]: Started Command Scheduler.
Oct 24 16:15:31 nagios-host systemd[1]: Starting Command Scheduler...
Oct 24 16:15:31 nagios-host crond[4763]: (CRON) INFO (RANDOM_DELAY will be scaled with factor 48% if used.) 
Oct 24 16:15:31 nagios-host crond[4763]: (CRON) INFO (running with inotify support)
Oct 24 16:15:31 nagios-host crond[4763]: (CRON) INFO (@reboot jobs will be run at computer's startup.)

[root@netmon1 nagios]# service nagios restart
Redirecting to /bin/systemctl restart  nagios.service
Job for nagios.service failed because a timeout was exceeded. See "systemctl status nagios.service" and "journalctl -xe" for details.
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Error starting Nagios after upgrade

Post by dwhitfield »

Complete hunch based on some other errors I've seen, but if you go to Process Info in the left panel of the UI and then disable notifications, do you still have this issue?

Also, I notice you are on 4.3.2 and the latest is 4.3.4. There are some file lock issues in 4.3.2 and 4.3.3, so I would highly suggest either going back to your previous version or upgrading to 4.3.4.
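
Since the journal shows Nagios starting its workers and then being killed by a start timeout, one more thing that may be worth ruling out is a PID file mismatch. This is an assumption, not something confirmed in this thread: a systemd unit of Type=forking with a PIDFile= setting may keep waiting for that file and eventually time out if Nagios writes its lock_file somewhere else. A minimal sketch of that comparison, assuming a POSIX shell; `check_pidfile` is a hypothetical helper name:

```shell
# Compare the PID file named in a systemd unit file with the lock_file
# that Nagios is configured to write.
# Usage: check_pidfile UNIT_FILE NAGIOS_CFG
check_pidfile() {
  unit_pid=$(sed -n 's/^[[:space:]]*PIDFile[[:space:]]*=[[:space:]]*//p' "$1")
  lock_file=$(sed -n 's/^[[:space:]]*lock_file[[:space:]]*=[[:space:]]*//p' "$2")
  if [ -z "$unit_pid" ]; then
    echo "no PIDFile= in unit file"
  elif [ "$unit_pid" = "$lock_file" ]; then
    echo "match: $unit_pid"
  else
    echo "mismatch: unit=$unit_pid nagios=$lock_file"
    return 1
  fi
}
```

The unit file path can be read from the first line of `systemctl cat nagios.service`; the config path in this thread is /etc/nagios/nagios.cfg.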