Monitoring engine won't start (in Prod AND Test)

Posted: Thu May 15, 2014 11:59 am
by snapon_admin
The web interface shows the monitoring engine as stopped, and it won't start when I click Start. Here's what I'm seeing:
[Attachment: monitoring engine not running.png]
When I click Start, I get:
Your request was not processed in a timely manner, It may still execute, as the server may be temporarily busy.
I've verified the config and it all looks good. I've written the config files, restarted the Nagios service, etc. Checks appear to still be running, but the engine still shows as stopped. Thoughts?

Edit: This is now affecting my test server as well. What should I do?

Re: Monitoring engine won't start

Posted: Thu May 15, 2014 1:12 pm
by slansing
I just solved something like this on my end. Let's check a few things:

Code: Select all

service nagios restart
service nagios status

service crond restart
service crond status

service ndo2db restart
service ndo2db status

Code: Select all

tail -100 /usr/local/nagios/var/nagios.log
Are you currently running a mod gearman version on this server that was carried over from a 2012 XI version?
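
The three status checks above can be rolled into one pass. This is only a rough sketch: `check_services` is a made-up helper name, and it assumes each init script follows the usual convention of returning a non-zero exit code from `status` when the service is stopped.

```shell
#!/bin/sh
# Report the status of each named service and count how many are not running.
# Assumption: `service NAME status` exits non-zero when the service is stopped.
check_services() {
    failed=0
    for svc in "$@"; do
        if service "$svc" status >/dev/null 2>&1; then
            echo "$svc: running"
        else
            echo "$svc: NOT running"
            failed=$((failed + 1))
        fi
    done
    # The return value is the number of services that were not running.
    return "$failed"
}

# Usage (as root): check_services nagios crond ndo2db
```

A non-zero return only says that something is down; the log is still the place to look for the reason.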

Re: Monitoring engine won't start

Posted: Thu May 15, 2014 1:30 pm
by snapon_admin

Code: Select all

[root@lisl-ngos-01-pv conf]# service nagios restart
Running configuration check...done.
Stopping nagios: .done.
Starting nagios: done.
[root@lisl-ngos-01-pv conf]# service nagios status
nagios (pid 11371) is running...
[root@lisl-ngos-01-pv conf]# service crond restart
Stopping crond:                                            [  OK  ]
Starting crond:                                            [  OK  ]
[root@lisl-ngos-01-pv conf]# service crond status
crond (pid  13208) is running...
[root@lisl-ngos-01-pv conf]# service ndo2db restart
Stopping ndo2db: head: cannot open `/usr/local/nagios/var/ndo2db.lock' for reading: No such file or directory
done.
Starting ndo2db: done.
[root@lisl-ngos-01-pv conf]# service ndo2db status
ndo2db (pid 15870) is running...
[root@lisl-ngos-01-pv conf]# tail -100 /usr/local/nagios/var/nagios.log
[1400178542] Warning: Duplicate definition found for service 'Load' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 380)
[1400178542] Warning: Duplicate definition found for service 'ldap process' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 364)
[1400178542] Warning: Duplicate definition found for service 'ldap client service' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 348)
[1400178542] Warning: Duplicate definition found for service 'inetd service' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 332)
[1400178542] Warning: Duplicate definition found for service 'inetd process' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 316)
[1400178542] Warning: Duplicate definition found for service 'gss service' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 300)
[1400178542] Warning: Duplicate definition found for service 'fmd service' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 284)
[1400178542] Warning: Duplicate definition found for service 'fmd process' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 268)
[1400178542] Warning: Duplicate definition found for service 'Faults' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 252)
[1400178542] Warning: Duplicate definition found for service 'ctmagent7 service' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 236)
[1400178542] Warning: Duplicate definition found for service 'ctmagent7 process' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 220)
[1400178542] Warning: Duplicate definition found for service 'CPU Stats' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 204)
[1400178542] Warning: Duplicate definition found for service 'bind service' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 188)
[1400178542] Warning: Duplicate definition found for service 'bind process' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 172)
[1400178542] Warning: Duplicate definition found for service 'autofs service' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 156)
[1400178542] Warning: Duplicate definition found for service 'autofs process' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 140)
[1400178542] Warning: Duplicate definition found for service '/workspace Disk Usage' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 124)
[1400178542] Warning: Duplicate definition found for service '/vendor/quest Disk Usage' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 108)
[1400178542] Warning: Duplicate definition found for service '/vendor/nbadmin Disk Usage' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 93)
[1400178542] Warning: Duplicate definition found for service '/vendor/nagios Disk Usage' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 77)
[1400178542] Warning: Duplicate definition found for service '/vendor/ctmlogs Disk Usage' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 61)
[1400178542] Warning: Duplicate definition found for service '/vendor/ctmagent7 Disk Usage' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 45)
[1400178542] Warning: Duplicate definition found for service '/export/home Disk Usage' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 30)
[1400178542] Warning: Duplicate definition found for service '/ Disk Usage' on host 'kenapps04g' (config file '/usr/local/nagios/etc/services/kenapps04g.cfg', starting on line 14)
[1400178542] Warning: Service 'Interface list' on host 'AlgonaIA-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'ArndellParkNSW-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'CalgaryAB-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'CarsonCityNV-RC-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'CarsonCityNV-VC-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'CityofIndustryCA-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'ColumbusGA-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'ConwayAR-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'CorkIE-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'CrystalLakeIL-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'ElizabethtonTN-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'ElkmontAL-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'HarrisburgPA-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'KenoshaWI-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'KingsLynnGB-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'LibertyvilleIL-Core-A'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'LibertyvilleIL-Core-B'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'LincolnshireIL-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'MilwaukeeWI-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'MississaugaON-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'MurphyNC-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'NewmarketON-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'OliveBranchMS-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'RobesoniaPA-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'RochesterHillsMI-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'SanJoseCA-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Client Connections' on host 'Solsrp07' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'Interface list' on host 'ThroopPA-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'TlalnepantlaMX-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service 'Interface list' on host 'UnterneukirchenDE-Core'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1400178542] Warning: Service '/ Disk Usage' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service '/vendor/ctmagent7 Disk Usage' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service '/vendor/ctmlogs Disk Usage' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service '/vendor/nagios Disk Usage' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service '/vendor/quest Disk Usage' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service '/workspace Disk Usage' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'Ping' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'Swap Usage' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'Total Processes' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'Users' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'autofs process' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'autofs service' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'bind process' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'bind service' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'ctmagent process' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'ctmagent7 service' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'gss service' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'inetd process' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'inetd service' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'ldap client service' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'ldap process' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'local filesystem service' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'name-service-cache service' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'nfs client service' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'nfs status service' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'sendmail process' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'sendmail-client service' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'ssh process' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'ssh service' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'system-log service' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'tcp service' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Service 'utmp service' on host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Host 'kendbms12t on kenapps47g' has no default contacts or contactgroups defined!
[1400178542] Warning: Host 'kengrid01p' has no default contacts or contactgroups defined!
[1400178546] Successfully launched command file worker with pid 11415
[1400178560] HOST ALERT: KingsLynnGB-Sentry;UP;HARD;1;OK - 10.160.250.1: rta 118.520ms, lost 0%
[1400178560] HOST ALERT: KingsLynnGB-Core;UP;HARD;1;OK - 10.160.19.2: rta 117.450ms, lost 0%
[1400178561] HOST ALERT: LidkopingSE-MPLS;UP;HARD;1;OK - 10.173.2.1: rta 146.847ms, lost 0%
[1400178568] ndomod: Error writing to data sink!  Some output may get lost...
[1400178568] ndomod: Please check remote ndo2db log, database connection or SSL Parameters
[1400178571] SERVICE ALERT: MississaugaON-MPLS;Outside Bandwidth;CRITICAL;SOFT;3;CRITICAL - Current BW in: 3484.81Kbps Out: 412.68Kbps
[1400178571] HOST ALERT: MississaugaON-MPLS;UP;HARD;1;OK - 10.94.19.1: rta 189.440ms, lost 0%
[1400178584] ndomod: Successfully reconnected to data sink!  0 items lost, 2156 queued items to flush.
[1400178584] ndomod: Successfully flushed 2156 queued items to data sink.
[1400178584] HOST ALERT: SelangorMY-Sentry;UP;HARD;1;OK - 10.145.213.1: rta 256.611ms, lost 0%
[1400178595] HOST ALERT: olive-branch-ip;UP;HARD;1;OK - 10.39.253.100: rta 45.906ms, lost 0%
[root@lisl-ngos-01-pv conf]#
We don't use mod gearman.
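
The ndomod lines are easy to miss among the config warnings in a log like the one above. A small sketch to pull them out with readable timestamps follows; `ndomod_lines` is a hypothetical helper, and it assumes GNU `date` for the `-d @EPOCH` syntax.

```shell
#!/bin/sh
# Print only the ndomod lines from a Nagios log read on stdin, replacing the
# leading [epoch] stamp with a UTC date and time.
# Assumption: GNU date, which accepts -d @EPOCH.
ndomod_lines() {
    grep 'ndomod:' | while IFS= read -r line; do
        epoch=${line#"["}      # drop the leading bracket
        epoch=${epoch%%]*}     # keep everything before the closing bracket
        rest=${line#*] }       # message text after "[epoch] "
        printf '%s %s\n' "$(date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S')" "$rest"
    done
}

# Usage: tail -100 /usr/local/nagios/var/nagios.log | ndomod_lines
```

Times are printed in UTC, so they need shifting to the server's local timezone before comparing them with file timestamps.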

Re: Monitoring engine won't start

Posted: Thu May 15, 2014 1:51 pm
by slansing
It looks like ndo may not have started. What is the output of:

Code: Select all

ll /usr/local/nagios/var/ndo2db.lock
Are you seeing a running engine and running graph in the web interface now?

Re: Monitoring engine won't start

Posted: Thu May 15, 2014 2:03 pm
by snapon_admin

Code: Select all

[root@lisl-ngos-01-pv conf]# ll /usr/local/nagios/var/ndo2db.lock
-rw-r--r--. 1 nagios nagios 6 May 15 13:29 /usr/local/nagios/var/ndo2db.lock
You have new mail in /var/spool/mail/root
[root@lisl-ngos-01-pv conf]#
Nope, still not running.
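
Since the lock file exists, one more thing worth ruling out is a stale lock. This is a sketch only: `lock_pid_alive` is a hypothetical helper, and it assumes the lock file holds the daemon's PID on its first line, which the init script's earlier `head` error on this file suggests but does not prove.

```shell
#!/bin/sh
# Succeed if the PID recorded on the first line of a lock file is a live process.
# Assumption: the lock file stores the daemon's PID on its first line.
# kill -0 sends no signal; it only tests whether the PID can be signalled.
# Run as root, since it fails for processes owned by another user.
lock_pid_alive() {
    lockfile=$1
    [ -r "$lockfile" ] || { echo "no readable lock file: $lockfile"; return 2; }
    pid=$(head -n 1 "$lockfile" | tr -d '[:space:]')
    case $pid in
        ''|*[!0-9]*) echo "no PID found in $lockfile"; return 2 ;;
    esac
    if kill -0 "$pid" 2>/dev/null; then
        echo "PID $pid from $lockfile is running"
    else
        echo "PID $pid from $lockfile is not running (stale lock?)"
        return 1
    fi
}

# Usage (as root): lock_pid_alive /usr/local/nagios/var/ndo2db.lock
```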

Re: Monitoring engine won't start

Posted: Thu May 15, 2014 2:38 pm
by snapon_admin
Ummm, I just noticed the same thing has happened on our test server: the monitoring engine is stopped and can't be started there either. Our test server has a grand total of about 2 servers being monitored on it at the moment, so I'm not sure what the deal is here. I saw another post about applying the config just sitting there and taking forever, which mine does as well. I wonder if his monitoring engine is stopped too?

Re: Monitoring engine won't start (in Prod AND Test)

Posted: Thu May 15, 2014 2:58 pm
by tmcdonald
I think you'll like what I have to say here: http://support.nagios.com/forum/viewtop ... 335#p98335
tmcdonald wrote: And adding on to what snapon said, I am guessing you can't manually force an immediate check of a service/host?

Good news: abrist and swilkerson are working on this right now and things are looking promising. I had a ticket earlier today that prompted this, so we can confirm that this is a known issue with a possible fix on the way.

I wasn't able to hear the full conversation since I was on the phone, but it is related to how you have SSL configured. I am guessing you both force SSL?

Re: Monitoring engine won't start (in Prod AND Test)

Posted: Thu May 15, 2014 3:01 pm
by snapon_admin
Hahaha, I just replied to that, and you are correct, sir. I will keep an eye out for the fix, and thanks for the help as always!

Re: Monitoring engine won't start (in Prod AND Test)

Posted: Thu May 15, 2014 3:31 pm
by tmcdonald
Unzip the attached file and place it in /usr/local/nagiosxi/html/includes/

Re: Monitoring engine won't start (in Prod AND Test)

Posted: Mon Jul 21, 2014 10:14 am
by tylergates_ats
I'm running Nagios XI 2014r1.3. I tried the utils-backend file posted here, but after forcing HTTPS in Apache the 'Process Info' page still shows errors, and I still get the PHP division by zero warnings.

Does anyone else have a solution to force https in NagiosXI without encountering these errors?

To reproduce, I modify nagiosxi.conf to include these directives:

Code: Select all

RewriteEngine On
RewriteCond %{HTTPS} !=on
RewriteRule (.*) https://%{SERVER_NAME}%{REQUEST_URI}
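
One way to see what Apache actually sends back for a plain HTTP request under these directives is to look at the response headers. This is a rough sketch that assumes curl is installed; `xi.example.com` is a placeholder hostname and `redirect_summary` is a made-up helper name.

```shell
#!/bin/sh
# Print the status line and Location header from HTTP response headers on stdin.
# Header names are matched case-insensitively; carriage returns are removed.
redirect_summary() {
    tr -d '\r' | grep -i -E '^(HTTP/|Location:)'
}

# Usage (xi.example.com is a placeholder hostname):
#   curl -sI http://xi.example.com/nagiosxi/ | redirect_summary
```

Running the same check from the XI server itself may also help show whether local requests are being redirected.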