Monitoring Engine wont start.

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
dacaron
Posts: 9
Joined: Mon Nov 26, 2018 2:00 pm

Re: Monitoring Engine wont start.

Post by dacaron »

This is what fixed my issue.
Edit the /usr/local/nagios/etc/nagios.cfg file and change this line from

check_for_updates=1
to
check_for_updates=0

Save the change.
Restart Nagios:

systemctl restart nagios

The automatic update check applies to Nagios Core only, so you do not need to turn it back on. The Core that comes with XI is a special version and should not be updated manually; an XI upgrade will update it for you if needed.
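
For anyone who would rather script the change above than edit the file by hand, here is a minimal sketch. It assumes GNU sed (for `-i`) and that the option appears in the file as a plain `check_for_updates=...` line. The function name `disable_update_check` and the `.bak` backup suffix are my own choices for illustration, not anything that ships with Nagios XI.

```shell
# Set check_for_updates=0 in a Nagios Core main config file.
# The path is an argument, so the function can be tried on a copy first.
# Assumptions: GNU sed; the ".bak" backup name is hypothetical.
disable_update_check() {
    cfg=$1
    # Refuse to continue if the file does not exist.
    [ -f "$cfg" ] || { echo "config not found: $cfg" >&2; return 1; }
    # Keep a copy of the original file next to it.
    cp -p "$cfg" "$cfg.bak" || return 1
    # Rewrite only lines that start with check_for_updates=
    sed -i 's/^check_for_updates=.*/check_for_updates=0/' "$cfg"
}
```

On the XI server this would be run as root, for example `disable_update_check /usr/local/nagios/etc/nagios.cfg && systemctl restart nagios`.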

I also had a few issues with the DB that were fixed by truncating some tables and running the repair command.
I also had to change the dbmaint password and modify the config file /usr/local/nagiosxi/html/config.inc.php to add the dbmaint user and password.

None of these DB changes fixed the monitoring process not starting, even though I had the green checkmark in the top-right corner, so I will not share those commands because they are specific to my case.
dacaron wrote: Wed Mar 20, 2024 8:17 am I am having the same issue on Nagios XI 2024R1. From what I found, it seems to have stopped working on March 17 at 6:50:00 am GMT.

All that I am aware of is that notifications were turned off on March 17 at 12:40:00 am GMT and turned back on at 5:15:00 am GMT. The monitoring engine was working from then until March 17 at 6:50:00 am GMT.

I am not finding anything that could have caused it to stop in this time range.

Code: Select all

[1710658658] SERVICE ALERT: s1523oracqcp;CPU Load Per Core;WARNING;SOFT;1;WARNING - load average per CPU: 2.54, 2.36, 2.35
[1710658683] SERVICE ALERT: HOST;CPU Load Per Core;CRITICAL;SOFT;23;CRITICAL - load average per CPU: 16.57, 18.66, 21.21
[1710658746] SERVICE ALERT: HOST;CPU Load Per Core;CRITICAL;SOFT;12;CRITICAL - load average per CPU: 8.44, 8.83, 7.76
[1710658802] Caught SIGTERM, shutting down...
[1710658802] Caught SIGTERM, shutting down...
[1710658802] Successfully shutdown... (PID=1311834)
[1710658802] NDO-3: Callbacks deregistered
[1710658802] NDO-3: NDO - Shutdown complete
[1710658802] Event broker module '/usr/local/nagios/bin/ndo.so' deinitialized successfully.
[1710658803] Nagios 4.4.13 starting... (PID=2990833)
[1710658803] Local time is Sun Mar 17 03:00:03 EDT 2024
[1710658803] LOG VERSION: 2.0
[1710658803] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1710658803] qh: core query handler registered
[1710658803] qh: echo service query handler registered
[1710658803] qh: help for the query handler registered
[1710658803] wproc: Successfully registered manager as @wproc with query handler
[1710658803] wproc: Registry request: name=Core Worker 2990835;pid=2990835
[1710658803] wproc: Registry request: name=Core Worker 2990837;pid=2990837
[1710658803] wproc: Registry request: name=Core Worker 2990840;pid=2990840
[1710658803] wproc: Registry request: name=Core Worker 2990836;pid=2990836
[1710658803] wproc: Registry request: name=Core Worker 2990839;pid=2990839
[1710658803] wproc: Registry request: name=Core Worker 2990838;pid=2990838
[1710658803] NDO-3: NDO 3.1.0 (c) Copyright 2009-2023 Nagios - Nagios Core Development Team
[1710658803] NDO-3: Callbacks registered
[1710658803] NDO-3: Callbacks registered
[1710658803] Event broker module '/usr/local/nagios/bin/ndo.so' initialized successfully.
[1710658803] Warning: Service 'License Hosts Count' on host 'HOST'  has a notification interval less than its check interval!  Notifications are only re-sent after checks are made, so the effective notification interval will be that of the check interval.
[1710658803] Warning: Host 'HOST' has no default contacts or contactgroups defined!
[1710658803] NDO-3: Started comment thread
[1710658803] NDO-3: Started timed_event thread
[1710658803] NDO-3: Started event_handler thread
[1710658803] NDO-3: Started service_check thread
[1710658803] NDO-3: Started host_check thread
[1710658803] NDO-3: Started downtime thread
[1710658803] NDO-3: Started flapping thread
[1710658803] NDO-3: Started host_status thread
[1710658803] NDO-3: Started service_status thread
[1710658803] NDO-3: Started contact_status thread
[1710658803] NDO-3: Started acknowledgement thread
[1710658803] NDO-3: Started statechange thread
[1710658803] NDO-3: Started notification thread
[1710658803] NDO-3: Ended contact_status thread
[1710658807] NDO-3: Ended host_check thread
[1710658807] NDO-3: Ended host_status thread
[1710658817] NDO-3: Ended service_check thread
[1710658817] NDO-3: Ended flapping thread
[1710658817] NDO-3: Ended acknowledgement thread
[1710658817] NDO-3: Ended statechange thread
[1710658817] NDO-3: Ended event_handler thread
[1710658817] NDO-3: Ended timed_event thread
[1710658817] NDO-3: Ended notification thread
[1710658817] NDO-3: Ended downtime thread
[1710658817] NDO-3: Ended comment thread
[1710658821] NDO-3: Ended service_status thread
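
The bracketed numbers in the log above are Unix epoch seconds, which makes them hard to compare with the GMT times mentioned in the post. The log's own "Local time" line pairs 1710658803 with 03:00:03 EDT, which is 07:00:03 UTC. Below is a small sketch for converting them. It assumes GNU date (for `-d @SECONDS`), and the function names `epoch_to_utc` and `humanize_log` are made up for this example.

```shell
# Convert one epoch value to a UTC timestamp (assumes GNU date).
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S UTC'
}

# Read "[epoch] message" lines on stdin and print them with readable timestamps.
# Lines that do not start with "[<digit>" are passed through unchanged.
humanize_log() {
    while IFS= read -r line; do
        case "$line" in
            \[[0-9]*\]*)
                ts=${line#\[}      # drop the leading "["
                ts=${ts%%\]*}      # keep only the text before the first "]"
                printf '[%s]%s\n' "$(epoch_to_utc "$ts")" "${line#*\]}"
                ;;
            *)
                printf '%s\n' "$line"
                ;;
        esac
    done
}
```

Typical use would be something like `humanize_log < /usr/local/nagios/var/nagios.log`, assuming the log is in the default location; that path is a guess on my part and is not taken from the post.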
sgardil
Posts: 143
Joined: Wed Aug 09, 2023 9:58 am

Re: Monitoring Engine wont start.

Post by sgardil »

dacaron wrote: Thu Mar 28, 2024 7:24 am this is what fixed my issue
I am glad you found a fix for most of your issues. Are you still having issues with the monitoring engine not showing a checkmark? From some research into monitoring engine issues, a recurring theme seems to be a host or service with no contacts or contact groups. If you are still having the issue, you could try adding contacts to the host that gives the warning and see if that fixes it.
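
To see which hosts trigger that warning, one option is to pull the host names out of the log. This is only a sketch: it assumes the warning is worded exactly as in the log quoted in this thread, and the function name `hosts_without_contacts` is hypothetical.

```shell
# Print each host name that appears in a
# "Warning: Host '<name>' has no default contacts or contactgroups defined" line.
# Reads the files given as arguments, or stdin when no file is given.
hosts_without_contacts() {
    sed -n "s/.*Warning: Host '\([^']*\)' has no default contacts or contactgroups defined.*/\1/p" "$@" | sort -u
}
```

For example, `hosts_without_contacts /usr/local/nagios/var/nagios.log`, assuming the log is in the default location. Each host listed can then be given a contact or contact group in its configuration.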
dacaron
Posts: 9
Joined: Mon Nov 26, 2018 2:00 pm

Re: Monitoring Engine wont start.

Post by dacaron »

Since I made the config change to disable check_for_updates, the monitoring process shows as running and everything is green in System Status.
sgardil wrote: Thu Mar 28, 2024 9:38 am
I am glad you found a fix for most of your issues. Are you still having issues with the monitoring engine not showing a checkmark? From some research into monitoring engine issues, a recurring theme seems to be a host or service with no contacts or contact groups. If you are still having the issue, you could try adding contacts to the host that gives the warning and see if that fixes it.
sgardil
Posts: 143
Joined: Wed Aug 09, 2023 9:58 am

Re: Monitoring Engine wont start.

Post by sgardil »

That's good to hear that the issue is fixed; however, I don't think this should have to be disabled to fix it. I will look into this and see if I can find any issues with it, but in the meantime, if you have any other issues that you need help with, let us know.