NdoUtils stop working

algomas123
Posts: 16
Joined: Mon Jun 27, 2016 8:58 am

Re: NdoUtils stop working

Post by algomas123 »

Thanks again for your help.
You may want to enable debugging in the ndo2db.cfg file and see what error shows up there when the issue happens again.
It is already enabled. With "tail -F" I can see lots of queries running, and it looks OK. But when the issue happens, it just stops writing queries. It doesn't write any errors, just nothing.
We might get more details on what is failing which the developers could use.
Of course, I will help however I can.
Nagios Core only uses the MySQL database to store its information / status for other 3rd party tools to use, it doesn't use it to run.
Yes, sorry. What I meant was that the database still held data collected while the issue was happening. There was no message queue, ndo2db was using 90% CPU (I guess because of the infinite for loop), and /var/log/messages was printing errors, yet data was still being inserted into the database (I'm not sure it was still inserting, but at least I had recent data). It makes no sense!
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Post by tgriep »

Thanks for your help. The next time it happens, could you post the last 50 lines of the ndo2db error log, the output of ipcs -q, and anything else you find?
Another thing you can look at is the MySQL server settings. Increasing the buffers or connection limits may help with this issue.
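For reference, the two items requested above could be gathered in one go with something like the following. This is only a sketch: `collect_ndo_diag` is a made-up helper name, and the log path in the example call is an assumption (use whatever `debug_file` points to in your ndo2db.cfg).

```shell
# Hypothetical helper: print the last 50 lines of a log file, then the
# System V message queues. The log path is passed in, not assumed.
collect_ndo_diag() {
    log_file=$1
    echo "== last 50 lines of $log_file =="
    tail -n 50 "$log_file"
    echo "== ipcs -q =="
    # ipcs ships with util-linux; say so if it is missing or fails
    if command -v ipcs >/dev/null 2>&1; then
        ipcs -q || echo "ipcs failed"
    else
        echo "ipcs not found"
    fi
}

# Example call (the path is an assumption, adjust to your install):
# collect_ndo_diag /usr/local/nagios/var/ndo2db.debug > ndo_diag.txt 2>&1
```

Redirecting the output to a file, as in the commented example, makes it easy to attach to a post.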
Be sure to check out our Knowledgebase for helpful articles and solutions!

Post by algomas123 »

Ok, I will do that.

Meanwhile, I noticed in the ndo2db log that I have LOTS of queries (10 or 15 times more than any other query) similar to:

Code:

DELETE FROM nagios_timedeventqueue WHERE instance_id='1' AND event_type='8' AND scheduled_time=FROM_UNIXTIME(1467305288) AND recurring_event='1' AND object_id='0'
Sometimes the event_type changes, other times the object_id changes, but most of them are exactly that query.

I have queried that table and it is always empty.

Is that normal?
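A tally like the one described above (how often each kind of query shows up in the ndo2db debug log) can be reproduced with a few lines of shell. This is a sketch under assumptions: `summarize_ndo_queries` is a hypothetical name, and it expects each debug line to begin with three bracketed fields (timestamp, a second bracketed field, pid) followed by the SQL statement. Only DELETE and INSERT are counted, since those are the statements seen in this thread.

```shell
# Count statements in an ndo2db debug log by verb and table.
# Assumed line layout:
#   [epoch.usec] [x.y] [pid=N] DELETE FROM table ...
#   [epoch.usec] [x.y] [pid=N] INSERT INTO table ...
# Reads the files given as arguments, or stdin if there are none.
summarize_ndo_queries() {
    awk '
        $4 == "DELETE" || $4 == "INSERT" { count[$4 " " $6]++ }
        END { for (k in count) printf "%d %s\n", count[k], k }
    ' "$@" | sort -rn
}

# Example call (the path is an assumption, adjust to your install):
# summarize_ndo_queries /usr/local/nagios/var/ndo2db.debug | head
```

The output is one line per verb and table pair, most frequent first, which makes a lopsided ratio such as the nagios_timedeventqueue deletes easy to see.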

Post by tgriep »

Yes, those log entries are normal; it is just the server removing unneeded data.

Post by algomas123 »

Ok, now it is crashing:

ndo2db.debug:

Code:

[1467361331.389468] [002.0] [pid=10649] DELETE FROM nagios_timedeventqueue WHERE instance_id='1' AND event_type='12' AND scheduled_time=FROM_UNIXTIME(1467361331) AND recurring_event='0' AND object_id='398'
[1467361335.403985] [002.0] [pid=10649] DELETE FROM nagios_timedeventqueue WHERE instance_id='1' AND event_type='0' AND scheduled_time=FROM_UNIXTIME(1467361335) AND recurring_event='0' AND object_id='428'
[1467361335.404330] [002.0] [pid=10649] DELETE FROM nagios_timedeventqueue WHERE instance_id='1' AND scheduled_time<FROM_UNIXTIME(1467361335)
[1467361335.404529] [002.0] [pid=10649] DELETE FROM nagios_timedeventqueue WHERE instance_id='1' AND event_type='0' AND scheduled_time=FROM_UNIXTIME(1467361335) AND recurring_event='0' AND object_id='428'
[1467361335.404758] [002.0] [pid=10649] INSERT INTO nagios_programstatus SET instance_id='1', status_update_time=FROM_UNIXTIME(1467361335), program_start_time=FROM_UNIXTIME(1467282948), is_currently_running='1', process_id='10640', daemon_mode='1', last_command_check=FROM_UNIXTIME(0), last_log_rotation=FROM_UNIXTIME(1467323999), notifications_enabled='1', active_service_checks_enabled='1', passive_service_checks_enabled='1', active_host_checks_enabled='1', passive_host_checks_enabled='1', event_handlers_enabled='1', flap_detection_enabled='1', failure_prediction_enabled='0', process_performance_data='1', obsess_over_hosts='0', obsess_over_services='0', modified_host_attributes='0', modified_service_attributes='0', global_host_event_handler='', global_service_event_handler='' ON DUPLICATE KEY UPDATE instance_id='1', status_update_time=FROM_UNIXTIME(1467361335), program_start_time=FROM_UNIXTIME(1467282948), is_currently_running='1', process_id='10640', daemon_mode='1', last_command_check=FROM_UNIXTIME(0), last_log_rotation=FROM_UNIXTIME(1467323999), notifications_enabled='1', active_service_checks_enabled='1', passive_service_checks_enabled='1', active_host_checks_enabled='1', passive_host_checks_enabled='1', event_handlers_enabled='1', flap_detection_enabled='1', failure_prediction_enabled='0', process_performance_data='1', obsess_over_hosts='0', obsess_over_services='0', modified_host_attributes='0', modified_service_attributes='0', global_host_event_handler='', global_service_event_handler=''
[1467361337.976287] [002.0] [pid=10649] DELETE FROM nagios_timedeventqueue WHERE instance_id='1' AND event_type='99' AND scheduled_time=FROM_UNIXTIME(1467361338) AND recurring_event='1' AND object_id='0'
[1467361337.976852] [002.0] [pid=10649] INSERT INTO nagios_systemcommands SET instance_id='1', start_time=FROM_UNIXTIME(1467361337), start_time_usec='976182', end_time=FROM_UNIXTIME(0), end_time_usec='0', command_line='/bin/mv /usr/local/pnp4nagios/var/service-perfdata /usr/local/pnp4nagios/var/spool/service-perfdata\.1467361337', timeout='5', early_timeout='0', execution_time='0.000000', return_code='0', output='', long_output='' ON DUPLICATE KEY UPDATE instance_id='1', start_time=FROM_UNIXTIME(1467361337), start_time_usec='976182', end_time=FROM_UNIXTIME(0), end_time_usec='0', command_line='/bin/mv /usr/local/pnp4nagios/var/service-perfdata /usr/local/pnp4nagios/var/spool/service-perfdata\.1467361337', timeout='5', early_timeout='0', execution_time='0.000000', return_code='0', output='', long_output=''
[1467361337.990155] [002.0] [pid=10649] INSERT INTO nagios_systemcommands SET instance_id='1', start_time=FROM_UNIXTIME(1467361337), start_time_usec='976182', end_time=FROM_UNIXTIME(1467361337), end_time_usec='989896', command_line='/bin/mv /usr/local/pnp4nagios/var/service-perfdata /usr/local/pnp4nagios/var/spool/service-perfdata\.1467361337', timeout='5', early_timeout='0', execution_time='0.013000', return_code='0', output='', long_output='' ON DUPLICATE KEY UPDATE instance_id='1', start_time=FROM_UNIXTIME(1467361337), start_time_usec='976182', end_time=FROM_UNIXTIME(1467361337), end_time_usec='989896', command_line='/bin/mv /usr/local/pnp4nagios/var/service-perfdata /usr/local/pnp4nagios/var/spool/service-perfdata\.1467361337', timeout='5', early_timeout='0', execution_time='0.013000', return_code='0', output='', long_output=''
[1467361337.990651] [002.0] [pid=10649] DELETE FROM nagios_timedeventqueue WHERE instance_id='1' AND event_type='99' AND scheduled_time=FROM_UNIXTIME(1467361338) AND recurring_event='1' AND object_id='0'
[1467361337.990871] [002.0] [pid=10649] DELETE FROM nagios_timedeventqueue WHERE instance_id='1' AND event_type='99' AND scheduled_time=FROM_UNIXTIME(1467361338) AND recurring_event='1' AND object_id='0'
[1467361337.991078] [002.0] [pid=10649] INSERT INTO nagios_systemcommands SET instance_id='1', start_time=FROM_UNIXTIME(1467361337), start_time_usec='990446', end_time=FROM_UNIXTIME(0), end_time_usec='0', command_line='/bin/mv /usr/local/pnp4nagios/var/host-perfdata /usr/local/pnp4nagios/var/spool/host-perfdata\.1467361337', timeout='5', early_timeout='0', execution_time='0.000000', return_code='0', output='', long_output='' ON DUPLICATE KEY UPDATE instance_id='1', start_time=FROM_UNIXTIME(1467361337), start_time_usec='990446', end_time=FROM_UNIXTIME(0), end_time_usec='0', command_line='/bin/mv /usr/local/pnp4nagios/var/host-perfdata /usr/local/pnp4nagios/var/spool/host-perfdata\.1467361337', timeout='5', early_timeout='0', execution_time='0.000000', return_code='0', output='', long_output=''
[1467361338.004438] [002.0] [pid=10649] INSERT INTO nagios_systemcommands SET instance_id='1', start_time=FROM_UNIXTIME(1467361337), start_time_usec='990446', end_time=FROM_UNIXTIME(1467361338), end_time_usec='3', command_line='/bin/mv /usr/local/pnp4nagios/var/host-perfdata /usr/local/pnp4nagios/var/spool/host-perfdata\.1467361337', timeout='5', early_timeout='0', execution_time='0.014000', return_code='0', output='', long_output='' ON DUPLICATE KEY UPDATE instance_id='1', start_time=FROM_UNIXTIME(1467361337), start_time_usec='990446', end_time=FROM_UNIXTIME(1467361338), end_time_usec='3', command_line='/bin/mv /usr/local/pnp4nagios/var/host-perfdata /usr/local/pnp4nagios/var/spool/host-perfdata\.1467361337', timeout='5', early_timeout='0', execution_time='0.014000', return_code='0', output='', long_output=''
[1467361338.008017] [002.0] [pid=10649] DELETE FROM nagios_timedeventqueue WHERE instance_id='1' AND event_type='99' AND scheduled_time=FROM_UNIXTIME(1467361338) AND recurring_event='1' AND object_id='0'
[1467361338.008422] [002.0] [pid=10649] DELETE FROM nagios_timedeventqueue WHERE instance_id='1' AND event_type='8' AND scheduled_time=FROM_UNIXTIME(1467361338) AND recurring_event='1' AND object_id='0'
[1467361338.010277] [002.0] [pid=10649] DELETE FROM nagios_timedeventqueue WHERE instance_id='1' AND event_type='8' AND scheduled_time=FROM_UNIXTIME(1467361338) AND recurring_event='1' AND object_id='0'
[1467361338.010531] [002.0] [pid=10649] DELETE FROM nagios_timedeventqueue WHERE instance_id='1' AND event_type='5' AND scheduled_time=FROM_UNIXTIME(1467361338) AND recurring_event='1' AND object_id='0'
[1467361338.010855] [002.0] [pid=10649] DELETE FROM nagios_timedeventqueue WHERE instance_id='1' AND event_type='5' AND scheduled_time=FROM_UNIXTIME(1467361338) AND recurring_event='1' AND object_id='0'
and it is not querying anymore...


top output:
Capture.JPG
mysqld does not appear there, but it is running. I can query the database without problems, and it responds fast.

ipcs -q
Capture2.JPG
just nothing!!!

/var/log/messages
Capture3.JPG
and lots of "queue recv error: Invalid argument"...


Hope this helps!

Thanks!

EDIT: I have been fixing the issue by just restarting nagios, so I guess the problem is not with ndo2db.
Last edited by algomas123 on Mon Jul 04, 2016 10:22 am, edited 1 time in total.
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Post by Box293 »

Are you still having a problem?
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Post by algomas123 »

Yes, I do.

Post by Box293 »

Can you please do the following:

Code:

service nagios stop
service ndo2db stop
service mysqld restart
service ndo2db start
service nagios start
After Nagios has started, please run this command:

Code:

ipcs -q
It should only show one nagios queue.

Does this resolve the problem?
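The "only one nagios queue" check above can also be scripted, which helps if you want to log the count over time. The helper below is a sketch: `count_msg_queues` is a hypothetical name, the `nagios` owner in the example is an assumption about which user runs Nagios, and the parsing assumes the usual util-linux `ipcs -q` layout where data rows start with a hex key and the owner is the third column.

```shell
# Count message queues in `ipcs -q` output read from stdin.
# Data rows start with a hex key (0x...); the owner is the third column.
# With an owner argument, only that owner's queues are counted.
count_msg_queues() {
    awk -v owner="${1:-}" '
        $1 ~ /^0x/ && (owner == "" || $3 == owner) { n++ }
        END { print n + 0 }
    '
}

# Example (the "nagios" owner name is an assumption about your setup):
# ipcs -q | count_msg_queues nagios
```

A result of 0 while Nagios is running would match the empty `ipcs -q` output reported earlier in the thread; a result above 1 would suggest leftover queues.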

Post by algomas123 »

Hello!

Yes, it solves the problem. In fact, just restarting nagios solves the problem too.

But after some hours it is crashing again...

Currently, I have a crontab entry that restarts nagios every hour, but I don't think it is an elegant solution.
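As a stopgap that is a little less blunt than an unconditional hourly restart, the cron job could restart only when the ndo2db debug log has gone quiet. The sketch below is a workaround, not a fix for the underlying problem. It assumes debugging stays enabled, that log lines start with `[epoch.usec]` as in the excerpt earlier in the thread, and that the function names, log path, 300 second threshold and restart command are all placeholders.

```shell
# Print the epoch seconds from the last line of an ndo2db debug log.
# Assumes lines start with "[epoch.usec]".
last_log_epoch() {
    tail -n 1 "$1" | sed -n 's/^\[\([0-9][0-9]*\)\..*/\1/p'
}

# Succeed (exit 0) if the last log entry is older than max_age seconds.
# Args: log file, max age in seconds, optional "now" (defaults to date +%s).
ndo_log_is_stale() {
    last=$(last_log_epoch "$1")
    now=${3:-$(date +%s)}
    # No parsable timestamp: report "not stale" rather than restart blindly
    [ -n "$last" ] || return 1
    [ $((now - last)) -gt "$2" ]
}

# Example cron-driven use (path, threshold and restart command are assumptions):
# ndo_log_is_stale /usr/local/nagios/var/ndo2db.debug 300 && service nagios restart
```

The optional third argument exists only so the check can be tried against an old log excerpt without waiting for a real stall.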

Post by tgriep »

Can you post your nagios.cfg and the ndomod.cfg file so we can view them?