ndo2db error troubleshooting

vnc786
Posts: 68
Joined: Thu Aug 29, 2013 8:45 am

ndo2db error troubleshooting

Post by vnc786 »

Hi team,

I just need to know what causes ndo2db errors and what preventive measures one should take to avoid them.

I keep getting the error below.
Hosts: 2780
Services: 25000
Nagios XI version: 2014 2.7

By the way, profile.zip is already on its way. This is a general question; I just need to know. I followed the FAQ and tried repairing MySQL. After the repair it seems to work perfectly. I just want to share an incident.

The Nagios instance is running on a VM. A couple of days ago there was a sudden surge in CPU load, with a load average of 7000+ (I can show you the graph, but not right now). The monitoring engine was not working. The immediate step taken to resolve the issue was a hard reboot of the server. We could not log in to the server, and the Nagios XI URL would not load, so we could not tell which process was at the top of the top output. After that I ran a MySQL repair and things were working fine.

Code: Select all

Nov 10 08:27:24 NAGPRDAPP1 ndo2db: Message sent to queue. 
Nov 10 08:27:24 NAGPRDAPP1 ndo2db: Warning: queue send error, retrying... 
Nov 10 08:27:25 NAGPRDAPP1 ndo2db: Message sent to queue. 
Nov 10 08:27:26 NAGPRDAPP1 ndo2db: Warning: queue send error, retrying... 
Nov 10 08:27:27 NAGPRDAPP1 ndo2db: Message sent to queue. 
Nov 10 08:27:27 NAGPRDAPP1 ndo2db: Warning: queue send error, retrying... 
Nov 10 08:27:28 NAGPRDAPP1 ndo2db: Message sent to queue. 
Nov 10 08:27:28 NAGPRDAPP1 ndo2db: Warning: queue send error, retrying... 
Nov 10 08:27:29 NAGPRDAPP1 ndo2db: Message sent to queue. 
Nov 10 08:27:29 NAGPRDAPP1 ndo2db: Warning: queue send error, retrying... 
Nov 10 08:27:30 NAGPRDAPP1 ndo2db: Message sent to queue. 
Nov 10 10:13:28 NAGPRDAPP1 ndo2db: Error: mysql_query() failed for 'UPDATE nagios_conninfo SET disconnect_time=NOW(), last_checkin_time=NOW(), data_end_time=FROM_UNIXTIME(0), bytes_processed='0', lines_processed='0', entries_processed='0' WHERE conninfo_id='0'' 
Nov 10 10:13:28 NAGPRDAPP1 ndo2db: mysql_error: 'MySQL server has gone away' 
Nov 10 10:13:28 NAGPRDAPP1 ndo2db: Error: Connection to MySQL database has been lost! 
Nov 10 13:00:01 NAGPRDAPP1 rsyslogd-2177: imuxsock lost 1188 messages from pid 26690 due to rate-limiting 
Nov 10 16:32:42 NAGPRDAPP1 ndo2db: Error: mysql_query() failed for 'UPDATE nagios_conninfo SET disconnect_time=NOW(), last_checkin_time=NOW(), data_end_time=FROM_UNIXTIME(0), bytes_processed='0', lines_processed='0', entries_processed='0' WHERE conninfo_id='0'' 
Nov 10 16:32:42 NAGPRDAPP1 ndo2db: mysql_error: 'MySQL server has gone away' 
Nov 10 16:32:42 NAGPRDAPP1 ndo2db: Error: Connection to MySQL database has been lost! 
Nov 10 18:14:01 NAGPRDAPP1 auditd[1618]: Audit daemon rotating log files
--thanks
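
When a log like the one above runs to thousands of lines, it can help to count how often each kind of ndo2db message appears before digging further. The following is a minimal Python sketch, not a Nagios tool: the function name `summarize_ndo2db` and the category labels are made up for illustration, and the patterns are taken only from the lines quoted in this post.

```python
import re
from collections import Counter

# Patterns copied from the ndo2db lines quoted above.
# The category names on the left are hypothetical labels for this sketch.
PATTERNS = {
    "queue_send_error": re.compile(r"ndo2db: Warning: queue send error"),
    "mysql_gone_away": re.compile(r"ndo2db: mysql_error: 'MySQL server has gone away'"),
    "mysql_connection_lost": re.compile(
        r"ndo2db: Error: Connection to MySQL database has been lost"
    ),
}


def summarize_ndo2db(lines):
    """Count occurrences of each known ndo2db message in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            # findall() also catches two entries fused onto one physical line.
            counts[name] += len(pattern.findall(line))
    return counts
```

Assuming the syslog file lives at `/var/log/messages` (an assumption; the path depends on the rsyslog configuration), it could be used as `summarize_ndo2db(open("/var/log/messages", errors="replace"))`. Note that rsyslogd reports dropping messages due to rate limiting in the excerpt above, so any counts would be a lower bound.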
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: ndo2db error troubleshooting

Post by rkennedy »

I'm glad you were able to fix this. It could have been some sort of loop that occurred at that time. To make sure your system is running properly, I have a few questions to help narrow this down:

1. Are you running local SQL or offloaded SQL?
2. What are the specs of your server running Nagios?
3. Can you post the output of top|head -5?
4. Were there any network issues going on at the time this happened?
Former Nagios Employee
vnc786
Posts: 68
Joined: Thu Aug 29, 2013 8:45 am

Re: ndo2db error troubleshooting

Post by vnc786 »

rkennedy wrote: 1. Are you running local SQL or offloaded SQL?
Local, but planning to offload. I have been using Nagios XI for 2 years but have never tried offloading. For a large installation, would you suggest I go for offload? By the way, we make a lot of changes using the bulk tool, and sometimes multiple people are editing services via CCM at the same time.

I have tuned my.cnf and will post the conf file here. I have also enabled the slow query log, so I do have a log for the moment when the CPU spiked.

rkennedy wrote: 2. What are the specs of your server running Nagios?
24 GB memory, 24 cores.
FYI, 70-80 percent of services are monitored during business hours only.

rkennedy wrote: 3. Can you post the output of top|head -5
Currently there is no load on the server. I will provide the output of top once I have access.

rkennedy wrote: 4. Were there any network issues going on at the time this happened?
No. I am monitoring this server (ServerA) from another Nagios XI server (ServerB). I checked the ping graph but not latency, RTA, etc.
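
To put the numbers from this thread in perspective: on Linux the load average counts tasks that are runnable or in uninterruptible wait, so a load average of 7000+ on 24 cores works out to roughly 290 such tasks per core. A small Python sketch of that arithmetic follows; the function names are hypothetical and nothing here is specific to Nagios XI.

```python
import os


def load_per_core(load_average, cores):
    """Return the load average divided by the number of cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_average / cores


def current_load_per_core():
    """Return the 1-minute load average per core on this machine (Unix only)."""
    # os.getloadavg() returns the 1-, 5- and 15-minute load averages.
    one_minute, _, _ = os.getloadavg()
    # os.cpu_count() can return None, so fall back to 1 core.
    return load_per_core(one_minute, os.cpu_count() or 1)
```

A value persistently well above 1.0 suggests more work is queued than the cores can run; `load_per_core(7000, 24)` gives about 291.7 for the incident described earlier.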
hsmith
Agent Smith
Posts: 3539
Joined: Thu Jul 30, 2015 11:09 am
Location: 127.0.0.1

Re: ndo2db error troubleshooting

Post by hsmith »

This may not be the issue, but 24 cores is a little high. Do you think you may be able to tune that down a little? This article has some information expanding on what I am talking about: http://www.gabesvirtualworld.com/how-to ... rformance/


Offloading is a good performance increase. If you're not using a RAMDisk you may also want to look into that. https://assets.nagios.com/downloads/nag ... giosXI.pdf is a good article describing RAMDisks.
Former Nagios Employee.