Nagios crash - iocache_read()...Bad address

dariusz.nalazek · Post by **dariusz.nalazek** » Mon Jun 15, 2020 4:38 am

Hello,

Recently Nagios XI has been crashing with the following in the logs:

Jun 14 19:16:45 xxxxx nagios: wproc: iocache_read() from Core Worker 15212 returned -1: Bad address
Jun 14 19:16:45 xxxxx nagios: wproc: iocache_read() from Core Worker 15212 returned -1: Bad address
Jun 14 19:16:45 xxxxx nagios: wproc: iocache_read() from Core Worker 15212 returned -1: Bad address
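
To see how often the error recurs, the matching lines can be counted in the syslog file. This is a minimal sketch and not part of the original report: the function name `count_iocache_errors` is hypothetical, and the path in the usage comment (`/var/log/messages`) is an assumption about where syslog writes on this host.

```shell
#!/bin/sh
# Count "iocache_read() ... Bad address" lines in a syslog-style file.
# count_iocache_errors is a hypothetical helper name used for illustration.
count_iocache_errors() {
    logfile="$1"
    # grep -c prints the number of matching lines (prints 0 when none match)
    grep -c 'iocache_read() from Core Worker [0-9]* returned -1: Bad address' "$logfile"
}

# Usage (log path is an assumption, adjust to your syslog target):
#   count_iocache_errors /var/log/messages
```

Comparing the count before and after a change (for example the raised open-files limit) gives a rough indication of whether the change had any effect.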

It is not caused by the recent upgrade to 5.7.x: it started a few days before the upgrade (on 5.6.14), and upgrading to 5.7.1 didn't solve it.
We have expanded our Nagios monitoring a lot recently, so it may be related to the number of checks.

We made some minor changes as a workaround, though we are not sure this is the right direction:
1) Changed nagios.service from type forking (with "-d") to simple, so that systemd can handle the service in the regular way (with the options Restart=always and RestartSec=30).
2) Raised the OS open-files limit from 10k to 256k.
3) As a stopgap until a real solution is applied, we built a "self-healing" service that restarts nagios.service when Nagios fails the "hard way".
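
For item 2, it may be worth confirming that the running Nagios process actually sees the new limit, since limits configured for login sessions do not necessarily apply to a service started by systemd (which takes its limit from the unit's `LimitNOFILE=` setting). The sketch below reads the soft "Max open files" value from a `/proc/<pid>/limits` file. The function name `soft_nofile_limit` is hypothetical, and the `pgrep` call in the usage comment assumes the main process is named `nagios`.

```shell
#!/bin/sh
# Print the soft "Max open files" limit from a /proc/<pid>/limits-style file.
# soft_nofile_limit is a hypothetical helper name used for illustration.
soft_nofile_limit() {
    # Row layout: "Max open files  <soft>  <hard>  files"
    # so the soft limit is the 4th whitespace-separated field.
    awk '/^Max open files/ { print $4 }' "$1"
}

# Usage (assumes the oldest process named exactly "nagios" is the main one):
#   soft_nofile_limit "/proc/$(pgrep -o -x nagios)/limits"
```

If the printed value is still the old limit, the change has not reached the service and the unit file is the next place to look.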

Nagios XI and the OS (RHEL 7) are up to date.

Darek.

benjaminsmith · Post by **benjaminsmith** » Mon Jun 15, 2020 3:45 pm

Hi Darek,

That's an error message coming from the monitoring engine. How often do you have to restart the nagios service to clear the issue? Also, I'd like to review the logs in the system profile to help troubleshoot the issue.

To send us your system profile:
Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Save the profile.zip file and share it in a private message or upload it to the post/ticket, then reply to this post to bring it up in the queue.

Thank you,
Benjamin

dariusz.nalazek · Post by **dariusz.nalazek** » Tue Jun 16, 2020 5:29 am

Three times in a short period (08.06, 10.06, 14.06).
It never happened before.

Yesterday we had to roll our server back to its state from 12.06 because of an issue with 5.7.1 and BPI monitoring, but that is a different topic.
So the logs on the Nagios XI server may be inconsistent. Luckily we keep all logs in Nagios LS for debugging, in case they are needed.
Profile sent via PM.

Darek.

ssax · Post by **ssax** » Tue Jun 16, 2020 5:13 pm

PHP Fatal error: Call to undefined function get_backend_xml_data() in /usr/local/nagiosxi/html/includes/components/historytab/historytab_do_stuff.php on line 49

The historytab component doesn't work in XI 5.7+.

You can remove the component:

Code: Select all

rm -rf /usr/local/nagiosxi/html/includes/components/historytab

Or edit this file:

Code: Select all

/usr/local/nagiosxi/html/includes/components/historytab/historytab_do_stuff.php

Comment out line 49 to stop it from failing on the comments:

Code: Select all

#$xml_nagios_comments = get_backend_xml_data($args_nagios_comments);
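
The manual edit can also be scripted while keeping a backup of the original file. This is only a sketch: the helper name `comment_out_line` is hypothetical, and it assumes line 49 of your copy of the file is still the `get_backend_xml_data()` call, so check that before running it.

```shell
#!/bin/sh
# Prefix one line of a file with "#" (a valid PHP single-line comment marker),
# keeping the untouched original next to it as <file>.bak.
# comment_out_line is a hypothetical helper name used for illustration.
comment_out_line() {
    file="$1"
    line="$2"
    # -i.bak edits in place and saves the original as "$file.bak"
    sed -i.bak "${line}s|^|#|" "$file"
}

# Usage (path taken from the post above):
#   comment_out_line /usr/local/nagiosxi/html/includes/components/historytab/historytab_do_stuff.php 49
```

Running it a second time adds another `#` and overwrites the backup with the already-edited file, so it is meant to be run once.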

I'm not sure whether changing the nagios.service unit file away from forking will let it work properly when the nagios service restarts during Apply Configuration and similar operations. You may want to test that.

Did you see any improvement when you increased the open limits?

I'm wondering if going from 5.6.14 directly to 5.7.1 (skipping 5.7.0) would resolve that issue.

I'm not really seeing anything else that stands out from your profile.

dariusz.nalazek · Post by **dariusz.nalazek** » Mon Jun 22, 2020 8:02 am

So far it is OK, and the service is set to forking again; simple was somehow unstable.

Darek.

ssax · Post by **ssax** » Mon Jun 22, 2020 5:49 pm

OK, glad it's stable. Keep an eye on it and let us know when we're okay to lock this thread and mark it as resolved.

Thank you!
