live data timeout settings

Posted: Fri May 14, 2021 4:15 pm
by nosajche
Hello,

We are trying to get our Fusion environment to fuse with Nagios XI and pull our Log Server data, but we are running into some problems.

We are running RHEL 7.9 (Maipo) on all boxes:

Fusion - v4.1.8
XI - v5.8.1
Log Server - v2.1.7

After successfully fusing the XI boxes (all green check marks for XI Fused servers and Log Server API authentication), we are seeing the following error message constantly in fusion.log:

Code: Select all

[2021-05-14 17:02:41] [SYSTEM] [ERROR]: poll_server() CHECK YOUR LIVE_DATA_TIMEOUT SETTINGS. IT MAY NEED INCREASED
In addition, dberrors.log is constantly growing with the following message:

Code: Select all

Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction
We have already adjusted the timeout values and changed the Polling config:
Memory Limit - 1 G
Simultaneous Pollers - 4
Live Data Timeout - 300 seconds
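
To gauge how quickly the two messages above are accumulating, a small helper like the following can count them. This is only a sketch: count_matches is a hypothetical helper name, and the paths in the comments assume both logs live under /usr/local/nagiosfusion/var/log, which this post does not state.

```shell
# Count how often a fixed string appears in a log file.
# Usage: count_matches "<fixed string>" <logfile>
count_matches() {
    local pattern="$1" logfile="$2"
    # -F: treat the pattern as a fixed string; -c: print only the match count.
    grep -F -c -- "$pattern" "$logfile"
}

# Example calls (paths are assumptions; adjust to your install):
# count_matches "Deadlock found" /usr/local/nagiosfusion/var/log/dberrors.log
# count_matches "LIVE_DATA_TIMEOUT" /usr/local/nagiosfusion/var/log/fusion.log
```

Running it twice a few minutes apart gives a rough growth rate for each message.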

Some of the data is showing up in the dashlets, but not all of it. Any ideas on what to troubleshoot?


Thanks.

Re: live data timeout settings

Posted: Tue May 18, 2021 9:54 am
by benjaminsmith
Hi,

It looks like the system may be having trouble processing all the incoming data. Let's run a few tests to confirm this.

1. If these are large systems, try increasing the polling interval beyond 300 seconds. You can set this value globally or on a per-server basis.

2. For testing purposes, try raising the Live Data Timeout setting to a very large number (600) to see if that resolves the error message in the Fusion log.

3. Try to truncate the polled data from the table so the server will start over with fresh data. Run this as root.

Code: Select all

cd /usr/local/nagiosfusion/scripts
./truncate_polled.php
This removes all of the temporary data used in the Fusion interface, so you will have to wait for the next poll before updated data appears.

Lastly, double-check the disk space on this server, just to rule that out.
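
A quick way to run the disk space check is sketched below. disk_use_pct is a hypothetical helper name, and the /var/lib/mysql path in the example is an assumption (the usual MariaDB data directory on RHEL), not something confirmed in this thread.

```shell
# Print the use% of the filesystem holding a path, as a bare number.
disk_use_pct() {
    # -P forces POSIX single-line output; column 5 is the capacity (e.g. 42%).
    df -P -- "$1" | awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# Example calls (second path is an assumption):
# disk_use_pct /usr/local/nagiosfusion
# disk_use_pct /var/lib/mysql
```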

Regards,
Benjamin

Re: live data timeout settings

Posted: Wed Jun 02, 2021 10:12 am
by nosajche
Hi Benjamin,

I've adjusted the polling interval for the larger servers to 600 seconds, set the timeout to 600 seconds, and run the truncate_polled.php script, but I am still receiving the same data timeout errors.

Disk space is also not an issue, definitely plenty of that to go around.

After looking at other threads, I also tried the curl command with the fuse key against the largest XI server, and it completes in reasonably good time:

Code: Select all

real	0m14.211s
user	0m0.096s
sys	0m0.179s

[1]-  Done                    time curl -k -XGET https://nagiosxihost.org/nagiosxi/api/v1/objects/servicestatus?fusekey=<FUSEKEY>
[3]+  Done                    user=nagiosadmin
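
A note on the output above: the "[1]-  Done" and "[3]+  Done  user=nagiosadmin" lines are shell job-control messages, which usually means the URL was not quoted and each "&" sent the preceding part to the background. In that case curl only received the URL up to the first "&", so the timing may not reflect the full request. A minimal sketch of a quoted call follows. build_status_url is a hypothetical helper, and only the two parameters visible in the paste are included; the job numbers suggest another parameter that is not shown and is not reconstructed here.

```shell
# Build the servicestatus URL with both visible query parameters.
# host, fusekey, and user are placeholders supplied by the caller.
build_status_url() {
    local host="$1" fusekey="$2" user="$3"
    printf 'https://%s/nagiosxi/api/v1/objects/servicestatus?fusekey=%s&user=%s\n' \
        "$host" "$fusekey" "$user"
}

# Quote the URL so the shell passes "&user=..." to curl instead of
# treating "&" as the background operator:
# time curl -k -XGET "$(build_status_url nagiosxihost.org '<FUSEKEY>' nagiosadmin)"
```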

Any other logs or anything else I can check on?


Thanks,


Jason

Re: live data timeout settings

Posted: Wed Jun 02, 2021 5:02 pm
by ssax
Please reboot the Fusion server to kill off any old processes. When it comes back up, go to Admin > and click the gear icon to clear the polling locks.

Attach this file as well:

Code: Select all

/etc/php.ini
Send a screenshot of your settings in Admin > System Settings > Data & Polling.

Re: live data timeout settings

Posted: Mon Jun 07, 2021 11:18 am
by ssax
Your php.ini looks good.

Please go to Admin > System Settings > Data & Polling and set the Polling Subsystem Memory Limit to -1 (that's minus 1) and change the Simultaneous Pollers to 1.

Then check the box for Mapped Users Polling.

Then wait 5 minutes and let me know if they start polling properly.

What do you see in Admin > Fusion Logs?

Do you see any errors in the files in /usr/local/nagiosfusion/var/log that could be related?
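
One way to scan that directory quickly is sketched below. errors_per_log is a hypothetical helper name, and it assumes the other log files mark errors with the same "[ERROR]" tag that fusion.log uses, which may not hold for every file.

```shell
# Print "<count> <file>" for every file in a log directory that has
# at least one [ERROR] line, highest count first.
errors_per_log() {
    local dir="$1"
    # -F fixed string, -c count per file, -H always print the file name,
    # -s suppress messages about unreadable entries such as subdirectories.
    grep -F -c -H -s -- '[ERROR]' "$dir"/* |
        awk -F: '{ n = $NF + 0; sub(/:[0-9]+$/, ""); if (n > 0) print n, $0 }' |
        sort -rn
}

# Example call:
# errors_per_log /usr/local/nagiosfusion/var/log
```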

Please note that for the hostgroups/servicegroups poll you may see the live data timeout message, but those can be ignored, as it's just the way that it's written.

Re: live data timeout settings

Posted: Wed Jun 09, 2021 10:10 am
by nosajche
We are still seeing the following log entries constantly:

Code: Select all

[2021-06-09 02:48:08] [SYSTEM] [ERROR]: poll_server() unable to poll data for s:<XI SERVER>, u:nagiosadmin, poll:servicegroup
[2021-06-09 02:48:08] [SYSTEM] [ERROR]: poll_server() CHECK YOUR LIVE_DATA_TIMEOUT SETTINGS. IT MAY NEED INCREASED
[2021-06-09 04:32:19] [SYSTEM] [ERROR]: poll_server() unable to poll data for s:<XI SERVER>, u:nagiosadmin, poll:servicegroup
[2021-06-09 04:32:19] [SYSTEM] [ERROR]: poll_server() CHECK YOUR LIVE_DATA_TIMEOUT SETTINGS. IT MAY NEED TO BE INCREASED
[2021-06-09 04:37:21] [SYSTEM] [ERROR]: poll_server() unable to poll data for s:<XI SERVER>, u:nagiosadmin, poll:servicegroup
[2021-06-09 04:37:21] [SYSTEM] [ERROR]: poll_server() CHECK YOUR LIVE_DATA_TIMEOUT SETTINGS. IT MAY NEED TO BE INCREASED
[2021-06-09 04:42:24] [SYSTEM] [ERROR]: poll_server() unable to poll data for s:<XI SERVER>, u:nagiosadmin, poll:servicegroup
However, your response would seem to indicate this is a normal thing? Why is that?
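
To check whether the failures are limited to one poll type, the "unable to poll data" lines can be tallied by the value after "poll:". This is a sketch based only on the log format shown above; poll_failures_by_type is a hypothetical helper name, and the fusion.log path in the example is an assumption.

```shell
# Tally "unable to poll data" errors by poll type (the value after "poll:").
poll_failures_by_type() {
    grep -F 'unable to poll data' "$1" |
        sed -n 's/.*poll:\([A-Za-z_]*\).*/\1/p' |
        sort | uniq -c | sort -rn
}

# Example call (path is an assumption):
# poll_failures_by_type /usr/local/nagiosfusion/var/log/fusion.log
```

If every line comes back as servicegroup or hostgroup, the entries match the ignorable case described earlier in the thread.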

Furthermore, after making the changes you detailed, the following entries started to show in fusion.log:

Code: Select all

[2021-06-09 10:58:02] [SYSTEM] [ERROR]: insert_polled_data() caught exception: Unable to UPDATE polled_data ( [:polled_data_id] => 7146, [:monitoring_engine] => 0, [:notifications] => 0, [:active_checks] => 0, [:passive_checks] => 0, [:event_handlers] => 0 )
[2021-06-09 10:58:02] [SYSTEM] [ERROR]: insert_polled_data() db error: ( )
The other log files within that /var/log location don't have any other entries that seem helpful.

Re: live data timeout settings

Posted: Thu Jun 10, 2021 10:00 am
by ssax
That's because there is a bug in there but it doesn't impact anything other than you seeing that message.

Try doing this again:

Code: Select all

cd /usr/local/nagiosfusion/scripts
./truncate_polled.php
What is the output of these commands on the Fusion server?

Code: Select all

uname -a
cat /etc/*release
rpm -qa | grep -i maria
rpm -qa | grep -i mysql
dpkg --list | grep -i maria
dpkg --list | grep -i mysql
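
Only one of the rpm or dpkg pairs will work on a given host; the other pair just reports that the command is missing. A small sketch that picks whichever package manager is present is shown here. list_db_packages is a hypothetical helper name.

```shell
# List installed packages whose names match a pattern, using whichever
# package manager exists on the host.
list_db_packages() {
    local pattern="$1"
    if command -v rpm >/dev/null 2>&1; then
        rpm -qa | grep -i -- "$pattern"
    elif command -v dpkg >/dev/null 2>&1; then
        dpkg --list | grep -i -- "$pattern"
    else
        echo "no rpm or dpkg found" >&2
        return 1
    fi
}

# Example calls:
# list_db_packages maria
# list_db_packages mysql
```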

Re: live data timeout settings

Posted: Thu Jun 10, 2021 2:41 pm
by nosajche

Code: Select all

[root@fusionserver scripts]# uname -a
Linux fusionserver 3.10.0-1160.21.1.el7.x86_64 #1 SMP Mon Feb 22 18:03:13 EST 2021 x86_64 x86_64 x86_64 GNU/Linux

[root@fusionserver scripts]# cat /etc/*release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.9 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.9"
PRETTY_NAME=RHEL
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.9:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.9
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.9"
Red Hat Enterprise Linux Server release 7.9 (Maipo)
Red Hat Enterprise Linux Server release 7.9 (Maipo)

[root@fusionserver scripts]# rpm -qa | grep -i maria
mariadb-5.5.68-1.el7.x86_64
mariadb-libs-5.5.68-1.el7.x86_64
mariadb-devel-5.5.68-1.el7.x86_64
mariadb-server-5.5.68-1.el7.x86_64

[root@fusionserver scripts]# rpm -qa | grep -i mysql
perl-DBD-MySQL-4.023-6.el7.x86_64
php-mysql-5.4.16-48.el7.x86_64

Re: live data timeout settings

Posted: Fri Jun 11, 2021 1:00 pm
by ssax
Please create a ticket for this and include a link back to this forum thread so we can get a remote session set up:

https://support.nagios.com/tickets/

Thank you!