live data timeout settings
Posted: Fri May 14, 2021 4:15 pm
by nosajche
Hello,
We are trying to fuse our Fusion environment with Nagios XI and pull in our Log Server data, but we are running into some problems.
We are running RHEL 7.9 (Maipo) on all boxes:
Fusion - v4.1.8
XI - v5.8.1
Log Server - v2.1.7
After successfully fusing the XI boxes (all green check marks for XI Fused servers and Log Server API authentication), we are seeing the following error message constantly in fusion.log:
Code:
[2021-05-14 17:02:41] [SYSTEM] [ERROR]: poll_server() CHECK YOUR LIVE_DATA_TIMEOUT SETTINGS. IT MAY NEED INCREASED
In addition, dberrors.log is constantly growing with the following message:
Code:
Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction
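To judge whether a settings change actually reduces the deadlocks, it can help to count the messages before and after the change. A minimal sketch: `count_deadlocks` is a hypothetical helper (not part of Fusion), and the log path in the usage comment is an assumption.

```shell
# Hypothetical helper: count deadlock messages in a log file.
count_deadlocks() {
    grep -c 'Deadlock found when trying to get lock' "$1"
}

# Usage (the path is an assumption; adjust to where dberrors.log lives):
#   count_deadlocks /usr/local/nagiosfusion/var/log/dberrors.log
```

Running it before and after a change gives a rough rate to compare, rather than relying on the file "constantly growing".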
We have already adjusted the timeout values and changed the Polling config:
Memory Limit - 1 G
Simultaneous Pollers - 4
Live Data Timeout - 300 seconds
Some of the data shows up in the dashlets, but not all of it. Any ideas on what to troubleshoot?
Thanks.
Re: live data timeout settings
Posted: Tue May 18, 2021 9:54 am
by benjaminsmith
Hi,
It looks like the system may be having trouble processing all the incoming data. Let's run a few tests to confirm this.
1. If these are large systems, try increasing the polling interval beyond 300 seconds. You can set this value globally or on a per-server basis.
2. For testing purposes, try raising the Live Data Timeout setting to a very large number (600) to see if that resolves the error message in the fusion log.
3. Try to truncate the polled data from the table so the server will start over with fresh data. Run this as root.
Code:
cd /usr/local/nagiosfusion/scripts
./truncate_polled.php
This will remove all of the temporary data that is used in the Fusion interface, so you will have to wait until the next poll for updated data.
Lastly, double-check the disk space on this server, just to rule that out.
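For the disk space check, something like the following reports how full the filesystem holding a given directory is. This is only a sketch: `fs_usage_pct` is a hypothetical helper name, and /usr/local/nagiosfusion is assumed to be the install root based on the script path above.

```shell
# Hypothetical helper: print the Use% (as a bare number) of the
# filesystem that holds the given directory.
fs_usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage (install root assumed from the script path above):
#   fs_usage_pct /usr/local/nagiosfusion
```

It is worth pointing it at the database's data directory as well, since the polled data is stored there.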
Regards,
Benjamin
Re: live data timeout settings
Posted: Wed Jun 02, 2021 10:12 am
by nosajche
Hi Benjamin,
I've set the polling interval for the larger servers to 600 seconds, set the timeout to 600 seconds, and run the truncate_polled.php script, but I am still receiving the same live data timeout errors.
Disk space is also not an issue, definitely plenty of that to go around.
Also, after looking at other threads, I tried the curl command with the fuse key against the largest XI server, and it completes in pretty good time:
Code:
real 0m14.211s
user 0m0.096s
sys 0m0.179s
[1]- Done time curl -k -XGET https://nagiosxihost.org/nagiosxi/api/v1/objects/servicestatus?fusekey=<FUSEKEY>
[3]+ Done user=nagiosadmin
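One caveat about the output above: the `[1]- Done` and `[3]+ Done user=nagiosadmin` job-control lines suggest the URL was not quoted, so the shell probably treated each `&` as a background operator and curl only received the part of the URL up to the first `&`. The timing may therefore not reflect the full request. A hedged sketch of a quoted version follows; the helper names are hypothetical, and the parameter set (`fusekey` plus `user`) is an assumption based on what is visible in the output.

```shell
# Build the servicestatus URL. Host, key, and user are placeholders;
# the exact parameter list is an assumption.
xi_status_url() {
    printf 'https://%s/nagiosxi/api/v1/objects/servicestatus?fusekey=%s&user=%s\n' "$1" "$2" "$3"
}

# Time one request and print the total seconds reported by curl.
# The URL is passed in double quotes so the shell does not split on "&".
# The optional 4th argument is the client-side timeout in seconds.
time_xi_poll() {
    curl -k -s -o /dev/null --max-time "${4:-300}" \
        -w '%{time_total}\n' "$(xi_status_url "$1" "$2" "$3")"
}

# Usage:
#   time_xi_poll nagiosxihost.org '<FUSEKEY>' nagiosadmin 600
```

Setting `--max-time` to the same value as the Live Data Timeout makes it easier to compare the result against what the poller would tolerate.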
Any other logs or anything else I can check on?
Thanks,
Jason
Re: live data timeout settings
Posted: Wed Jun 02, 2021 5:02 pm
by ssax
Please reboot the Fusion server to kill off any old processes. When it comes back up, please go to Admin > and click the gear icon to clear the polling locks.
Attach this file as well:
Send a screenshot of your settings in Admin > System Settings > Data & Polling.
Re: live data timeout settings
Posted: Mon Jun 07, 2021 11:18 am
by ssax
Your php.ini looks good.
Please go to Admin > System Settings > Data & Polling and set the Polling Subsystem Memory Limit to -1 (that's minus 1) and change the Simultaneous Pollers to 1.
Then check the box for Mapped Users Polling.
Then wait 5 minutes and let me know if they start polling properly.
What do you see in Admin > Fusion Logs?
Do you see any errors in the files in /usr/local/nagiosfusion/var/log that could be related?
Please note that for the hostgroups/servicegroups poll you may see the live data timeout message, but those can be ignored; it's just the way that it's written.
Re: live data timeout settings
Posted: Wed Jun 09, 2021 10:10 am
by nosajche
We are still seeing the following log entries constantly:
Code:
[2021-06-09 02:48:08] [SYSTEM] [ERROR]: poll_server() unable to poll data for s:<XI SERVER>, u:nagiosadmin, poll:servicegroup
[2021-06-09 02:48:08] [SYSTEM] [ERROR]: poll_server() CHECK YOUR LIVE_DATA_TIMEOUT SETTINGS. IT MAY NEED INCREASED
[2021-06-09 04:32:19] [SYSTEM] [ERROR]: poll_server() unable to poll data for s:<XI SERVER>, u:nagiosadmin, poll:servicegroup
[2021-06-09 04:32:19] [SYSTEM] [ERROR]: poll_server() CHECK YOUR LIVE_DATA_TIMEOUT SETTINGS. IT MAY NEED TO BE INCREASED
[2021-06-09 04:37:21] [SYSTEM] [ERROR]: poll_server() unable to poll data for s:<XI SERVER>, u:nagiosadmin, poll:servicegroup
[2021-06-09 04:37:21] [SYSTEM] [ERROR]: poll_server() CHECK YOUR LIVE_DATA_TIMEOUT SETTINGS. IT MAY NEED TO BE INCREASED
[2021-06-09 04:42:24] [SYSTEM] [ERROR]: poll_server() unable to poll data for s:<XI SERVER>, u:nagiosadmin, poll:servicegroup
However, your response would seem to indicate this is a normal thing? Why is that?
Furthermore, after making the changes you detailed, the following entries started to show up in fusion.log:
Code:
[2021-06-09 10:58:02] [SYSTEM] [ERROR]: insert_polled_data() caught exception: Unable to UPDATE polled_data ( [:polled_data_id] => 7146, [:monitoring_engine] => 0, [:notifications] => 0, [:active_checks] => 0, [:passive_checks] => 0, [:event_handlers] => 0 )
[2021-06-09 10:58:02] [SYSTEM] [ERROR]: insert_polled_data() db error: ( )
The other log files within that /var/log location don't have any other entries that seem helpful.
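To check whether the "unable to poll data" failures are confined to the hostgroup/servicegroup polls described as ignorable, the entries can be tallied by their `poll:` field. This is a rough sketch that assumes every failure line ends with `poll:<type>` as in the entries above; `failed_polls_by_type` is a hypothetical helper name.

```shell
# Hypothetical helper: tally "unable to poll data" entries by poll type.
# Prints one "<type> <count>" pair per line, sorted by type name.
failed_polls_by_type() {
    grep 'unable to poll data' "$1" \
        | sed -n 's/.*poll:\([A-Za-z_]*\).*/\1/p' \
        | sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage:
#   failed_polls_by_type /usr/local/nagiosfusion/var/log/fusion.log
```

If any type other than the group polls appears in the result, that would point to a problem beyond the cosmetic message.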
Re: live data timeout settings
Posted: Thu Jun 10, 2021 10:00 am
by ssax
That's because there is a bug in there, but it doesn't affect anything beyond you seeing that message.
Try doing this again:
Code:
cd /usr/local/nagiosfusion/scripts
./truncate_polled.php
What is the output of these commands on the Fusion server?
Code:
uname -a
cat /etc/*release
rpm -qa | grep -i maria
rpm -qa | grep -i mysql
dpkg --list | grep -i maria
dpkg --list | grep -i mysql
Re: live data timeout settings
Posted: Thu Jun 10, 2021 2:41 pm
by nosajche
Code:
[root@fusionserver scripts]# uname -a
Linux fusionserver 3.10.0-1160.21.1.el7.x86_64 #1 SMP Mon Feb 22 18:03:13 EST 2021 x86_64 x86_64 x86_64 GNU/Linux
[root@fusionserver scripts]# cat /etc/*release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.9 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.9"
PRETTY_NAME=RHEL
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.9:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.9
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.9"
Red Hat Enterprise Linux Server release 7.9 (Maipo)
Red Hat Enterprise Linux Server release 7.9 (Maipo)
[root@fusionserver scripts]# rpm -qa | grep -i maria
mariadb-5.5.68-1.el7.x86_64
mariadb-libs-5.5.68-1.el7.x86_64
mariadb-devel-5.5.68-1.el7.x86_64
mariadb-server-5.5.68-1.el7.x86_64
[root@fusionserver scripts]# rpm -qa | grep -i mysql
perl-DBD-MySQL-4.023-6.el7.x86_64
php-mysql-5.4.16-48.el7.x86_64
Re: live data timeout settings
Posted: Fri Jun 11, 2021 1:00 pm
by ssax
Please create a ticket for this and include a link back to this forum thread so we can get a remote session set up:
https://support.nagios.com/tickets/
Thank you!