Extremely slow logins to Nagios XI WebUI
Posted: Wed May 06, 2020 7:26 pm
I'm migrating our existing Nagios XI installation from RHEL 6 to RHEL 7. I manually installed the program on a VMware VM running RHEL 7.8 (Maipo), kernel 3.10.0-1127.el7.x86_64, following these instructions.
I then migrated the data from the RHEL 6 machine to the RHEL 7 machine following these instructions. The VM is configured with four single-core CPUs running at 2194.917 MHz and 8 GB of memory (8009044 kB total, with 6514888 kB free).
The only hangup I had in the migration was that the nagiosxi database did not exist on the RHEL 6 installation. To work around this, I manually created the database, then dumped it, and then added it to the tarball before copying to the RHEL 7 machine.
All services started without error except httpd. For httpd, I had to change a few configurations:
- Changed "Order allow,deny" and "Allow from all" to "Require all granted"
- Removed the SSLMutex directive
- Commented out a couple of modules that don't exist in Apache 2.4.6

- PHP Warning: Invalid argument supplied for foreach() in /usr/local/nagiosxi/html/includes/components/nagiosim/nagiosim.inc.php on line 491
- AH02282: No slotmem from mod_heartmonitor

The only other changes I've made to the system were:
- Changed log file locations, moving the Nagios Core logs, and some others, to /var/log/nagios
- Enabled slow query logging on MariaDB
- Updated the Program URL in Admin > System Settings
Here are log outputs from the RHEL 7 installation showing the time to log in.
Code:
==> ssl_access_log <==
<IP Addr> - - [06/May/2020:17:59:28 -0500] "POST /nagiosxi/login.php HTTP/1.1" 302 -
==> ssl_request_log <==
[06/May/2020:17:59:28 -0500] <IP Addr> TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "POST /nagiosxi/login.php HTTP/1.1" -
==> ssl_access_log <==
<IP Addr> - - [06/May/2020:18:01:35 -0500] "GET /nagiosxi/index.php HTTP/1.1" 200 41681
There are no utilization issues on the machine (I've watched top, dstat, and iptraf-ng while logging in, and utilization is low across the board).
In other threads, I've seen questions about the number of host and service checks configured, so I got these from Admin > Monitoring Engine Status (I'm not sure if that's the right place to look).
Code:
Active Host Checks -- 1-min: 26, 5-min: 233, 15-min: 336
Active Service Checks -- 1-min: 632, 5-min: 3,191, 15-min: 5,988
No passive checks
Currently, the RHEL 7 installation is blocked from communicating with any of the NRPE agents in the infrastructure. The NRPE configuration files don't have this server's IP listed, and our firewall doesn't allow the traffic to this server.
To confirm that network issues weren't causing the problem, I shut down each process (listed in Monitoring Engine Status) one by one, retrying the login after each. None of them had any effect on login speed. I then shut down the DB Backend, Performance Grapher, and finally the Monitoring Engine, and login times remained steady at 2:07.
The only meaningful error that I've observed comes from the ssl_error_log:
Code:
[Wed May 06 18:23:59.340630 2020] [:error] [pid 18260] [client <IP Addr>:64362] PHP Warning: ldap_bind(): Unable to bind to server: Can't contact LDAP server in /usr/local/nagiosxi/html/includes/components/ldap_ad_integration/adLDAP/src/adLDAP.php on line 714, referer: https://<Nagios URL>/nagiosxi/login.php
The base DN, account suffix, and controllers are all correct. The LDAP/AD configuration was done through the WebUI, so I'm not sure if there's something else in that file that would need to be configured. I'm not even sure it matters, because we connect through Active Directory and I can still log in using my AD credentials. The problem also occurs when I log in as the nagiosadmin user, which isn't defined in AD, so it shouldn't reach out to AD at all to validate those credentials.
What else is there to look at to explain these painfully slow logins?
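For reference, the 2:07 figure can be reproduced from the two ssl_access_log timestamps in the log output above (the POST to login.php and the following GET of index.php). This is only a minimal Python sketch of that arithmetic, assuming the standard Apache timestamp format shown in the logs; the helper name login_delay is made up for illustration and is not part of Nagios XI.

```python
from datetime import datetime

# Apache's default access-log timestamp format, e.g. "06/May/2020:17:59:28 -0500".
APACHE_TS = "%d/%b/%Y:%H:%M:%S %z"


def login_delay(post_ts: str, get_ts: str) -> str:
    """Return the gap between two Apache timestamps as M:SS (hypothetical helper)."""
    start = datetime.strptime(post_ts, APACHE_TS)
    end = datetime.strptime(get_ts, APACHE_TS)
    seconds = int((end - start).total_seconds())
    minutes, rest = divmod(seconds, 60)
    return f"{minutes}:{rest:02d}"


# Timestamps copied from the POST /nagiosxi/login.php and
# GET /nagiosxi/index.php entries in the log output.
print(login_delay("06/May/2020:17:59:28 -0500", "06/May/2020:18:01:35 -0500"))  # 2:07
```

The same helper can be pointed at any other pair of access-log entries to check whether the delay is constant across login attempts.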