Nagios XI 5.3.8 suddenly needed ixed.7.4ts.lin today
-
- Posts: 8
- Joined: Mon Aug 05, 2019 8:54 am
Nagios XI 5.3.8 suddenly needed ixed.7.4ts.lin today
This morning I logged into our version 5.8.3 Nagios XI host and received a database error. Apparently Nagios decided overnight that it wanted ixed.7.4ts.lin instead of ixed.7.4.lin, so I downloaded it, copied it into /usr/lib64/php-zts/modules as instructed, and made the entry in /etc/php.ini as instructed. The next several hours were spent re-downloading from SourceGuardian and re-installing 5.8.3. PHP complained, and the /usr/local/nagiosxi/scripts/repair_databases.sh script we were told to run wouldn't run until I removed the 'extension=ixed.7.4ts.lin' line from the php.ini file. Once that was removed the script ran and Nagios restarted happy as a clam, but the home page wouldn't render. It kept telling us ixed.7.4ts.lin was needed. Finally I copied the /etc/php.d/sourceguardian.ini file to the /etc/php-zts.d/ directory and restarted the application.
That got the nagiosxi home page to render and I could log in. 'php --version' still worked cleanly and the repair scripts worked cleanly. Nice. Not. A couple of minutes into trying to add a hostgroup to an NRPE service we created yesterday, it locked up. Database problems, needed repairing. I logged into the host, ran the /tmp/nagiosxi/upgrade script once again to re-install with ixed.7.4ts.lin in the /etc/php-zts.d/sourceguardian.ini file, restarted apache and nagios, ran 'php --version' and it was clean. Ran /usr/local/nagiosxi/scripts/repairmysql.sh this time, restarted apache and nagios. Got a login page and logged in. False sense of hope ensued. Did I finally get past the problem? No: database access problem, please run the repair scripts. So I logged into the support site and am typing this message. The machine is CentOS 7.9.2009 (Core). df -h shows '/dev/nvme0n1p1 20G 9.8G 11G 49% /'
Can someone please help me understand what is wrong and help me fix this without losing everything?
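The "ts" in ixed.7.4ts.lin marks the SourceGuardian loader for a thread-safe (ZTS) PHP build, so which file is needed depends on which PHP build is loading it. A minimal sketch for checking this from the shell, assuming the php CLI is on the PATH. loader_name is a hypothetical helper written for this illustration, and the web server may load a different PHP build than the CLI, so treat the output as a hint only:

```shell
#!/bin/sh
# Build the SourceGuardian loader file name for a Linux PHP build.
# $1 = PHP major.minor (e.g. 7.4), $2 = "yes" if the build is thread-safe (ZTS)
# (hypothetical helper, not part of Nagios XI or SourceGuardian)
loader_name() {
    suffix=""
    if [ "$2" = "yes" ]; then
        suffix="ts"
    fi
    printf 'ixed.%s%s.lin\n' "$1" "$suffix"
}

# Ask the PHP CLI which version it is and whether it is a ZTS build.
if command -v php >/dev/null 2>&1; then
    ver=$(php -r 'echo PHP_MAJOR_VERSION . "." . PHP_MINOR_VERSION;')
    zts=$(php -r 'echo PHP_ZTS ? "yes" : "no";')
    echo "CLI PHP $ver (thread safe: $zts) expects $(loader_name "$ver" "$zts")"
    echo "extension_dir: $(php -r 'echo ini_get("extension_dir");')"
fi
```

If the CLI reports a non-ZTS build while the web page still asks for the "ts" loader, that would suggest Apache is loading the ZTS PHP module, which reads its own configuration directory (here /etc/php-zts.d/).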
Re: Nagios XI 5.3.8 suddenly needed ixed.7.4ts.lin today
Did you have a PHP update installed on your server? I know we had to go to PHP 7.2 and had to make extensive modifications to make it work under RHEL 7.
-
- Posts: 8
- Joined: Mon Aug 05, 2019 8:54 am
Re: Nagios XI 5.3.8 suddenly needed ixed.7.4ts.lin today
Thanks for that. I hadn't before the problem started, but at one point late this morning I ran 'sudo yum update -y' and got a bunch of PHP 7.4 updates. It didn't affect the problems one way or the other. I did notice that in the file that uses SourceGuardian (/usr/local/nagiosxi/includes/dbl.inc.php) there was a Windows ^M at the end of the line, as if someone hadn't run the build through dos2unix. But I doubt that makes a difference.
-
- Posts: 5324
- Joined: Wed Aug 22, 2018 4:39 pm
- Location: saint paul
Re: Nagios XI 5.3.8 suddenly needed ixed.7.4ts.lin today
Hi @distracted24x7,
Officially, we support the version of PHP provided by the operating system vendor. If you need to run PHP 7.4, I would recommend migrating to Ubuntu Server 20.04, as that will be much easier to manage and is fully tested and supported.
To upgrade PHP, did you follow the guide below?
https://support.nagios.com/kb/article/n ... 7-860.html
Try re-installing source guardian once more on the server.
Nagios XI - Installing Latest SourceGuardian Loaders
Also, please send us the system profile.
To send us your system profile:
Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Regards,
Benjamin
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Be sure to check out our Knowledgebase for helpful articles and solutions!
-
- Posts: 8
- Joined: Mon Aug 05, 2019 8:54 am
Re: Nagios XI 5.3.8 suddenly needed ixed.7.4ts.lin today
That is most curious. This has been running PHP 7.4 for over a year without a problem. We have gone through multiple upgrades in that time. I followed your 'Upgrading to PHP7' script when we upgraded it.
I tried several times to get a System Profile but cannot. The machine only stays up for a minute or two before getting a Database Error. When I can get to the admin page and push the button for Download Profile, I get black on black text that says
PROFILE BUILD FAILED
Array
(
)
CODE: 1
I tried running the database repair script once again (sudo /usr/local/nagiosxi/scripts/repair_databases.sh) and that says it is fixing a great many things, except the problem. Is there some way I can stop that problem so I can get you what you need?
-
- Posts: 5324
- Joined: Wed Aug 22, 2018 4:39 pm
- Location: saint paul
Re: Nagios XI 5.3.8 suddenly needed ixed.7.4ts.lin today
Hi @distracted24x7,
The error message you're seeing when trying to download a system profile usually means that the sudoers file is incorrect, which can cause other issues as well.
Please try to restore the sudoers file to the default by following the steps in the KB article below.
Nagios XI - Profile Build Failed
If that doesn't work, try running the commands below to generate a profile from the CLI.
Code:
rm -rf /usr/local/nagiosxi/var/components/profile.zip
/usr/local/nagiosxi/scripts/components/getprofile.sh SUPPORT
Then send over the resulting /usr/local/nagiosxi/var/components/profile.zip file. If the profile script fails, please post the entire output to the thread.
Thanks,
Benjamin
-
- Posts: 8
- Joined: Mon Aug 05, 2019 8:54 am
Re: Nagios XI 5.3.8 suddenly needed ixed.7.4ts.lin today
We recently had a PAM authentication failure for the nagios user on our XI instance. I don't touch the default sudoers file; I add permissions with custom /etc/sudoers.d files. We have one for nagios, and it looks like this now:
nagios ALL = NOPASSWD: ALL
NAGIOSXIWEB ALL = NOPASSWD:/usr/local/nagiosxi/scripts/components/getprofile.sh
The database problem has been resolved. We added several hosts and services to our configuration recently, so I thought perhaps Nagios hadn't updated the maximum allowed connections, and I checked. Sure enough, allowed connections was 151 and max attempted connections was 152. I added the max_connections=1000 and open_files_limit = 4096 lines to /etc/my.cnf and restarted MariaDB, then Nagios. It has been working since. Running the /tmp/nagiosxi sourceguardian script seems to have taken care of the ixed.7.4ts.lin problem as well. I will try to get you a dump of the profile to see if there is anything else we should address, but for now Nagios XI is back to working, save the Download Profile link. Thanks for your help, Benjamin.
--Eric D. Smith
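For anyone wanting to repeat the connection check described above, here is a rough sketch. It assumes the mysql client can log in without prompting (for example via ~/.my.cnf); limit_reached is a hypothetical helper, not a Nagios XI or MariaDB tool. The server permits one connection beyond max_connections for a privileged account, which would be consistent with seeing 152 against a limit of 151:

```shell
#!/bin/sh
# Succeed (exit 0) when the high-water mark has reached the configured limit.
# $1 = Max_used_connections, $2 = max_connections
# (hypothetical helper for illustration)
limit_reached() {
    [ "$1" -ge "$2" ]
}

# Read both values from the running server, if a client is available.
if command -v mysql >/dev/null 2>&1; then
    max=$(mysql -N -B -e "SHOW VARIABLES LIKE 'max_connections'" 2>/dev/null | awk '{print $2}')
    used=$(mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'Max_used_connections'" 2>/dev/null | awk '{print $2}')
    if [ -n "$max" ] && [ -n "$used" ]; then
        if limit_reached "$used" "$max"; then
            echo "Connection limit reached: $used used, $max allowed"
        else
            echo "Connections OK: $used used, $max allowed"
        fi
    fi
fi
```

Max_used_connections is a high-water mark since the last server start, so it resets whenever MariaDB is restarted.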
-
- Posts: 5324
- Joined: Wed Aug 22, 2018 4:39 pm
- Location: saint paul
Re: Nagios XI 5.3.8 suddenly needed ixed.7.4ts.lin today
Hi Eric,
That's good news. I got the profile and it looks pretty good. I saw the following error in the command subsystem log, but it looks like that cleared up with the changes to the database settings.
Code:
<p><pre>SQL Error [nagiosxi] : MySQL server has gone away</pre></p>
The system is not very large, with approximately 500 hosts and services. If you continue to add more checks, I would increase the number of CPUs.
I would recommend following the article below to increase the timeout and load_threshold for the performance graphs. Be careful with the max_load_threshold, as it will consume as much as you give it. I would start by increasing it to 20 or 30.
Code:
[04-14-2021 08:11:11] NPCD: WARN: MAX load reached: load 30.410000/10.000000 at i=1
Let us know if you need any other assistance. Also, if you do not have a test server set up, your Nagios XI license allows for 3 activations: production, test, and backup. I would recommend setting one up to test out major or minor releases and any other system changes.
See: https://support.nagios.com/kb/article.php?id=145
Best Regards,
Benjamin
-
- Posts: 8
- Joined: Mon Aug 05, 2019 8:54 am
Re: Nagios XI 5.3.8 suddenly needed ixed.7.4ts.lin today
Thanks man. I really appreciate you helping me out.
I increased the TIMEOUT value from 5 to 20 in /usr/local/nagios/etc/pnp/process_perfdata.cfg
I also increased the load_threshold from 10.0 to 20.0 in /usr/local/nagios/etc/pnp/npcd.cfg
Please let me know if I did either of these wrong. I restarted nagios after making those changes. The "MySQL server has gone away" issue was me: a typo in 'max_connections' kept mysql down for a few minutes. Oops.
One thing I am getting some complaints about is that the graphs for most services don't have nearly the detail of the old Nagios Core machines or CloudWatch. Hopefully this helps with that some.
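A quick way to confirm that both edits are what the daemons will read is to print the active values back out. This is only a sketch: cfg_value is a hypothetical helper, and it assumes both files use plain "key = value" lines with "#" for comments:

```shell
#!/bin/sh
# Print the value of the first uncommented "key = value" line in a config file.
# $1 = key, $2 = file   (hypothetical helper for illustration)
cfg_value() {
    awk -F= -v key="$1" '
        /^[ \t]*#/ { next }                      # skip commented-out lines
        {
            k = $1; gsub(/[ \t]/, "", k)         # key without whitespace
            if (k == key) {
                v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v)
                print v; exit
            }
        }' "$2"
}

# Paths and keys as given in the post above.
for pair in "TIMEOUT:/usr/local/nagios/etc/pnp/process_perfdata.cfg" \
            "load_threshold:/usr/local/nagios/etc/pnp/npcd.cfg"; do
    key=${pair%%:*}
    file=${pair#*:}
    if [ -r "$file" ]; then
        echo "$key = $(cfg_value "$key" "$file")"
    fi
done
```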
-
- Posts: 5324
- Joined: Wed Aug 22, 2018 4:39 pm
- Location: saint paul
Re: Nagios XI 5.3.8 suddenly needed ixed.7.4ts.lin today
Hi @distracted24x7,
I would think the graph would be the same between Core and XI, but there could be some difference in the RRD graph setup between the two systems.
The following KB article discusses how the graphs work: how the data is stored in RRD files and how it is averaged out over time.
Nagios XI - Performance Data Averaging
--Benjamin
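To illustrate why averaging costs detail: an RRD file keeps several archives, each averaging a fixed number of samples into one stored point, so archives that cover longer periods hold coarser points. The arithmetic can be sketched as below. The 60-second step and the archive sizes are illustrative assumptions, not values taken from this system, and rra_span is a hypothetical helper:

```shell
#!/bin/sh
# Print the resolution and retention of one round-robin archive.
# $1 = step in seconds, $2 = samples averaged per stored point, $3 = stored points
# (hypothetical helper; the numbers passed below are illustrative only)
rra_span() {
    resolution=$(( $1 * $2 ))
    retention=$(( resolution * $3 ))
    echo "${resolution}s per point, kept for $(( retention / 86400 )) days"
}

rra_span 60 1 2880   # every sample kept as-is
rra_span 60 5 2880   # five samples averaged into each stored point
```

Under these assumed numbers the first archive keeps one-minute detail for 2 days, while the second keeps five-minute averages for 10 days, so a short spike that is visible in the first is flattened in the second.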