Re: Off-load Database option at install?

Posted: Fri Oct 01, 2021 7:28 am
by TBT
I believe it was successful. There is a lot going on with these scripts, but the last few lines of output were:
Performing upgrade...
> Updating dbversion table
Database upgrade is complete!
Job for nagios.service failed because the control process exited with error code.
See "systemctl status nagios.service" and "journalctl -xe" for details.
Finished restore repair OK
I've gone ahead and PM'd you the profile.

Re: Off-load Database option at install?

Posted: Fri Oct 01, 2021 4:59 pm
by benjaminsmith
Hi TBT,

Thank you for the profile. The upgrade completed, but the core services are not running. Try a full reboot of the server, then see whether Core is up and running.

Code: Select all

systemclt status nagios
# Can you restart it if it's not running
systemctl restart nagios
If that fails, what is the output of the following command?

Code: Select all

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
What is posted in journalctl -xe? Do you have SELinux enabled?

Code: Select all

sestatus
--Benjamin
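
For anyone following along, the checks requested in this post can be collected into one pass. This is only a minimal sketch: the helper names `run_step` and `diagnose_nagios` are made up for illustration, and the unit name `nagios.service` is taken from the error output earlier in the thread. Note the spelling `systemctl`; the `systemclt` line in the post is a typo.

```shell
#!/bin/sh
# Run one diagnostic if its tool is present; never abort on a failing step.
run_step() {
    # $1 = label, remaining arguments = the command to run
    label=$1; shift
    echo "== $label =="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || echo "(exit status $?)"
    else
        echo "(skipped: $1 not found)"
    fi
}

# The four checks asked for in the post, in the same order.
diagnose_nagios() {
    run_step "service status" systemctl status nagios --no-pager
    run_step "config check" /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
    run_step "recent journal" journalctl -u nagios.service -n 50 --no-pager
    run_step "selinux" sestatus
}
```

Running `diagnose_nagios` as root prints each section in turn, which makes it easier to paste the whole result into a reply.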

Re: Off-load Database option at install?

Posted: Mon Oct 04, 2021 8:16 am
by TBT

Code: Select all

 systemclt status nagios
-bash: systemclt: command not found

Code: Select all

 /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Nagios Core 4.4.6
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 2020-04-28
License: GPL

Website: https://www.nagios.org
Reading configuration data...
Error in configuration file '/usr/local/nagios/etc/nagios.cfg' - Line 88 (Check result path '/mnt/ramdisk/spool/checkresults' is not a valid directory)
   Error processing main config file!
SELinux is not in use.

Re: Off-load Database option at install?

Posted: Mon Oct 04, 2021 4:44 pm
by benjaminsmith
Hi,

Thanks for running those commands. This is the issue preventing Nagios Core from starting:
Error in configuration file '/usr/local/nagios/etc/nagios.cfg' - Line 88 (Check result path '/mnt/ramdisk/spool/checkresults' is not a valid directory)
Error processing main config file!
It looks like you have a RAM disk set up. Please go through the guide once more and check all of the settings, especially page 5, which covers setting up the directories and the service file.

https://assets.nagios.com/downloads/nag ... giosXI.pdf

Let us know if you're able to resolve the issue.

--Benjamin
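
The error quoted in this post means a directory named in nagios.cfg does not exist on disk. A minimal sketch of a check for that condition, assuming only that `check_result_path` and `temp_path` must point at existing directories (the function name `check_cfg_dirs` is hypothetical):

```shell
#!/bin/sh
# Report directory-valued settings in a Nagios main config that are missing on disk.
# Exit status is 1 if any directory is missing, 0 otherwise.
check_cfg_dirs() {
    cfg=$1
    # Keep only uncommented key=value lines for the two directory directives,
    # then test each value. The subshell carries the result out as its exit status.
    grep -E '^(check_result_path|temp_path)=' "$cfg" | cut -d= -f2- | (
        status=0
        while IFS= read -r dir; do
            if [ ! -d "$dir" ]; then
                echo "missing directory: $dir"
                status=1
            fi
        done
        exit "$status"
    )
}
```

Something like `check_cfg_dirs /usr/local/nagios/etc/nagios.cfg` would then list the offending paths before Core is started.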

Re: Off-load Database option at install?

Posted: Tue Oct 05, 2021 9:02 am
by TBT
Correct. In step 2 I outlined that we're using RAMdisk, and it was set up on the new system prior to the restore.

For installation we ran the Automatic Installation, since the documentation states that the Manual RAM Disk Installation steps should only be followed if using the install_ramdisk.sh script is not possible.

1. Doesn't the Automatic installation take care of everything past page 2?

2. It seems the restore/repair broke what the Auto install set up. If this isn't the case, how was XI working prior?

3. This is starting to feel really messy. Can you please test the restore/repair scripts to confirm they do not break a RAMdisk Automatic installation that was set up with install_ramdisk.sh on a Debian 10 system?

In the meantime, I'll start on page 5 to see if it matches what the automatic install script did. Our next step may be to start a support call for a remote session.

Re: Off-load Database option at install?

Posted: Tue Oct 05, 2021 12:30 pm
by TBT
Starting on page 5, I compared the RAMDisk Manual installation with what we have on the new server. There are a lot of differences. Hopefully you can determine whether the Automatic Install RAMDisk or the Restore/Repair scripts caused them, and what we should do next.

Shown below is what differs on our system.

Page 6 ramdisk.service

Code: Select all

[Unit]
Description=Ramdisk
Requires=local-fs.target
After=local-fs.target
Before=nagios.service
[Service]
Type=simple
RemainAfterExit=yes
Restart=always
ExecStartPre=/usr/bin/mkdir -p -m 775 /var/nagiosramdisk /var/nagiosramdisk/tmp /var/nagiosramdisk/spool /var/nagiosramdisk/spool/checkresults /var/nagiosramdisk/spool/xidpe /var/nagiosramdisk/spool/perfdata
ExecStartPre=/usr/bin/mount -t tmpfs -o size=1000m tmpfs /var/nagiosramdisk
ExecStartPre=/usr/bin/mkdir -p -m 775 /var/nagiosramdisk /var/nagiosramdisk/tmp /var/nagiosramdisk/spool /var/nagiosramdisk/spool/checkresults /var/nagiosramdisk/spool/xidpe /var/nagiosramdisk/spool/perfdata
ExecStart=/usr/bin/chown -R nagios:nagios /var/nagiosramdisk
[Install]
WantedBy=multi-user.target

Page 7 /usr/local/nagios/etc/nagios.cfg

Code: Select all

#service_perfdata_file=/usr/local/nagios/var/service-perfdata
service_perfdata_file=/mnt/ramdisk/service-perfdata
#host_perfdata_file=/usr/local/nagios/var/host-perfdata
host_perfdata_file=/mnt/ramdisk/host-perfdata
#check_result_path=/usr/local/nagios/var/spool/checkresults
check_result_path=/mnt/ramdisk/spool/checkresults
#object_cache_file=/usr/local/nagios/var/objects.cache
object_cache_file=/mnt/ramdisk/objects.cache
#status_file=/usr/local/nagios/var/status.dat
status_file=/mnt/ramdisk/status.dat
#temp_path=/tmp
temp_path=/mnt/ramdisk/tmp
Page 7 /usr/local/nrdp/server/config.inc.php

Code: Select all

//$cfg["check_results_dir"]="/usr/local/nagios/var/spool/checkresults";
$cfg["check_results_dir"]="/mnt/ramdisk/spool/checkresults";

Page 8 /usr/local/nagios/etc/pnp/npcd.cfg

Code: Select all

perfdata_spool_dir = /mnt/ramdisk/spool/perfdata/
perfdata_spool_dir = /mnt/ramdisk/spool/perfdata/
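
The mismatch in the listings above (the unit file mounts /var/nagiosramdisk while nagios.cfg points at /mnt/ramdisk) can be detected mechanically. A rough sketch, assuming the unit file contains a `mount -t tmpfs ... tmpfs <path>` line like the one shown and that the six nagios.cfg directives listed are the ones of interest; both function names are hypothetical:

```shell
#!/bin/sh
# Print the tmpfs mount point named in a ramdisk unit file.
unit_mount_point() {
    # Matches e.g.: ExecStartPre=/usr/bin/mount -t tmpfs -o size=1000m tmpfs /var/nagiosramdisk
    sed -n 's/^ExecStartPre=.*mount -t tmpfs .* tmpfs \([^ ]*\) *$/\1/p' "$1" | head -n 1
}

# List nagios.cfg path settings that do not live under the given mount point.
paths_outside_mount() {
    mp=${1%/}   # drop a trailing slash so /mnt/ramdisk/ and /mnt/ramdisk compare equal
    cfg=$2
    grep -E '^(check_result_path|temp_path|status_file|object_cache_file|service_perfdata_file|host_perfdata_file)=' "$cfg" |
    while IFS='=' read -r key value; do
        case $value in
            "$mp"/*) ;;                  # inside the RAM disk, nothing to report
            *) echo "$key=$value" ;;     # points somewhere else
        esac
    done
}
```

With the files from this post, `paths_outside_mount "$(unit_mount_point ramdisk.service)" nagios.cfg` would be expected to print every /mnt/ramdisk setting, since none of them sit under /var/nagiosramdisk.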

Re: Off-load Database option at install?

Posted: Wed Oct 06, 2021 10:32 am
by benjaminsmith
Hi,

In the documentation, the example path is /var/nagiosramdisk. In this case, the path is /mnt/ramdisk/. Looking over those files, the ramdisk.service file needs to be updated from the default to /mnt/ramdisk. Please update the service, restart it, and let me know if it's working.

Code: Select all

[Unit]
Description=Ramdisk
Requires=local-fs.target
After=local-fs.target
Before=nagios.service
[Service]
Type=simple
RemainAfterExit=yes
Restart=always
ExecStartPre=/usr/bin/mkdir -p -m 775 /mnt/ramdisk/
ExecStartPre=/usr/bin/mount -t tmpfs -o size=100m tmpfs /mnt/ramdisk/
ExecStartPre=/usr/bin/mkdir -p -m 775 /mnt/ramdisk/tmp /mnt/ramdisk/spool /mnt/ramdisk/spool/checkresults /mnt/ramdisk/spool/xidpe /mnt/ramdisk/spool/perfdata
ExecStart=/usr/bin/chown -R nagios:nagios /mnt/ramdisk/
[Install]
WantedBy=multi-user.target
Also, what are the permissions on that directory?

Code: Select all

ls -ld /mnt/ramdisk/
ls -ld /mnt/ramdisk/spool/checkresults
--Benjamin
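
An edited unit file only takes effect after systemd re-reads it, so "update the service and then restart" involves a few commands. A hedged sketch of that sequence, assuming the unit is named ramdisk.service as in the thread; the `DRY_RUN` switch and both function names are inventions for illustration:

```shell
#!/bin/sh
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Reload systemd, remount the RAM disk, confirm the mount, then restart Core.
# Stops at the first step that fails.
apply_ramdisk_unit() {
    mount_point=${1:-/mnt/ramdisk}
    run systemctl daemon-reload &&            # re-read the edited unit file
    run systemctl restart ramdisk.service &&
    run mountpoint -q "$mount_point" &&       # non-zero if nothing is mounted there
    run systemctl restart nagios.service
}
```

Running it once with `DRY_RUN=1` shows what would be executed before anything is changed on the server.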

Re: Off-load Database option at install?

Posted: Wed Oct 06, 2021 10:46 am
by TBT
I've modified the ramdisk.service file as per your suggestion, but there is no change to XI.

/mnt/ramdisk/ doesn't exist on the new system but does on the old one. Perhaps that path was carried over by the restore?

Code: Select all

 ls -ld /mnt/ramdisk/
ls: cannot access '/mnt/ramdisk/': No such file or directory

ls -ld /mnt/ramdisk/spool/checkresults
ls: cannot access '/mnt/ramdisk/spool/checkresults': No such file or directory

Re: Off-load Database option at install?

Posted: Thu Oct 07, 2021 10:53 am
by benjaminsmith
Hi,
ls -ld /mnt/ramdisk/
ls: cannot access '/mnt/ramdisk/': No such file or directory

ls -ld /mnt/ramdisk/spool/checkresults
ls: cannot access '/mnt/ramdisk/spool/checkresults': No such file or directory
So that is the root of the problem. I would recommend disabling the ramdisk, setting up the ramdisk using the defaults provided in the documentation, or creating a symbolic link at that path.

--Benjamin
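
Of the three options, the symbolic link is the smallest change. A minimal sketch, assuming the RAM disk that actually exists on the new system is the documented default /var/nagiosramdisk and that the restored configs expect /mnt/ramdisk; `link_ramdisk` is a hypothetical helper, not part of Nagios XI:

```shell
#!/bin/sh
# Point the path the restored configs expect at the directory that really exists.
link_ramdisk() {
    real_dir=$1    # e.g. /var/nagiosramdisk (the documented default)
    expected=$2    # e.g. /mnt/ramdisk (the path in the restored nagios.cfg)
    if [ ! -d "$real_dir" ]; then
        echo "no such directory: $real_dir" >&2
        return 1
    fi
    # Never replace something that is already there, including a dangling link.
    if [ -e "$expected" ] || [ -L "$expected" ]; then
        echo "refusing to overwrite existing $expected" >&2
        return 1
    fi
    ln -s "$real_dir" "$expected"
}
```

After `link_ramdisk /var/nagiosramdisk /mnt/ramdisk`, a path such as /mnt/ramdisk/spool/checkresults resolves as long as the subdirectory exists under the real RAM disk.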

Re: Off-load Database option at install?

Posted: Thu Oct 07, 2021 11:25 am
by TBT
We have eight production XI servers to migrate, so I'd like to refine the process as best we can. Would it be advisable to roll back to step 2, follow the RAMdisk Manual Installation using /mnt/ramdisk/ in place of the documented /var/nagiosramdisk, and then continue with the other steps? Would that resolve the issue too?

Once completed, I'll also confirm the ramdisk.service file config is identical to your last suggestion.

Below are the steps.

1. New system: Install the same XI version (5.7.5) as the old system.
2. New system: Set up RAMDisk (Manual Install) and RRDcache, as these are in use on the old system.
3. New system: Offload the Database.
4. New system: Back up the files modified by the offload. # This step isn't documented
5. Old system: Create a backup using backup_xi.sh and copy it over to the new system.
6. New system: Run restore using restore_xi.sh
7. New system: Copy the files modified by the offload back into place. # This step isn't documented
8. New system: Log in to XI and ensure the Program URL, External URL and License key are correct. # License key was empty
9. New system: Run repair using restore_repair.sh
10. New system: Modify /etc/postgresql/11/main/postgresql.conf as per your suggestion. # This step isn't documented
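
Steps 4 and 7 are the two undocumented ones, so a small helper may make them repeatable across all eight servers. This is a sketch only: the thread does not say which files the offload modifies, so the file list is left to the caller, and both function names are hypothetical.

```shell
#!/bin/sh
# Step 4: archive the listed files (paths relative to $root) before the restore.
save_offload_files() {
    root=$1; archive=$2; shift 2
    tar -czf "$archive" -C "$root" "$@"
}

# Step 7: put the archived files back over whatever the restore wrote.
restore_offload_files() {
    root=$1; archive=$2
    tar -xzf "$archive" -C "$root"
}
```

On a real server `root` would be `/` and the remaining arguments would be the offload-modified files given without the leading slash; the archive should be kept somewhere the restore does not touch.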

Or better yet, perhaps the best way to do this is to disable the RAMDisk on the old server prior to the backup and then, once restored on the new server, run the RAMdisk Automatic install so everything is clean and proper?

Please advise,