Re: Having issues with BPI again, post upgrade to 5.8.5

Posted: Tue Sep 28, 2021 9:06 am
by pbroste
Hello @rferebee

Thanks for reaching out, let's go ahead and get additional info. Please send the following to me in Private Message [PM]:

1) /usr/local/nagiosxi/var/components/bpi.xml file
2) /usr/local/nagiosxi/var/components/bpi.log file.

Code: Select all

cat /usr/local/nagiosxi/var/components/bpi.xml /usr/local/nagiosxi/var/components/bpi.log >> /tmp/bpi.txt
Please PM your updated system profile for us to review, depending on your company's security and privacy policies.

To send us your system profile:
  • Login to the Nagios XI GUI using a web browser.
  • Click the "Admin" > "System Profile" Menu
  • Click the "Download Profile" button
  • Save the profile.zip file and send via Private Message
Thanks,
Perry

Re: Having issues with BPI again, post upgrade to 5.8.5

Posted: Wed Sep 29, 2021 11:44 am
by lazzarinof
Good morning Perry,

Those have been sent, and should be in your inbox shortly.

Thank you,
-Frank

Re: Having issues with BPI again, post upgrade to 5.8.5

Posted: Thu Sep 30, 2021 2:22 pm
by pbroste
Hello @lazzarinof

Thanks for sending the System Profile along with the bpi details.

It appears that messages in 'bpi.log' such as "<br />Error: Can't find the host: appdev1, check configuration for group: 'State-Web'<br />" suggest that BPI is not able to match the host to the group that it is associated with.
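
To see how many distinct host/group pairs are affected, the log can be summarized with a few standard tools. This is a minimal sketch: summarize_bpi_host_errors is a hypothetical helper (not part of Nagios XI), and it assumes every error line follows the format quoted above.

```shell
# Hypothetical helper, not shipped with Nagios XI.
# Assumes error lines look like:
#   Error: Can't find the host: HOST, check configuration for group: 'GROUP'
summarize_bpi_host_errors() {
    local log_file="$1"
    # Reduce each matching line to "HOST -> GROUP", then count duplicates.
    sed -n "s/.*Can't find the host: \([^,]*\), check configuration for group: '\([^']*\)'.*/\1 -> \2/p" "$log_file" \
        | sort | uniq -c | sort -rn
}
```

For example: summarize_bpi_host_errors /usr/local/nagiosxi/var/components/bpi.log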

We also see that a database table is marked as "crashed" and want to have you run through the database repair. Here is the example line that we see:
"210929 9:01:32 [ERROR] mysqld: Table './nagios/nagios_logentries' is marked as crashed and last (automatic?) repair failed"

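To list every table reported as crashed, the database error log can be filtered for lines like the one above. This is a sketch only: list_crashed_tables is a hypothetical helper, and the location of the MySQL/MariaDB error log varies by distribution, so the path has to be supplied by the reader.

```shell
# Hypothetical helper; pass the path to your MySQL/MariaDB error log.
list_crashed_tables() {
    local log_file="$1"
    # Extract the quoted table path from each "marked as crashed" line,
    # then de-duplicate.
    sed -n "s/.*Table '\([^']*\)' is marked as crashed.*/\1/p" "$log_file" | sort -u
}
```

Each line of output is one table path, for example ./nagios/nagios_logentries for the line quoted above.
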
To repair the database:

Code: Select all

/usr/local/nagiosxi/scripts/repair_databases.sh
And since we are receiving an error on 'nagios_logentries', we want to truncate it and 'nagios_notifications' with:

Code: Select all

mysql -u ndoutils -pn@gweb nagios -e 'TRUNCATE TABLE nagios_logentries'
mysql -u ndoutils -pn@gweb nagios -e 'TRUNCATE TABLE nagios_notifications'
Let's check the table sizes as well:

Code: Select all

echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table
Reindex the Core Configuration Manager (CCM) configs:
  • 1: List all running /bin/nagios processes from the terminal: ps -aux | grep -E '/bin/nagios'
  • 2: Kill them from the terminal: killall -9 nagios (or pkill nagios)
  • 3: Check from the terminal that the /bin/nagios processes have stopped
  • 4: Restart nagios.service from the terminal: systemctl restart nagios
  • 5: In the Nagios XI web console, go to Core Configuration Manager (CCM) ==> Config File Management ==> [Delete Files] ==> [Write Files] ==> [Verify Files]
  • 6: Core Configuration Manager (CCM) ==> Under Quick Tools ==> "Apply Configuration"
  • 7: Restart nagios.service from the terminal:

Code: Select all

systemctl restart nagios
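
The terminal part of the list above (kill, confirm stopped, restart) can be wrapped in a small script. This is a sketch under assumptions: run and restart_nagios_core are hypothetical names, and DRY_RUN defaults to 1 so the script only prints what it would do until you set DRY_RUN=0.

```shell
# Hypothetical wrapper around the kill / verify / restart steps.
# DRY_RUN=1 (default) prints commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

restart_nagios_core() {
    run killall -9 nagios
    if [ "$DRY_RUN" != "1" ]; then
        # Give the kernel a moment, then confirm nothing is left.
        # pgrep exits non-zero when no process matches.
        sleep 2
        if pgrep -f '/bin/nagios' >/dev/null; then
            echo "nagios processes are still running" >&2
            return 1
        fi
    fi
    run systemctl restart nagios
}
```

Running restart_nagios_core with the default DRY_RUN=1 prints the two commands; with DRY_RUN=0 it executes them and refuses to restart if a /bin/nagios process survived the kill.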

Verify that the hosts and services pass the pre-flight check with no errors in Core by running:

Code: Select all

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
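
If the pre-flight output has been saved to a file or is being piped, the totals can be checked mechanically. This is a minimal sketch: preflight_ok is a hypothetical helper, and it assumes the output contains a "Total Errors:" line in the usual pre-flight summary format.

```shell
# Hypothetical helper: reads pre-flight output on stdin and succeeds
# only if a "Total Errors" line is present and reports 0.
preflight_ok() {
    local errors
    errors=$(awk -F: '/Total Errors/ { gsub(/[^0-9]/, "", $2); print $2; exit }')
    [ -n "$errors" ] && [ "$errors" -eq 0 ]
}
```

For example: /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | preflight_ok && echo "pre-flight clean"
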
Let us know how things look,
Perry

Re: Having issues with BPI again, post upgrade to 5.8.5

Posted: Fri Oct 01, 2021 12:51 pm
by lazzarinof
Good morning Perry,

We are seeing the same results after running the indicated commands.

For now, we've found that unchecking:
BPI → Settings → "Sync all hostgroups and servicegroups on apply config."
at least allows our groups to not overwrite themselves, and prevents us from getting spammed each time we update a host or service.

However, we'd like to eventually have the groups synching automatically.

Thank you,
-Frank

Re: Having issues with BPI again, post upgrade to 5.8.5

Posted: Mon Oct 04, 2021 4:16 pm
by pbroste
Hello @lazzarinof

Thanks for following up. We have a team meeting on Tuesday morning and will talk this over with other colleagues on our team for further suggestions, then let you know the next steps toward resolution.

Thanks,
Perry

Re: Having issues with BPI again, post upgrade to 5.8.5

Posted: Tue Oct 05, 2021 12:35 pm
by pbroste
Hello @lazzarinof

Following up on this after talking it over with colleagues and reviewing the System Profile: after Core Configuration Manager completes the pre-flight check, we are missing the BPI sync results. Did you disable the BPI sync?

Code: Select all

Checking objects...

-  Checked 10 hosts.
-  Checked 1 host groups.
-  Checked 0 service groups.
-  Checked 3 contacts.
-  Checked 2 contact groups.
-  Checked 139 commands.
-  Checked 9 time periods.
-  Checked 0 host escalations.
-  Checked 0 service escalations.
-Checking for circular paths...
-  Checked 10 hosts
-  Checked 0 service dependencies
-  Checked 0 host dependencies
-  Checked 9 timeperiods
-Checking global event handlers...
-Checking obsessive compulsive processor commands...
-Checking misc settings...
-
-Total Warnings: 3
-Total Errors:   0
-
-Things look okay - No serious problems were detected during the pre-flight check
-> Return Code: 0
---------------------------------------
-OUTPUT=--------------------------------------
-RETURNCODE=0
-PROCESSING COMMAND ID 367...
-PROCESS COMMAND: CMD=1160, DATA=
-CMDLINE=/bin/true
-OUTPUT=
-RETURNCODE=0
-PROCESSING COMMAND ID 368...
-PROCESS COMMAND: CMD=1150, DATA=remove
:CMDLINE=php /usr/local/nagiosxi/html/includes/components/nagiosbpi/api_tool.php --cmd=syncall
:CMD: syncall
-MSG: BPI configuration applied successfully! BPI configuration applied successfully!
-OUTPUT=MSG: BPI configuration applied successfully! BPI configuration applied successfully!
-RETURNCODE=0

Please review the settings in the web console under Admin > Performance Settings > Enable BPI sync on Apply Config (......
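
To check whether a BPI sync was actually recorded after an Apply Configuration, the command subsystem log can be searched for the syncall command line and the return code that follows it. This is a sketch: last_syncall_returncode is a hypothetical helper, and it assumes the log uses the CMDLINE= / RETURNCODE= layout shown in the excerpt above.

```shell
# Hypothetical helper: prints the RETURNCODE of the most recent
# "api_tool.php --cmd=syncall" entry; exits non-zero if none is found.
last_syncall_returncode() {
    awk '/api_tool\.php --cmd=syncall/ { seen = 1; next }
         seen && /RETURNCODE=/ { sub(/.*RETURNCODE=/, ""); rc = $0; seen = 0 }
         END { if (rc == "") exit 1; print rc }' "$1"
}
```

For example: last_syncall_returncode /usr/local/nagiosxi/var/cmdsubsys.log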

The recommendation is to edit the following file and then run 'syncall' to get results (see /usr/local/nagiosxi/var/cmdsubsys.log):

Code: Select all

/usr/local/nagiosxi/html/includes/components/nagiosbpi/api_tool.php
Change this (line 287):

Code: Select all

    $config_string = process_post($arr, $add);
To:

Code: Select all

    $vars = array();
    $config_string = process_post($arr, $add, $vars);
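
Before editing, it may be worth checking whether the file already contains the three-argument call. This is a minimal sketch: bpi_patch_applied is a hypothetical helper that only does a fixed-string search and does not validate the surrounding PHP.

```shell
# Hypothetical helper: succeeds if the three-argument process_post()
# call is already present in the given PHP file.
bpi_patch_applied() {
    local php_file="$1"
    # -F treats the pattern as a literal string, so "$" and "(" are not special.
    grep -qF 'process_post($arr, $add, $vars)' "$php_file"
}
```

For example: bpi_patch_applied /usr/local/nagiosxi/html/includes/components/nagiosbpi/api_tool.php && echo "already patched"
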
Let us know the results,
Perry

Re: Having issues with BPI again, post upgrade to 5.8.5

Posted: Wed Oct 06, 2021 11:11 am
by lazzarinof
Good morning Perry,

It looks like that change had been made previously, as it is our current setting:
$vars = array();
$config_string = process_post($arr, $add, $vars);

As for the BPI Sync: we have disabled it because, when it is enabled, it deletes the BPI Hostgroups and ServiceGroups any time a config is saved and then triggers a warning email for each hostgroup/servicegroup it expected to find. We then have to manually sync the hostgroups and servicegroups to stop the alerts.

Thank you,
Frank

Re: Having issues with BPI again, post upgrade to 5.8.5

Posted: Thu Oct 07, 2021 1:02 pm
by pbroste
Hello @lazzarinof

We want to go ahead and increase the BPI Sync NDO Startup Timeout to 400:
  • Admin > Performance Settings > Subsystem > BPI Sync NDO Startup Timeout to 400.
Let's get a copy of your php.ini config as well:

Code: Select all

tar -czvf /tmp/phpconfig.tar.gz /etc/php.ini
Let us know how things look after the increase. You will need to enable the bpi sync and if things go wonky on the sync you will need to restore it. You can disable Event Handlers in Admin > Monitoring Engine Status so that alerts are not sent out during the test.

Thanks,
Perry

Re: Having issues with BPI again, post upgrade to 5.8.5

Posted: Thu Oct 07, 2021 4:07 pm
by lazzarinof
Good afternoon Perry,

Okay, I've updated the BPI Sync NDO Startup Timeout value from 300 to 400 seconds. It looks like now Hostgroups are working, but Servicegroups are still dropping. I upped to 500 seconds, to test: same results (Servicegroups were completely blank until manually sync'd). I returned the BPI Sync to 400 and re-disabled "Sync all hostgroups and servicegroups on apply config."

I've attached the PHP log.


Thank you,
-Frank

Re: Having issues with BPI again, post upgrade to 5.8.5

Posted: Fri Oct 08, 2021 12:31 pm
by pbroste
Hello @lazzarinof

Took a look at this with Tom, and we want to increase the BPI Sync NDO Startup Timeout to 600, as we are seeing timeouts.

Also, let's increase the following in '/etc/php.ini'

Code: Select all

max_execution_time = 300
max_input_time = 300
Bounce the nagios and httpd services and sync.
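
To confirm which values are active before and after the edit, the uncommented directives can be listed. This is a sketch: show_php_limits is a hypothetical helper, and it only reads the file you pass in, so any additional ini files loaded by PHP are not considered.

```shell
# Hypothetical helper: prints the active (uncommented) values of the
# two directives from the given php.ini.
show_php_limits() {
    local ini_file="$1"
    # Lines starting with ";" are comments in php.ini and are skipped
    # because the pattern is anchored to the directive name.
    grep -E '^[[:space:]]*(max_execution_time|max_input_time)[[:space:]]*=' "$ini_file"
}
```

For example: show_php_limits /etc/php.ini, followed by systemctl restart httpd nagios to bounce both services.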

Please let us know the results,
Perry