Having issues with BPI again, post upgrade to 5.8.5

Posted: Mon Sep 20, 2021 12:33 pm
by rferebee
Having issues with BPI again after our recent upgrade to 5.8.5.

Since the upgrade, every time we apply a new config, almost all of our Host Groups switch to 'Unknown' status until we manually re-sync them.

An error shown in the BPI section states 'Unable to match string in config file.' I searched the config file manually and cannot find the problem.

Can someone please assist? Thank you.

I can PM my BPI files whenever you need them.

Re: Having issues with BPI again, post upgrade to 5.8.5

Posted: Mon Sep 20, 2021 4:05 pm
by pbroste
Hello @rferebee

Thanks for reaching out. Let's start by gathering some information. Please send me the following by PM:

1) /usr/local/nagiosxi/var/components/bpi.xml file
2) /usr/local/nagiosxi/var/components/bpi.log file.

    cat /usr/local/nagiosxi/var/components/bpi.xml /usr/local/nagiosxi/var/components/bpi.log >> /tmp/bpi.txt

Please also PM your updated system profile for us to review.

To send us your system profile:
  • Log in to the Nagios XI GUI using a web browser.
  • Click "Admin" > "System Profile".
  • Click the "Download Profile" button.
  • Save the profile.zip file and send it via Private Message.
Thanks,
Perry

Re: Having issues with BPI again, post upgrade to 5.8.5

Posted: Wed Sep 22, 2021 11:10 am
by rferebee
Good morning, just curious if there are any updates to my request?

Re: Having issues with BPI again, post upgrade to 5.8.5

Posted: Wed Sep 22, 2021 1:24 pm
by pbroste
Hello @rferebee

Thanks for pinging me on this. After review, we see many messages like:
PHP Warning: Invalid argument supplied for foreach() in /usr/local/nagiosxi/html/includes/components/x>...
and
[:error] [pid 83051] [client 10.144.254.221:55335] PHP Notice: Undefined index
I want to start by deleting the Core Configuration Manager config files and having you rewrite them; we will call this a reindex.

Reindex the Core Configuration Manager (CCM) configs:
  • Clear the import directory: rm -rf /usr/local/nagios/etc/import/*
  • 1: List all running /bin/nagios processes: ps -aux | grep -E '/bin/nagios'
  • 2: Kill them: killall -9 nagios (or pkill nagios)
  • 3: Run the command from step 1 again to confirm the /bin/nagios processes have stopped
  • 4: Restart nagios.service: systemctl restart nagios
  • 5: In the Nagios XI web console, go to Core Configuration Manager (CCM) ==> Config File Management ==> [Delete Files] ==> [Write Files] ==> [Verify Files]
  • 6: In Core Configuration Manager (CCM), under Quick Tools, click "Apply Configuration"
  • 7: Restart nagios.service again: systemctl restart nagios
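
The terminal-side steps above could be scripted roughly as follows. This is a minimal sketch, assuming a systemd host with pkill available; the run helper and the DRY_RUN variable are hypothetical names, not part of Nagios XI. It only prints the commands by default, and it does not cover steps 5 and 6, which are done in the web console.

```shell
#!/bin/sh
# Sketch of the terminal-side reindex steps listed above.
# DRY_RUN defaults to 1, so commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf 'would run: %s\n' "$*"
    else
        "$@"
    fi
}

reindex_terminal_steps() {
    # Clear the CCM import directory (the glob needs a shell, hence sh -c).
    run sh -c 'rm -rf /usr/local/nagios/etc/import/*'
    # Stop any nagios processes that are still running.
    run pkill nagios
    # Bring the service back up.
    run systemctl restart nagios
}
```

Calling reindex_terminal_steps prints the three commands. To actually run them, set DRY_RUN=0 and call it as root, then continue with the web console steps.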

Verify that the hosts and services look good, then confirm that Core reports no configuration errors by running:

    /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
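
The pre-flight check ends with "Total Warnings:" and "Total Errors:" summary lines. If you want just the error count, for example in a script, a small filter like the sketch below may help; total_errors is a hypothetical helper name, not a Nagios command.

```shell
#!/bin/sh
# Read `nagios -v` output on stdin and print the number that follows
# the "Total Errors:" summary line.
total_errors() {
    awk -F: '/^Total Errors:/ { gsub(/[ \t]/, "", $2); print $2 }'
}
```

Usage: /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | total_errors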
Let us know how things look,
Perry

Re: Having issues with BPI again, post upgrade to 5.8.5

Posted: Thu Sep 23, 2021 12:39 pm
by lazzarinof
Good morning Perry,

We tried reindexing, as shown above, without success.

We then tried rebuilding the config file completely, by clearing it out and syncing the hostgroups & servicegroups. That appeared to do the trick for ~8 hours. However, at some point all non-commented lines in the config file were deleted and replaced with blank lines (triggering several alert emails). Resyncing added the groups again, but left a couple hundred blank lines.

Thank you

Re: Having issues with BPI again, post upgrade to 5.8.5

Posted: Fri Sep 24, 2021 12:07 pm
by ssax
Try increasing Admin > Performance Settings > Subsystem > BPI Sync NDO Startup Timeout to 300.

Try increasing Home > BPI > XML Cache Threshold to 120.

See if those changes help.

Re: Having issues with BPI again, post upgrade to 5.8.5

Posted: Fri Sep 24, 2021 1:20 pm
by lazzarinof
Good morning ssax,

It deleted itself again this morning (I checked it at 8:30am, and it was still there, so I'm zeroing in on when it deletes!).
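
One way to narrow down when the file empties is to log how many active lines it holds at regular intervals. A rough sketch follows. The function names are hypothetical, and both the bpi.conf path in the usage note and the assumption that comment lines start with # should be checked against your install.

```shell
#!/bin/sh
# Count lines that are neither blank nor comments
# (assumes # is the first non-space character of a comment line).
active_lines() {
    grep -c -v -E '^[[:space:]]*(#|$)' "$1"
}

# Print a timestamp followed by the active line count of the given file.
log_active_lines() {
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$(active_lines "$1")"
}
```

Run from cron every few minutes, for example: log_active_lines /usr/local/nagiosxi/etc/components/bpi.conf >> /tmp/bpi_lines.log. The first entry where the count drops to 0 brackets the time of the deletion.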

I have increased the BPI Sync NDO Startup Timeout to 300, from 120.

The XML Cache Threshold is already at 120, so I'll leave that, and verify again in the morning.

Thank you,
Frank

Re: Having issues with BPI again, post upgrade to 5.8.5

Posted: Mon Sep 27, 2021 10:14 am
by lazzarinof
Good morning ssax,

It looks like the config file is holding, for now. We have had it go a few days without issue, then suddenly revert, but I'm hopeful this will hold!
I'll keep this updated if anything happens.

Thank you,
-Frank

Re: Having issues with BPI again, post upgrade to 5.8.5

Posted: Mon Sep 27, 2021 10:45 am
by pbroste
Hello Frank,

Please let us know how things are looking when you get a chance.

Thanks,
Perry

Re: Having issues with BPI again, post upgrade to 5.8.5

Posted: Mon Sep 27, 2021 11:21 am
by lazzarinof
Good morning pbroste,

I had to update a service alert today. It looks like applying a configuration caused the BPI config file to overwrite itself (several hundred blank lines remain, but all text has been removed). That would explain why we didn't see it over the weekend: no changes were made.

Resyncing the hostgroups and servicegroups re-adds them to the end of the document (leaving the blank lines) and stops the alert spam we get each time (email below: we tend to get a couple dozen each time we update a configuration). For some reason the BPI Config file just won't hold!
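
The leftover blank lines can be collapsed with something like the sketch below. It is cosmetic only and does not address why the file is being emptied. squeeze_blank_lines is a hypothetical helper name, and it is safest to run it against a copy and keep a backup, since how BPI reacts to a hand-edited file is not confirmed here.

```shell
#!/bin/sh
# Print the file with each run of blank (whitespace-only) lines
# collapsed to a single empty line.
squeeze_blank_lines() {
    awk 'NF { print; blank = 0; next } !blank { print ""; blank = 1 }' "$1"
}
```

Usage: squeeze_blank_lines bpi.conf.copy > bpi.conf.squeezed, then compare the two files before replacing anything.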



Nagios connectivity to all host in the [redacted] Server Group has been lost / restored. (Please see “State” field below for current status).
This message will replace all individual server notifications until connectivity to this zone is restored.
Nagios has detected a problem with this service.

Notification Type: PROBLEM

Service: BPI Process:HG: [redacted]
Host: [redacted]
Address: 127.0.0.1
State: UNKNOWN
Info:
Unknown BPI Group Index
Date/Time: 2021-09-27 09:17:41