Discrepancy between alerts and Error when connecting to API.

arthurkroth
Posts: 9
Joined: Mon Oct 14, 2024 9:54 am

Discrepancy between alerts and Error when connecting to API.

Post by arthurkroth »

Hi all,

I recently migrated my Nagios XI server from CentOS 7.9 (EOL) to Ubuntu 22.04 LTS, and everything has been working smoothly so far.

I have kept both servers running side by side. The CentOS server is running Nagios XI 5.11.1, and the Ubuntu 22.04 LTS server is running Nagios XI 2024R1.3.

I have noticed that most notifications are duplicated (which is expected, since both servers are sending messages). However, some notifications are sent by one server but not the other, and sometimes there is a gap of 10 to 15 minutes between the two servers' notifications for the same problem or warning.

Is there any known difference between these versions in the notification system, or in the way services are monitored?


Another odd behaviour I noticed is that services on my new Nagios server (Ubuntu 22.04 LTS, Nagios XI 2024R1.3) sometimes show an UNKNOWN status, as follows:

  State: UNKNOWN
  Info:
  UNKNOWN: An error occurred connecting to API. (Connection error: [Errno -3] Temporary failure in name resolution)
  Date/Time: 04/11/2024 11:17:30
Could that happen because I'm running two Nagios servers against the same API? I am only running the two servers side by side to check that the new server (Ubuntu 22.04) works properly before decommissioning the old one (CentOS).


Thank you very much for your time :)

Arthur.
jmichaelson
Posts: 301
Joined: Wed Aug 23, 2023 1:02 pm

Re: Discrepancy between alerts and Error when connecting to API.

Post by jmichaelson »

Hi Arthur,

Given that you're receiving some notifications but not others, it seems unlikely that this is specifically a back-end difference between the two versions. If you want to take a deeper dive into that, a support ticket might be the best option. (You're correct that the duplicated notifications are expected, since both instances of XI are still running.)

I am curious about the one or the other aspect of this. Is that happening both ways? I.e., are both servers sending some notifications that the other one isn't? Or is it just one way?

As for the last weirdness, could you be a little more specific? What kind of service check is it, and how is it obtaining the service status? Also, if the service check uses a name instead of an IP address, are the DNS settings identical between the two servers (DNS servers, search domains, and /etc/hosts)? If they are, you may want to try recreating the host and service on the new server, if it's an easy matter.
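
One way to compare the DNS servers and search domains is to parse the resolver configuration on each server and diff the results. The following is only a minimal sketch: the function names `parse_resolv_conf` and `diff_settings` are made up for illustration, and it assumes the settings live in /etc/resolv.conf. On Ubuntu that file may point at the systemd-resolved stub (127.0.0.53), in which case the upstream servers are shown by `resolvectl status` instead.

```python
from pathlib import Path


def parse_resolv_conf(text):
    """Return the nameserver and search-domain entries found in resolv.conf text."""
    settings = {"nameserver": [], "search": []}
    for raw_line in text.splitlines():
        # Drop comments ('#' or ';') and surrounding whitespace.
        line = raw_line.split("#", 1)[0].split(";", 1)[0].strip()
        if not line:
            continue
        keyword, *values = line.split()
        if keyword == "nameserver" and values:
            settings["nameserver"].append(values[0])
        elif keyword == "search":
            # A later 'search' line replaces an earlier one.
            settings["search"] = values
    return settings


def diff_settings(old, new):
    """Return only the keys whose values differ, as (old, new) pairs."""
    return {key: (old[key], new[key]) for key in old if old[key] != new[key]}


if __name__ == "__main__":
    # Run this on each server and compare the printed output.
    path = Path("/etc/resolv.conf")
    if path.exists():
        print(parse_resolv_conf(path.read_text()))
```

An empty result from `diff_settings` for the two servers' parsed files would suggest the resolver settings match; /etc/hosts would still need to be compared separately.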
Please let us know if you have any other questions or concerns.

-Jason
arthurkroth
Posts: 9
Joined: Mon Oct 14, 2024 9:54 am

Re: Discrepancy between alerts and Error when connecting to API.

Post by arthurkroth »

Hi Folks,

I wanted to provide an update regarding the issues I was facing.

It turns out that both problems were related. Most of the servers I monitored were running NCPA version 2.4.0, which is six versions behind the latest release, version 3.1.1. After updating NCPA on all my servers, I noticed a significant reduction in the discrepancies in notifications.

Additionally, while monitoring my firewall, I observed a large number of dropped packets directed to an internal IP address. Upon investigation, I discovered that this IP was assigned to an old domain controller (DC). My new Nagios server was still trying to use that DC for DNS resolution, so it could not resolve the servers' names, which led to the errors I was encountering:

  State: UNKNOWN
  Info:
  UNKNOWN: An error occurred connecting to API. (Connection error: [Errno -3] Temporary failure in name resolution)
  Date/Time: 04/11/2024 11:17:30
After I updated the DNS IP to the correct domain controller, the name resolution issue was resolved.
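
For anyone hitting the same error: "[Errno -3] Temporary failure in name resolution" is how Python reports a failed `socket.getaddrinfo` lookup, so the resolver on the Nagios server can be tested directly. Below is a rough sketch, not part of Nagios XI or NCPA. The function name `check_resolution` is hypothetical, and the port 5693 is assumed to be the NCPA listener port in use.

```python
import socket


def check_resolution(hostnames, port=5693):
    """Try to resolve each hostname; return {hostname: addresses or error text}."""
    results = {}
    for name in hostnames:
        try:
            infos = socket.getaddrinfo(name, port, proto=socket.IPPROTO_TCP)
            # Each entry ends with a sockaddr tuple whose first item is the IP.
            results[name] = sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            # Raised when the configured DNS server cannot resolve the name.
            results[name] = f"FAILED: {exc}"
    return results
```

Calling `check_resolution` with the list of monitored hostnames on the Nagios server shows which names the configured DNS server fails to resolve.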

Thanks for your time :)

Arthur.
jsimon
Posts: 295
Joined: Wed Aug 23, 2023 11:27 am

Re: Discrepancy between alerts and Error when connecting to API.

Post by jsimon »

Thanks for the update @arthurkroth! Glad you were able to get your issue resolved.

I'll go ahead and lock this thread.