Fusion server fails to poll Nagios XI

Posted: Fri Feb 04, 2022 9:30 am
by nagios-retail
Hello,

We have set up a new Nagios Fusion server with 3 fused servers.
One of them is not polling any data, although the fuse test checked out OK.
When I run the command curl -XGET https://yyyyyyy/nagiosxi/api/v1/system/ ... XXXXXXXXXX -k -v
I get the following output:

* About to connect() to xx-xxxxxx-xx port 443 (#0)
* Trying xx.xx.xx.xx...
* Connected to yy.yyyyyy.yy (xx.xx.xx.xx) port 443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* skipping SSL peer certificate verification
* SSL connection using TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
* Server certificate:
* subject: E=[email protected],CN=yy.yyyyyy.yy.bel..................................................
* start date: Oct 18 12:15:13 2021 GMT
* expire date: Aug 10 13:48:46 2042 GMT
* common name: yy.yyyyyy.yy.country.company.lan
* issuer: CN=Company counntry CA,DC=BEL,DC=company,DC=eu
> GET /nagiosxi/api/v1/system/status?fusekey=XXXXXXXXXXXXXXXXXXXXXXX HTTP/1.1
> User-Agent: curl/7.29.0
> Host: yy.yyyyyy.yy
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Fri, 04 Feb 2022 14:16:01 GMT
< Server: Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips PHP/5.4.16
< X-Powered-By: PHP/5.4.16
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Methods: POST, GET, OPTIONS, DELETE, PUT
< Content-Length: 837
< Content-Type: application/json
<
{"instance_id":"1","instance_name":"localhost","status_update_time":"2022-02-04 15:15:58","program_start_time":"2022-01-30 22:00:03","program_run_time":"407758","program_end_time":"1970-01-01 00:00:01","is_currently_running":"1","process_id":"10551","daemon_mode":"1","last_command_check":"1970-01-01 01:00:00","last_log_rotation":"2022-02-03 23:59:59","notifications_enabled":"1","active_service_checks_enabled":"1","passive_service_checks_enabled":"1","active_host_checks_enabled":"1","passive_host_checks_enabled":"1","event_handlers_enabled":"1","flap_detection_enabled":"1","process_performance_data":"1","obsess_over_hosts":"0","obsess_over_services":"0","modified_host_attributes":"0","modified_service_attributes":"0","global_host_event_handler":"xi_host_event_handler","global_service_event_handler":"xi_service_event_handler"}
* Connection #0 to host yy.yyyyyy.yy left intact

Re: Fusion server fails to poll Nagios XI

Posted: Fri Feb 04, 2022 5:57 pm
by ssax
Make sure the URL of your XI server in Fusion starts with https:// and ends with /nagiosxi/:

Code:

https://yyyyyyy/nagiosxi/

Usually this issue occurs because the Fusion server doesn't trust the XI server's HTTPS certificate.
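
A quick way to sanity-check the URL format described above is a small shell function. This is only a sketch: check_xi_url is a hypothetical helper name and is not part of Fusion or XI. It only looks at the shape of the string and does not contact the server.

```shell
# Hypothetical helper (not shipped with Fusion): checks that a fused-server
# URL uses https:// and ends with /nagiosxi/.
check_xi_url() {
    local url
    url="$1"
    case "$url" in
        https://*/nagiosxi/) echo "ok" ;;
        http://*)            echo "not https"; return 1 ;;
        https://*)           echo "missing /nagiosxi/ suffix"; return 1 ;;
        *)                   echo "no scheme"; return 1 ;;
    esac
}
```

For example, check_xi_url 'https://yyyyyyy/nagiosxi/' prints ok, while the same URL with http:// prints "not https" and returns a non-zero status.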

Please send the output of these commands from your Fusion server:

Code:

uname -a
cat /etc/*release
curl -L -vvv 'https://yyyyyyy/nagiosxi/api/v1/system/ ... XXXXXXXXXX'

Re: Fusion server fails to poll Nagios XI

Posted: Mon Feb 07, 2022 2:25 am
by nagios-retail
Hello,

I already changed from http to https and I am now getting info. However, I still get the following error in the Fusion logs:
poll_server() unable to poll data for s:Nagios Some Country (xx-xxxxxx-xx), u:username, poll:nagiosxi_bpi

I have also sent you the output of the requested commands:
uname -a
cat /etc/*release
curl -L -vvv 'https://xx-xxxx-xx/nagiosxi/api/v1/syst ... XXXXXXXXXX'

Re: Fusion server fails to poll Nagios XI

Posted: Mon Feb 07, 2022 2:59 am
by nagios-retail
I have seen that when running the last command, the last part of the output is:

* NSS error -8179 (SEC_ERROR_UNKNOWN_ISSUER)
* Peer's Certificate issuer is not recognized.
* Closing connection 0
curl: (60) Peer's Certificate issuer is not recognized.
More details here: http://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle"
of Certificate Authority (CA) public keys (CA certs). If the default
bundle file isn't adequate, you can specify an alternate file
using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
the bundle, the certificate verification probably failed due to a
problem with the certificate (it might be expired, or the name might
not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
the -k (or --insecure) option.


When I use the -k option in the command the result is:

> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Mon, 07 Feb 2022 07:57:58 GMT
< Server: Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips PHP/5.4.16
< X-Powered-By: PHP/5.4.16
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Methods: POST, GET, OPTIONS, DELETE, PUT
< Content-Length: 32
< Content-Type: application/json
<
{"error":"No API Key provided"}
* Connection #0 to host ap-dco280-na left intact


This would suggest that the error comes from the certificate, but I have another server that gives me the same error when I run the command and does not produce any errors in the Fusion log.
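
To compare servers without reading the full verbose output, the curl exit code can be mapped to a short diagnosis. The sketch below covers only a few exit codes documented by curl (60 is the one shown above). explain_curl_exit and probe_url are hypothetical helper names, and the URL passed to probe_url is whatever status URL you are already testing.

```shell
# Map a few documented curl exit codes to a short diagnosis.
explain_curl_exit() {
    case "$1" in
        0)  echo "request succeeded" ;;
        6)  echo "could not resolve host" ;;
        7)  echo "could not connect to host" ;;
        35) echo "TLS handshake failed" ;;
        60) echo "certificate issuer not trusted by this host" ;;
        *)  echo "other curl error $1" ;;
    esac
}

# Run curl with certificate verification left on (no -k) and explain the result.
probe_url() {
    local rc
    rc=0
    curl -sS -o /dev/null "$1" || rc=$?
    explain_curl_exit "$rc"
}
```

Running probe_url against each fused server's status URL from the Fusion host shows which ones fail verification. A server that only works with -k will report the exit code 60 message.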

Re: Fusion server fails to poll Nagios XI

Posted: Mon Feb 07, 2022 10:53 am
by ssax
Which version of XI is running on the server you are seeing that BPI error on?

Please set Log Level to Trace in Admin > System Settings, wait for the poll to complete, and then PM the polling file that contains the error from this directory:

Code:

/usr/local/nagiosfusion/var/log/

Disable the Trace logging afterwards to save disk space.
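
To find which file in that directory contains the error, something like the following could be used. It is a sketch: find_poll_errors is a hypothetical helper name, the search string is taken from the error message quoted earlier in this thread, and the file names inside the log directory are not assumed.

```shell
# List files in the Fusion log directory that contain the polling error.
# The directory defaults to the path given above and can be overridden.
find_poll_errors() {
    local logdir
    logdir="${1:-/usr/local/nagiosfusion/var/log}"
    # -l prints only the names of matching files; errors for subdirectories
    # or unreadable files are discarded.
    grep -l 'unable to poll data' "$logdir"/* 2>/dev/null
}
```

Calling find_poll_errors with no argument searches the default directory. It prints nothing and returns a non-zero status if no file matches.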

Not that I think that's the issue here, but you can make the Fusion server trust your certificate issuer with this process:

Take the CA certs and put them in individual files in this directory:
- NOTE: The files must have a .crt extension

Code:

/etc/pki/ca-trust/source/anchors/

Then run these commands:

Code:

update-ca-trust extract
systemctl restart httpd

Then test it again.
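
If the CA certs arrive as a single PEM bundle, they can be split into individual .crt files before copying them into the anchors directory. This is a sketch under assumptions: split_ca_bundle is a hypothetical helper name, and the custom-ca-NN.crt file names are arbitrary.

```shell
# split_ca_bundle BUNDLE OUTDIR
# Writes each certificate found in BUNDLE to OUTDIR/custom-ca-NN.crt.
split_ca_bundle() {
    local bundle outdir
    bundle="$1"
    outdir="$2"
    awk -v dir="$outdir" '
        /-----BEGIN CERTIFICATE-----/ {
            n++
            out = sprintf("%s/custom-ca-%02d.crt", dir, n)
            inside = 1
        }
        inside { print > out }                 # copy only lines inside a cert
        /-----END CERTIFICATE-----/ {
            if (inside) close(out)
            inside = 0                         # skip text between certs
        }
    ' "$bundle"
}
```

For example, split_ca_bundle /tmp/ca-bundle.pem /etc/pki/ca-trust/source/anchors (the bundle path is a placeholder), followed by the update-ca-trust extract and systemctl restart httpd commands above.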

If that still doesn't work, put your CA certs into this file (one after the other if you have multiple CA signer certs):

Code:

/etc/openldap/certs/ca.pem

Then add this to your /etc/openldap/ldap.conf:

Code:

TLS_CACERT /etc/openldap/certs/ca.pem

Then restart Apache and try again:

Code:

systemctl restart httpd

That should do it.
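
The ldap.conf step can be made repeatable so the line is not added twice. A minimal sketch, assuming a hypothetical helper named ensure_tls_cacert. It appends the TLS_CACERT line only if no such line exists yet, and it deliberately does not match an existing TLS_CACERTDIR line.

```shell
# ensure_tls_cacert CONF PEM
# Appends "TLS_CACERT PEM" to CONF unless a TLS_CACERT line is already present.
ensure_tls_cacert() {
    local conf pem
    conf="$1"
    pem="$2"
    # The trailing [[:space:]] keeps TLS_CACERTDIR from counting as a match.
    if grep -Eq '^[[:space:]]*TLS_CACERT[[:space:]]' "$conf" 2>/dev/null; then
        echo "TLS_CACERT already set in $conf"
    else
        printf 'TLS_CACERT %s\n' "$pem" >> "$conf"
        echo "added TLS_CACERT to $conf"
    fi
}
```

For example, ensure_tls_cacert /etc/openldap/ldap.conf /etc/openldap/certs/ca.pem. If ca.pem does not exist yet, it can be created first by concatenating the CA certs with cat.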

If that still doesn't resolve it (it should), please PM the output of this command:
- Replace your.XI.server before running; it should match the FQDN used in the certificate

Code:

echo 'DONE' | openssl s_client -showcerts -connect your.XI.server:443

Re: Fusion server fails to poll Nagios XI

Posted: Tue Feb 08, 2022 4:52 am
by nagios-retail
Hello,

I have followed your instructions up to the point where you asked me to put the cert in
/etc/openldap/certs/ca.pem
but I don't have that file.
The only files there are:
cert8.db
key3.db
Password
secmod.db


So it was not possible to add the cert to that file.

Re: Fusion server fails to poll Nagios XI

Posted: Tue Feb 08, 2022 7:17 pm
by ssax
You would create the file.

Re: Fusion server fails to poll Nagios XI

Posted: Wed Feb 09, 2022 4:17 am
by nagios-retail
Still polling problems.

poll_server() unable to poll data for s:Nagios Some-Country (xx-xxxxxx-xx), u:nagiosadmin, poll:nagiosxi_bpi
Outcome mailed to you.

Re: Fusion server fails to poll Nagios XI

Posted: Wed Feb 09, 2022 6:26 pm
by ssax
Which version of XI is running on the server you are seeing that BPI error on?

Please set Log Level to Trace in Admin > System Settings, wait for the poll to complete, and then PM the polling file that contains the error from this directory:

Code:

/usr/local/nagiosfusion/var/log/

I sent you a PM as well.

Re: Fusion server fails to poll Nagios XI

Posted: Thu Feb 10, 2022 4:04 am
by nagios-retail
Hello,

Just a remark: of the 3 fused servers, one is a copy of another, so its fuse key is the same.
When I look in the dberrors.log file I see the following:

7: exec(): SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'Some Country (xx-xxxxxx-xx)' for key 'servers_uniq'
SQL: [69] UPDATE servers SET name = :value WHERE server_id = :server_id LIMIT 1
Params: 2
Key: Name: [6] :value
paramno=-1
name=[6] ":value"
is_param=1
param_type=2
Key: Name: [10] :server_id
paramno=-1
name=[10] ":server_id"
is_param=1
param_type=1


Can this be the cause of the problem? And if so, how do I generate a new fuse key?
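
To see which values trigger the constraint, the duplicate-entry messages can be pulled out of the log. This is a sketch: list_duplicate_entries is a hypothetical helper name, and the log path is passed as an argument because its location is not given in this thread. In the message above, the duplicated value is a server name on the servers_uniq key, so the output may help tell a name clash apart from a fuse key clash.

```shell
# list_duplicate_entries LOGFILE
# Prints each distinct "Duplicate entry ... for key ..." message once.
list_duplicate_entries() {
    grep -o "Duplicate entry '[^']*' for key '[^']*'" "$1" | sort -u
}
```

For example, list_duplicate_entries /path/to/dberrors.log, where the path is a placeholder for wherever dberrors.log lives on the Fusion server.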

I have PMed you the log files.