NCPA deployed agent status is "not available"

Posted: Fri Feb 05, 2021 10:37 am
by IT-OPS-SYS
We are running the latest version of Nagios XI and deployed the NCPA agent on a RHEL 7.x machine; the deployment completed successfully.

After adding the server to Nagios with the wizard, the status of the deployed agent shows "not available".
hostname: GVUBXE03_ProdCore_MEP
I logged in to the machine and checked the logs; the entries are below:

Feb 5 09:35:12 guvbxe03-admin ncpa_listener: [FAILED]
Feb 5 09:35:12 guvbxe03-admin systemd: Stopped LSB: This manages the NCPA Listener service.
Feb 5 09:35:12 guvbxe03-admin systemd: Starting LSB: This manages the NCPA Listener service...
Feb 5 09:35:12 guvbxe03-admin ncpa_listener: Starting NCPA Listener: another instance seems to be running (pid 4730), exiting
Feb 5 09:35:12 guvbxe03-admin ncpa_listener: [FAILED]
Feb 5 09:35:12 guvbxe03-admin systemd: Started LSB: This manages the NCPA Listener service.
Feb 5 09:35:14 guvbxe03-admin ansible-package_facts: Invoked with manager=['auto'] strategy=first
Feb 5 09:35:16 guvbxe03-admin ansible-firewalld: Invoked with icmp_block_inversion=None zone=None service=None masquerade=None icmp_block=None immediate=True source=None state=enabled permanent=True timeout=0 interface=None offline=True port=5693/tcp rich_rule=None

The ncpa_listener service is running fine.

Nagios XI logs:

[1612539082] wproc: Core Worker 29763: job 1199 (pid=15260): Dormant child reaped
[1612539082] wproc: Core Worker 29773: job 1199 (pid=15324) timed out. Killing it
[1612539082] wproc: CHECK job 1199 from worker Core Worker 29773 timed out after 60.01s
[1612539082] wproc: host=GVUBXE03_ProdCore_MEP; service=Swap Usage;
[1612539082] wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1612539082] Warning: Check of service 'Swap Usage' on host 'GVUBXE03_ProdCore_MEP' timed out after 60.007s!
[1612539082] wproc: Core Worker 29773: job 1199 (pid=15324): Dormant child reaped
[1612539083] wproc: Core Worker 29764: job 1200 (pid=15342) timed out. Killing it
[1612539083] wproc: CHECK job 1200 from worker Core Worker 29764 timed out after 60.01s
[1612539083] wproc: host=GVUBXE03_ProdCore_MEP; service=User Count;
[1612539083] wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1612539083] Warning: Check of service 'User Count' on host 'GVUBXE03_ProdCore_MEP' timed out after 60.008s!
[1612539083] wproc: Core Worker 29764: job 1200 (pid=15342): Dormant child reaped
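
The bracketed numbers in the XI log are Unix epoch seconds, which makes them awkward to line up against the syslog timestamps from the agent. A minimal sketch for converting them, assuming GNU date (as shipped on RHEL/CentOS); the function name nagios_log_times is made up for this example:

```shell
# Rewrite leading [epoch] stamps on nagios.log lines read from stdin
# as readable local times. Lines without such a stamp pass through unchanged.
# Assumes GNU date; the output time zone follows the TZ environment variable.
nagios_log_times() {
    while IFS= read -r line; do
        case "$line" in
            \[[0-9]*\]*)
                epoch=${line#\[}        # drop the leading "["
                epoch=${epoch%%\]*}     # keep everything before the first "]"
                rest=${line#*\]}        # text after the first "]"
                printf '[%s]%s\n' "$(date -d "@$epoch" '+%Y-%m-%d %H:%M:%S')" "$rest"
                ;;
            *)
                printf '%s\n' "$line"
                ;;
        esac
    done
}
```

For example, piping the log excerpt above through nagios_log_times with TZ=UTC turns [1612539082] into [2021-02-05 15:31:22].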

Re: NCPA deployed agent status is "not available"

Posted: Fri Feb 05, 2021 5:45 pm
by ssax
Please PM me a copy of your profile.zip, you can download it from Admin > System Profile by clicking the Download Profile button.

Those errors at the top of the agent log mean NCPA was already running on that machine: the start attempt found another instance (pid 4730) and exited.

What is the output of this command from the XI server against each of those IPs:

Code:

nmap -Pn -p 5693 X.X.X.X
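
An open port only shows that something is listening on 5693; it does not show that the agent answers API requests in time. A rough sketch of a follow-up check from the XI server, assuming the standard NCPA HTTPS API and that the token is the community_string value from the agent's ncpa.cfg. The helper names ncpa_api_url and ncpa_api_check are made up for this example:

```shell
# Build the URL for an NCPA API node (default: system/agent_version).
# Arguments: host, token, optional API node, optional port.
ncpa_api_url() {
    host="$1"; token="$2"; node="${3:-system/agent_version}"; port="${4:-5693}"
    printf 'https://%s:%s/api/%s?token=%s\n' "$host" "$port" "$node" "$token"
}

# Query the agent. -k accepts the self-signed certificate NCPA installs by
# default; --max-time stops a hung agent from blocking the shell.
ncpa_api_check() {
    curl -k -s --max-time 10 "$(ncpa_api_url "$@")"
}
```

Running ncpa_api_check X.X.X.X yourtoken should return a small JSON document if the agent is healthy. A timeout here, despite the open port, would be consistent with the 60-second check timeouts in the XI log.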

Re: NCPA deployed agent status is "not available"

Posted: Mon Feb 08, 2021 8:37 am
by IT-OPS-SYS
The output is below, and the system profile is attached.

[root@cvrmnagiosxi001 ~]# nmap -Pn -p 5693 149.24.172.172

Starting Nmap 6.47 ( http://nmap.org ) at 2021-02-08 08:35 EST
Nmap scan report for 149.24.172.172
Host is up (0.0100s latency).
PORT STATE SERVICE
5693/tcp open unknown

Nmap done: 1 IP address (1 host up) scanned in 0.06 seconds
[root@cvrmnagiosxi001 ~]# nmap -Pn -p 5693 149.24.41.83

Starting Nmap 6.47 ( http://nmap.org ) at 2021-02-08 08:35 EST
Nmap scan report for 149.24.41.83
Host is up (0.00018s latency).
PORT STATE SERVICE
5693/tcp open unknown

Nmap done: 1 IP address (1 host up) scanned in 0.05 seconds
[root@cvrmnagiosxi001 ~]# nmap -Pn -p 5693 149.24.217.191

Starting Nmap 6.47 ( http://nmap.org ) at 2021-02-08 08:35 EST
Nmap scan report for udctc191.ellucian.com (149.24.217.191)
Host is up (0.00027s latency).
PORT STATE SERVICE
5693/tcp open unknown

Nmap done: 1 IP address (1 host up) scanned in 0.04 seconds
[root@cvrmnagiosxi001 ~]#

Re: NCPA deployed agent status is "not available"

Posted: Mon Feb 08, 2021 5:41 pm
by ssax
Do those remote systems have selinux enabled?

Code:

sestatus

Check your PMs as well; I sent you a message with a command I want you to run.

Re: NCPA deployed agent status is "not available"

Posted: Tue Feb 09, 2021 12:17 am
by IT-OPS-SYS
The sestatus output for all the machines is given below:

SELinux status: enabled
SELinuxfs mount: /sys/fs/selinux
SELinux root directory: /etc/selinux
Loaded policy name: targeted
Current mode: enforcing
Mode from config file: enforcing
Policy MLS status: enabled
Policy deny_unknown status: allowed
Max kernel policy version: 31

But if SELinux is enabled on all the systems, why does only one system show a deployed agent status of "Not available"?

Re: NCPA deployed agent status is "not available"

Posted: Tue Feb 09, 2021 7:19 pm
by ssax
They could have different selinux policies applied.

Do you see any selinux messages in /var/log/messages that could be related to blocking ncpa from functioning?
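
One way to look for such messages, as a sketch rather than an exhaustive check: filter the log for SELinux denial lines that mention ncpa. The function name ncpa_selinux_denials is made up, and the patterns assume the usual "avc: ... denied" kernel/audit wording and setroubleshoot entries:

```shell
# Print SELinux denial lines that mention ncpa from a log file.
# Argument: path to the log (defaults to /var/log/messages).
# Returns non-zero when nothing matches.
ncpa_selinux_denials() {
    log_file="${1:-/var/log/messages}"
    grep -i -E 'avc:.*denied|setroubleshoot' "$log_file" | grep -i 'ncpa'
}
```

Note that on RHEL 7 with auditd running, raw AVC denials are normally written to /var/log/audit/audit.log rather than /var/log/messages, so it is worth running the function against that file too (as root), or using ausearch -m avc -ts recent.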

You can try this on the remote system (setenforce 0 puts SELinux into permissive mode until you set it back or reboot):

Code:

setenforce 0
systemctl restart ncpa_listener ncpa_passive

Then go to Configure > Deployment Settings and change the Status Check Interval to .01.

Wait 5 minutes, then check the XI server again to see whether the agent status is now showing properly.

Then go to Configure > Deployment Settings and change the Status Check Interval back to 24 (the default) or whatever you'd like.

To put SELinux back into enforcing mode afterwards, run:

Code:

setenforce 1