NCPA deployed agent status is "not available"
Posted: Fri Feb 05, 2021 10:37 am
We are running the latest version of Nagios XI and have successfully deployed the NCPA agent on a RHEL 7.x machine.
After adding the server to Nagios using the wizard, the status of the deployed agent shows "not available".
hostname: GVUBXE03_ProdCore_MEP
I logged in to the machine and checked the logs; the entries are below:
Feb 5 09:35:12 guvbxe03-admin ncpa_listener: [FAILED]
Feb 5 09:35:12 guvbxe03-admin systemd: Stopped LSB: This manages the NCPA Listener service.
Feb 5 09:35:12 guvbxe03-admin systemd: Starting LSB: This manages the NCPA Listener service...
Feb 5 09:35:12 guvbxe03-admin ncpa_listener: Starting NCPA Listener: another instance seems to be running (pid 4730), exiting
Feb 5 09:35:12 guvbxe03-admin ncpa_listener: [FAILED]
Feb 5 09:35:12 guvbxe03-admin systemd: Started LSB: This manages the NCPA Listener service.
Feb 5 09:35:14 guvbxe03-admin ansible-package_facts: Invoked with manager=['auto'] strategy=first
Feb 5 09:35:16 guvbxe03-admin ansible-firewalld: Invoked with icmp_block_inversion=None zone=None service=None masquerade=None icmp_block=None immediate=True source=None state=enabled permanent=True timeout=0 interface=None offline=True port=5693/tcp rich_rule=None
The NCPA listener service is running fine.
Nagios XI logs:
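One way to confirm that from the Nagios XI side is a plain TCP connect to the agent port (5693/tcp, per the firewalld entry above). This is only a rough sketch: the function name `port_is_open` is made up for illustration, and it only tells you whether the port accepts connections, not whether NCPA itself answers correctly.

```python
import socket


def port_is_open(host: str, port: int = 5693, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts and name resolution failures
        return False


# Example call, run from the Nagios XI server; replace the placeholder
# with the address configured for the host in the wizard:
#   port_is_open("agent-host.example")
```

If this returns False from the Nagios XI server but True when run on the agent machine against 127.0.0.1, the problem is more likely in the network path or firewall than in the listener.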
[1612539082] wproc: Core Worker 29763: job 1199 (pid=15260): Dormant child reaped
[1612539082] wproc: Core Worker 29773: job 1199 (pid=15324) timed out. Killing it
[1612539082] wproc: CHECK job 1199 from worker Core Worker 29773 timed out after 60.01s
[1612539082] wproc: host=GVUBXE03_ProdCore_MEP; service=Swap Usage;
[1612539082] wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1612539082] Warning: Check of service 'Swap Usage' on host 'GVUBXE03_ProdCore_MEP' timed out after 60.007s!
[1612539082] wproc: Core Worker 29773: job 1199 (pid=15324): Dormant child reaped
[1612539083] wproc: Core Worker 29764: job 1200 (pid=15342) timed out. Killing it
[1612539083] wproc: CHECK job 1200 from worker Core Worker 29764 timed out after 60.01s
[1612539083] wproc: host=GVUBXE03_ProdCore_MEP; service=User Count;
[1612539083] wproc: early_timeout=1; exited_ok=0; wait_status=0; error_code=62;
[1612539083] Warning: Check of service 'User Count' on host 'GVUBXE03_ProdCore_MEP' timed out after 60.008s!
[1612539083] wproc: Core Worker 29764: job 1200 (pid=15342): Dormant child reaped
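The bracketed numbers in the Nagios XI log are Unix epoch seconds, so they are hard to line up with the syslog entries by eye. Below is a small sketch that pulls out the "timed out" warnings and prints the time in UTC. The regex is written only against the warning lines shown above, and `parse_timeouts` is a name made up for this example, not part of Nagios.

```python
import re
from datetime import datetime, timezone

# Matches warning lines of the form seen in the log excerpt above.
TIMEOUT_RE = re.compile(
    r"^\[(\d+)\] Warning: Check of service '([^']+)' on host '([^']+)' "
    r"timed out after ([\d.]+)s!"
)


def parse_timeouts(lines):
    """Return (utc_time, host, service, seconds) for each timeout warning."""
    results = []
    for line in lines:
        match = TIMEOUT_RE.match(line.strip())
        if match:
            epoch, service, host, seconds = match.groups()
            when = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
            results.append((when.isoformat(), host, service, float(seconds)))
    return results


sample = [
    "[1612539082] wproc: host=GVUBXE03_ProdCore_MEP; service=Swap Usage;",
    "[1612539082] Warning: Check of service 'Swap Usage' on host "
    "'GVUBXE03_ProdCore_MEP' timed out after 60.007s!",
    "[1612539083] Warning: Check of service 'User Count' on host "
    "'GVUBXE03_ProdCore_MEP' timed out after 60.008s!",
]

for row in parse_timeouts(sample):
    print(row)
```

Converted this way, both warnings fall at 15:31 UTC on 5 February 2021, which can then be compared with the syslog times once the agent machine's time zone is taken into account.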