NCPA 2.1.6 on AIX is doing a dump on start

Keystone
Posts: 28
Joined: Wed Jan 17, 2018 12:09 pm

Post by Keystone »

AIX 7.1 (7100-05-* to be exact)

After upgrading from 2.0.6-4 to 2.1.6 we are seeing this type of issue:
https://github.com/NagiosEnterprises/ncpa/issues/504

Even after running this:
chown -R nagios /usr/local/ncpa
chgrp -R nagios /usr/local/ncpa
...to correct the ownership issue that occurred during the upgrade (the new default is root:system), we still have the issue where "ncpa_passive" runs but "ncpa_listener" does not.

This exists: /usr/local/ncpa/var/log/ncpa_passive.log
This file never gets created: /usr/local/ncpa/var/log/ncpa_listener.log

As the nagios user, I've confirmed that there is no permissions issue writing to "/usr/local/ncpa/var/log/ncpa_listener.log" by running:
date > /usr/local/ncpa/var/log/ncpa_listener.log
Note, I've already corrected the botched user/group ownership.
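
One way to double-check that the chown/chgrp above reached every file is to list anything still owned by someone else. This is only a sketch: `list_wrong_owner` is a hypothetical helper, not something NCPA ships, and it assumes a POSIX `find`.

```shell
# List every path under a directory that is NOT owned by the expected
# user and group. Prints nothing when ownership is consistent.
# list_wrong_owner is a hypothetical helper name, not part of NCPA.
list_wrong_owner() {
    dir=$1
    user=$2
    group=$3
    find "$dir" \( ! -user "$user" -o ! -group "$group" \) -print
}

# Usage on the AIX host, with the path and account from this post:
#   list_wrong_owner /usr/local/ncpa nagios nagios
```

An empty result would suggest ownership is not what is stopping the listener.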

When we start it as root:
$ /usr/local/ncpa/ncpa_listener --nodaemon --start
Memory fault(coredump)

When we analyze the core file:
$ /usr/lib/ras/check_core /core |tail -1
ncpa_listener

So currently we are stuck with the limitations of 2.0, desperately need to avoid the bugs from 2.1.1, and have no upgrade path. Please help.
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: NCPA 2.1.6 on AIX is doing a dump on start

Post by tgriep »

On AIX, you cannot start the NCPA agent like you do on a Linux system.
Doing that could cause issues.

To start NCPA, run this:

startsrc -e LIBPATH=/usr/local/ncpa -s ncpa_listener
startsrc -e LIBPATH=/usr/local/ncpa -s ncpa_passive

To stop NCPA, run this:

stopsrc -s ncpa_listener -f
stopsrc -s ncpa_passive -f 

Try starting it with the commands above and, if it fails, post the log file so we can view it:

/usr/local/ncpa/var/log/ncpa_listener.log
Be sure to check out our Knowledgebase for helpful articles and solutions!

Re: NCPA 2.1.6 on AIX is doing a dump on start

Post by Keystone »

Had the AIX engineer run:
stopsrc -s ncpa_listener -f
stopsrc -s ncpa_passive -f
startsrc -e LIBPATH=/usr/local/ncpa -s ncpa_listener
startsrc -e LIBPATH=/usr/local/ncpa -s ncpa_passive


The /usr/local/ncpa/var/log/ncpa_listener.log still has no data in it. Nothing is ever written to it. The only reason the file exists at all is that, as the nagios user, I ran "touch /usr/local/ncpa/var/log/ncpa_listener.log".


After he ran the above commands, a "ps -ef | grep ncpa" only shows one line:
nagios 13369508 3866810 0 15:39:47 - 0:00 /usr/local/ncpa/ncpa_passive -n

This is what the engineer showed me:
$ startsrc -e LIBPATH=/usr/local/ncpa -s ncpa_listener
0513-059 The ncpa_listener Subsystem has been started. Subsystem PID is 15794308.
root@<boxNameHidden>:/
$ startsrc -e LIBPATH=/usr/local/ncpa -s ncpa_passive
0513-059 The ncpa_passive Subsystem has been started. Subsystem PID is 13369508.
root@<boxNameHidden>:/
$
root@<boxNameHidden>/
$ lssrc -a |grep -i ncpa
ncpa_passive 13369508 active
ncpa_listener inoperative
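
The status check above can be scripted so it is easy to repeat after each start attempt. A minimal sketch: `subsystem_status` is a hypothetical helper, and it assumes the status is the last whitespace-separated field on the line, as in the `lssrc` output shown here.

```shell
# Read lssrc-style output on stdin and print the status of one subsystem.
# subsystem_status is a hypothetical helper name; it assumes the status
# is the last field on the subsystem's line.
subsystem_status() {
    awk -v name="$1" '$1 == name { print $NF }'
}

# Usage on the AIX host:
#   lssrc -a | subsystem_status ncpa_listener
```

With the output above, this would report the passive subsystem as active and the listener as inoperative.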

Re: NCPA 2.1.6 on AIX is doing a dump on start

Post by tgriep »

Let's see if we can get the listener to output an error instead of just a core dump.

Change to the /usr/local/ncpa folder and run the agent without any command line options, like this:

./ncpa_listener
If it outputs any errors or messages, post them here.
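
When a foreground run prints nothing useful, the exit status can at least show whether the process exited on its own or was killed by a signal. The sketch below is an assumption-laden helper, not an NCPA tool: `run_and_report` is a hypothetical name, and it relies on the POSIX convention that a shell reports death by signal as a status above 128, which `kill -l` can map back to a signal name.

```shell
# Run a command in the foreground and report how it ended.
# run_and_report is a hypothetical helper name, not part of NCPA.
run_and_report() {
    status=0
    "$@" || status=$?
    if [ "$status" -gt 128 ]; then
        # A status above 128 means the command was killed by a signal;
        # kill -l translates that status into the signal name.
        echo "killed by signal $(kill -l "$status")"
    else
        echo "exited with status $status"
    fi
}

# Usage on the AIX host:
#   cd /usr/local/ncpa && run_and_report ./ncpa_listener
```

A "Memory fault(coredump)" message from the shell typically corresponds to SEGV here, which points at the binary or its libraries rather than at configuration or permissions.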

Re: NCPA 2.1.6 on AIX is doing a dump on start

Post by Keystone »

AIX admin stated:
This is all I see when I run the below command.

root@sandboxrk1p:/usr/local/ncpa
$ ./ncpa_listener
Memory fault(coredump)

Re: NCPA 2.1.6 on AIX is doing a dump on start

Post by tgriep »

I suggest removing the agent from the server, deleting the whole directory where the agent was installed, and then installing the newest agent again.
I have a feeling the upgrade did not update all of the files, so the agent is picking up old files from the folder and that is causing the core dump.
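
Before deleting the directory, a rough way to look for leftovers from the old version is to list files that are not newer than a file known to come from the new package. This is only a heuristic sketch: `list_not_newer` is a hypothetical helper, the choice of reference file is an assumption, and packaged files can carry build-time timestamps, so treat the result as a hint rather than proof.

```shell
# List regular files under a directory whose modification time is not
# newer than a reference file. list_not_newer is a hypothetical helper
# name, not part of NCPA.
list_not_newer() {
    dir=$1
    ref=$2
    find "$dir" -type f ! -newer "$ref" -print
}

# Usage on the AIX host (the reference file here is an assumption;
# pick any file you know the 2.1.6 upgrade replaced):
#   list_not_newer /usr/local/ncpa /usr/local/ncpa/ncpa_listener
```

If the reference file lives inside the directory, it will appear in its own listing, since a file is never newer than itself.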