NCPA client stopped working on server reboot

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
djcomber
Posts: 43
Joined: Sat Nov 28, 2015 5:22 am

Post by djcomber »

Hi,

I was getting an alert that one of my Ubuntu servers was running low on memory, so I stopped it, added another 1 GB of memory, and restarted the server. The NCPA client then failed, even though it had been working flawlessly for months.

I removed the config, made sure the client was running (it was), and added the server again via the NCPA config wizard, but I still got the same error. I then went to the NCPA web page for the server and got nothing.

Image

What do I do next? Should I, say, run the plugin in verbose mode? I'm not sure how to do that, or whether the problem is the server's NCPA client, the XI plugin, etc.

Thanks in advance
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: NCPA client stopped working on server reboot

Post by lmiltchev »

What is the version of the "check_ncpa.py" plugin that you are currently using?

Code: Select all

/usr/local/nagios/libexec/check_ncpa.py -V
Run one of the failing checks from the command line on the Nagios XI server in "verbose" mode, and show the output of it. Example:

Code: Select all

/usr/local/nagios/libexec/check_ncpa.py -H <client ip> -t <token> -P 5693 -M 'disk/logical/|/used_percent' -w 80 -c 90 -v
Is ncpa service running on the remote machine? Run the following commands on the client, and show the output:

Code: Select all

ps -ef | grep ncpa
/etc/init.d/ncpa_listener restart
/etc/init.d/ncpa_passive restart
ps -ef | grep ncpa
Be sure to check out our Knowledgebase for helpful articles and solutions!
djcomber
Posts: 43
Joined: Sat Nov 28, 2015 5:22 am

Re: NCPA client stopped working on server reboot

Post by djcomber »

Version is

Code: Select all

check_ncpa.py, Version 0.3.5
Running the verbose command:

Code: Select all

Connecting to: https://xx.xx.xx.xx:5693/api/disk/logical/|/used_percent/?token=xxxx&warning=80&critical=90&check=1
An error occurred:[Errno socket error] [Errno 111] Connection refused
I replaced the IP and token with x's.

Also note the command that XI says it is using:

Code: Select all

check_xi_ncpa_agent!-t 'xxxx' -P 5693 -M 'disk/logical/|/used_percent' -w 70 -c 90
Results from running the commands on the monitored server:
Image
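
Errno 111 (Connection refused) in the verbose output means nothing accepted the TCP connection on port 5693, which usually points at the agent on the monitored server rather than at the plugin or XI. One way to confirm this independently of check_ncpa.py is to request the same URL by hand. A minimal sketch, assuming the URL layout printed in the verbose output above; ncpa_check_url is a hypothetical helper name, and the host and token in the usage comment are placeholders:

```shell
#!/bin/sh
# Build an NCPA API URL in the same shape the plugin printed in verbose mode.
# Arguments: host, token, metric path, warning threshold, critical threshold.
# Port 5693 is the one used throughout this thread.
ncpa_check_url() {
    printf 'https://%s:5693/api/%s/?token=%s&warning=%s&critical=%s&check=1\n' \
        "$1" "$3" "$2" "$4" "$5"
}

# Usage (placeholders, not real values):
#   curl -k -sS --max-time 10 \
#       "$(ncpa_check_url 192.0.2.10 mytoken 'disk/logical/|/used_percent' 80 90)"
# -k skips certificate verification, on the assumption that the agent
# presents a self-signed certificate.
```

If curl also reports a refused connection, the listener is not accepting connections and the plugin is not at fault. Depending on the curl version, the "|" in the metric path may need to be written as %7C.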
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: NCPA client stopped working on server reboot

Post by lmiltchev »

By looking at the output of the "ps -ef | grep ncpa" command, I can tell that the ncpa agent is not running even though it said: "Started listener...".

For comparison, here's what I see on my test Ubuntu machine:

Code: Select all

root@ubuntu-test:~# uname -a
Linux ubuntu-test 3.13.0-24-generic #46-Ubuntu SMP Thu Apr 10 19:11:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
root@ubuntu-test:~# cat /etc/*release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04 LTS"
NAME="Ubuntu"
VERSION="14.04, Trusty Tahr"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 14.04 LTS"
VERSION_ID="14.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
root@ubuntu-test:~# ps -ef | grep ncpa
root      5777     1  0 May10 ?        00:00:04 ./ncpa_posix_listener --start
root      5793     1  0 May10 ?        00:00:02 /usr/local/ncpa/ncpa_posix_passive --start
root      6730  5304  0 09:18 pts/0    00:00:00 grep --color=auto ncpa
How did you install the agent? This ("/root/Development/ncpa/agent/...") is not a "default" path... Did you follow our documentation for installing NCPA?

https://assets.nagios.com/downloads/ncp ... g_NCPA.pdf

I would recommend removing NCPA from this box and reinstalling it. You can download the "Ubuntu/Debian" NCPA agent from here:

https://assets.nagios.com/downloads/ncpa/download.php

Basically, you will need to run (for the 64-bit one):

Code: Select all

cd /tmp
wget https://assets.nagios.com/downloads/ncpa/ncpa-1.8.1-1.amd64.deb
dpkg -i ncpa-1.8.1-1.amd64.deb
djcomber
Posts: 43
Joined: Sat Nov 28, 2015 5:22 am

Re: NCPA client stopped working on server reboot

Post by djcomber »

OK, interesting, as it was all working prior to the reboot. I didn't install it so I can't comment; however, it would have been done via the documentation, as this was one of our first monitored servers on Nagios.

Will remove and re-install.
djcomber
Posts: 43
Joined: Sat Nov 28, 2015 5:22 am

Re: NCPA client stopped working on server reboot

Post by djcomber »

Uninstall and re-install didn't do anything.

The package is installed in /usr/local/ncpa as it should be and was prior.

I'm not sure what the reference to /root/development/ncpa is, but you can see the pidfile is /usr/local/ncpa/var/ncpa_listener.pid

The output of /etc/init.d/ncpa_listener restart after the re-install is identical to the screenshot already provided.
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: NCPA client stopped working on server reboot

Post by lmiltchev »

I will have our developers look into this. Meanwhile, can you tell us the Ubuntu version and architecture that you are installing NCPA on?
djcomber
Posts: 43
Joined: Sat Nov 28, 2015 5:22 am

Re: NCPA client stopped working on server reboot

Post by djcomber »

Code: Select all

root@SUCGRDC01OTRS01:~# uname -a
Linux SUCGRDC01OTRS01 3.16.0-60-generic #80~14.04.1-Ubuntu SMP Wed Jan 20 13:37:48 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
root@SUCGRDC01OTRS01:~# cat /etc/*release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.3 LTS"
NAME="Ubuntu"
VERSION="14.04.3 LTS, Trusty Tahr"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 14.04.3 LTS"
VERSION_ID="14.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
root@SUCGRDC01OTRS01:~# ps -ef | grep ncpa
root     26155 26100  0 17:23 pts/0    00:00:00 grep --color=auto ncpa
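
A note on reading this output: the only line left is the grep command itself, so neither NCPA daemon is running. A minimal sketch of a filter that makes this explicit, assuming the daemon names ncpa_posix_listener and ncpa_posix_passive seen in the working listing earlier in the thread (count_ncpa_procs is a hypothetical helper name, not part of NCPA):

```shell
#!/bin/sh
# Read "ps -ef" output on stdin and print how many lines belong to the
# NCPA daemons. The pattern does not match the "grep ... ncpa" line that
# a plain "ps -ef | grep ncpa" always adds to its own output.
count_ncpa_procs() {
    grep -c -E 'ncpa_posix_(listener|passive)'
}

# Usage:
#   ps -ef | count_ncpa_procs
# grep -c prints 0 and returns a non-zero status when nothing matches.
```

A result of 2 would correspond to the healthy listing (listener plus passive); 0 corresponds to the output shown here.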
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: NCPA client stopped working on server reboot

Post by lmiltchev »

Let's try this one more time. Go to the /tmp directory and delete all "old" ncpa files.

Code: Select all

cd /tmp
rm -rf *ncpa*
Make a backup of the ncpa.cfg (if you have any custom checks/entries).

Code: Select all

mv /usr/local/ncpa/etc/ncpa.cfg /tmp/
Remove various ncpa packages/files.

Code: Select all

apt-get remove ncpa
rm -rf /etc/init.d/ncpa* /usr/local/ncpa /var/lib/dpkg/info/ncpa*
update-rc.d -f ncpa_listener remove
update-rc.d -f ncpa_passive remove
This is most probably not needed...but just in case... :)

Code: Select all

killall ncpa_posix_listener
killall ncpa_posix_passive
Reinstall ncpa.

Code: Select all

wget https://assets.nagios.com/downloads/ncpa/ncpa-1.8.1-1.amd64.deb
dpkg -i ncpa-1.8.1-1.amd64.deb
Run the following commands, and show us the output:

Code: Select all

dpkg -l | grep ncpa
ls /etc/init.d/ | grep ncpa
ps -ef | grep ncpa
djcomber
Posts: 43
Joined: Sat Nov 28, 2015 5:22 am

Re: NCPA client stopped working on server reboot

Post by djcomber »

I ran into issues running

Code: Select all

apt-get remove ncpa
So after a failed apt-get -f install, I googled and found that the "/boot full" issue is resolved by deleting old kernel files from /boot. After I removed all files in /boot except those for the current kernel version, apt-get -f install ran correctly.

Then I proceeded with your instructions, and I'm happy to report that after setting the token all was good.

I confirmed this by logging into the web interface on the server, where I saw stats, and then XI started reporting again.

Thanks for your help.
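
For anyone hitting the same full /boot problem, it can help to list which kernel packages are candidates for removal before deleting anything by hand. A minimal sketch, assuming Debian/Ubuntu package names of the form linux-image-<version>; old_kernel_pkgs is a hypothetical helper name, and the list it prints should be reviewed rather than removed blindly:

```shell
#!/bin/sh
# Read package names on stdin and print the versioned kernel image
# packages that do NOT belong to the given kernel release.
# $1: running kernel release, e.g. the output of "uname -r".
old_kernel_pkgs() {
    grep -E '^linux-image-[0-9]' | grep -v -F -- "$1"
}

# Usage:
#   df -h /boot        # see how full the partition is
#   dpkg-query -W -f='${Package}\n' 'linux-image-*' | old_kernel_pkgs "$(uname -r)"
```

The dpkg-query listing can include packages that were removed but not purged, so the result is only a starting point for deciding what to clean up.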