service already deleted in CCM but Nagios still monitoring?!

Posted: Wed Jan 22, 2020 12:55 pm
by xpertech
The services below have already been deleted in CCM, but Nagios XI is still monitoring them.

- Host: DEFRALCCDAP11P
  Service: CCDB ActiveMq defralccdap11p GREGEMAIL.QUEUE

- Host: DEFRALCCDAP12P
  Service: CCDB ActiveMq defralccdap12p GREGEMAIL.QUEUE

Re: service already deleted in CCM but Nagios still monitori

Posted: Wed Jan 22, 2020 2:37 pm
by scottwilkerson
Can we confirm that there are not multiple nagios parent processes running?

ps -ef|grep nagios.cfg
This could also be the cause of the issue in your other thread:
https://support.nagios.com/forum/viewto ... 191#302109
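
The raw grep output also lists the grep command itself and any child that shares the daemon's command line, so it can be hard to read. The following is a minimal sketch, not an official Nagios tool: the helper name `count_nagios_parents` is hypothetical, and it assumes that a normally started daemon is re-parented to PID 1. A result greater than 1 would suggest that more than one parent process is running.

```shell
#!/bin/sh
# Count Nagios daemons started with nagios.cfg whose parent is PID 1.
# Children forked by the daemon have the daemon as their parent, so they
# are not counted. Assumption: the daemon is re-parented to PID 1.
# Reads "pid ppid args" lines on stdin, e.g. from: ps -eo pid,ppid,args
count_nagios_parents() {
    awk '$2 == 1 && /nagios\.cfg/ { n++ } END { print n + 0 }'
}

# Usage on a live system (uncomment to run):
# ps -eo pid,ppid,args | count_nagios_parents
```

If the count is higher than 1, stopping the service and killing the leftover processes before starting it again is the usual next step.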

Re: service already deleted in CCM but Nagios still monitori

Posted: Thu Jan 30, 2020 12:17 pm
by xpertech
After executing the command ... (please see PM also)

Re: service already deleted in CCM but Nagios still monitori

Posted: Thu Jan 30, 2020 5:13 pm
by tgriep
It could be that the objects in the Core Config Manager are not in sync with the running config, and that is why you still see the hosts in the Details menu.

To fix that, log in to the XI GUI and go to the Core Config Manager.
Under "Tools", click "Write Config Files"; if you are running a newer version of XI, the menu is called "Config File Management".
Click the "Write" button, then the "Delete" button, then the "Write" button again, and then the "Verify" button.
If you get any errors, resolve them and repeat "Delete", "Write", "Verify" until all of the errors are resolved.
After ALL of the errors are resolved, click the Apply Configuration link and then the "Apply Configuration" button.
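
To see whether a deleted service is still present in the files Nagios reads from disk, something like the sketch below may help. `find_service_refs` is a hypothetical helper, and the two paths in the usage comment are the usual Nagios XI defaults, which is an assumption about this installation.

```shell
#!/bin/sh
# List the files that still mention a given service description.
# $1 = service description (matched as a fixed string, not a regex)
# $2... = files or directories to search recursively
# Exit status is non-zero when nothing matches.
find_service_refs() {
    desc=$1
    shift
    grep -rlF -- "$desc" "$@" 2>/dev/null
}

# Usage (paths are assumed defaults, adjust as needed):
# find_service_refs 'CCDB ActiveMq defralccdap11p GREGEMAIL.QUEUE' \
#     /usr/local/nagios/etc/services /usr/local/nagios/var/objects.cache
```

If the description still shows up in the object cache after the configuration has been applied, the running process is probably not using the newly written files.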

Re: service already deleted in CCM but Nagios still monitori

Posted: Thu Jan 30, 2020 5:22 pm
by tgriep
I just found your PM from earlier, and it looks like there is a stuck reconfigure process and a lot of defunct checks.

Either reboot the server or run these commands to stop the processes, clean and repair the SQL database, and restart the processes. Run them all as root.

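# Stop the Nagios XI services and cron so nothing respawns during cleanup
service npcd stop
service nagios stop
service ndo2db stop
service crond stop
# Kill any remaining processes owned by the nagios user
pkill -9 -u nagios
# Empty the XI event tables, then repair all databases
echo "truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;" | mysql -u root -pnagiosxi nagiosxi
mysqlcheck -f -r -u root -pnagiosxi --all-databases --use-frm
service mysqld restart
# Remove the stale command pipe, socket, and lock files
rm -f /usr/local/nagios/var/rw/nagios.cmd
rm -f /usr/local/nagios/var/nagios.lock
rm -f /var/run/nagios.lock
rm -f /usr/local/nagios/var/ndo.sock
rm -f /usr/local/nagios/var/ndo2db.lock
rm -f /var/lib/mrtg/mrtg_l
rm -f /usr/local/nagiosxi/var/*.lock
# Remove leftover message queues that belong to nagios
for i in $(ipcs -q | grep nagios | awk '{print $2}'); do ipcrm -q "$i"; done
# Note: this kills every python process on the host
pkill python
# Bring everything back up
service httpd restart
service ndo2db start
service nagios start
service npcd start
service crond start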
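
To confirm whether defunct checks are piling up before (or after) running the cleanup, a rough count of zombie processes owned by the nagios user can be taken. This is only a sketch: `count_defunct` is a hypothetical helper, and it assumes the Linux `ps` column layout shown in the usage comment.

```shell
#!/bin/sh
# Count defunct (zombie) processes for one user.
# $1 = user name (defaults to nagios)
# Reads "user stat args" lines on stdin, e.g. from: ps -eo user,stat,args
# A process state starting with Z means the process is defunct.
count_defunct() {
    awk -v u="${1:-nagios}" '$1 == u && $2 ~ /^Z/ { n++ } END { print n + 0 }'
}

# Usage on a live system (uncomment to run):
# ps -eo user,stat,args | count_defunct nagios
```

A count that keeps growing between runs would point to checks that are not being reaped.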

Re: service already deleted in CCM but Nagios still monitori

Posted: Mon Feb 03, 2020 9:44 am
by xpertech
CPU usage went back down after I manually ran the commands you recommended, but it still spikes again frequently. Here are the records:

Date   CPU high   CPU low   Note
1/13   12:11      12:12     CPU dropped on its own
1/16   12:10      12:11     CPU dropped on its own
1/19   12:09      14:14     CPU dropped on its own
1/20   06:32      06:40     CPU dropped on its own
1/20   06:42      07:04     CPU dropped on its own
1/21   12:10      12:14     CPU dropped on its own
1/22   04:13      09:50     had to run the commands manually
1/30   12:15      12:16     CPU dropped on its own
1/31   12:08      12:20     CPU dropped on its own

Is it possible to locate the problem from the logs, configs, database, etc.?

Re: service already deleted in CCM but Nagios still monitori

Posted: Mon Feb 03, 2020 10:17 am
by tgriep
Did the service go away after running the commands?

I don't understand what you mean by this:
"CPU usage went back down after I manually ran the commands you recommended, but it still spikes again frequently.
Date   CPU high   CPU low
1/13   12:11      12:12 (CPU dropped on its own)
......"
Can you provide more details?

The current Nagios log file is

/usr/local/nagios/var/nagios.log
and the historical log files are in this folder.

/usr/local/nagios/var/archives
You can look at them if needed.
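
When looking through those files, it may help to pull out only the lines that mark a restart or shutdown and compare them with the times of the CPU spikes. The sketch below is a hedged example: `restart_events` is a hypothetical helper, and the patterns are the restart and startup messages Nagios Core normally writes, so other relevant messages may be missed.

```shell
#!/bin/sh
# Print log lines that mark a Nagios restart, shutdown, or startup.
# $@ = one or more Nagios log files
restart_events() {
    grep -E 'Caught SIG(HUP|TERM)|Nagios [0-9.]+ starting' "$@"
}

# Usage (uncomment to run):
# restart_events /usr/local/nagios/var/nagios.log
# Each line starts with an epoch timestamp in brackets; with GNU date,
# date -d @1579694000 converts one to a readable time.
```

If restart lines line up with the spike times, the spikes are likely tied to a config apply or service restart.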

Re: service already deleted in CCM but Nagios still monitori

Posted: Mon Feb 03, 2020 11:36 am
by xpertech
"CPU usage went back down after I manually ran the commands you recommended, but it still spikes again frequently.
Date   CPU high   CPU low
1/13   12:11      12:12 (CPU dropped on its own)
......"

That means the high CPU usage usually drops on its own after a few minutes or hours.
For example, on 1/13 the CPU went high at 12:11 and came back down by itself one minute later (12:12).
This happens roughly every one to three days.

Re: service already deleted in CCM but Nagios still monitori

Posted: Mon Feb 03, 2020 3:40 pm
by tgriep
I guess you mean that after restarting the nagios process, the CPU load was high for a few minutes and then went low. Is that correct?

That is normal, as the nagios process is re-saving all of the updates, re-scheduling the checks, etc. after it is restarted.
When someone applies the config, that restarts the process as well, so that could be why you see it once in a while.

Re: service already deleted in CCM but Nagios still monitori

Posted: Tue Feb 04, 2020 10:17 am
by xpertech
I mean we would like to know:
1. Why does the CPU sometimes go high for a few minutes and then come back down on its own? What causes that?
2. Why does it sometimes not come down, so that we have to intervene manually to bring it down?
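
One way to start narrowing down both questions is to record which processes are busiest while a spike is happening. The following is a minimal sketch, not a Nagios feature: `top_cpu` is a hypothetical helper, the `ps` column layout is an assumption, and the output file in the usage comment is only an example.

```shell
#!/bin/sh
# Print the N busiest processes by CPU percentage.
# $1 = number of rows to keep (defaults to 5)
# Reads "pcpu pid user args" lines with a header row on stdin,
# e.g. from: ps -eo pcpu,pid,user,args
top_cpu() {
    n=${1:-5}
    # Drop the header, sort numerically on the first column, highest first
    sed 1d | LC_ALL=C sort -k1,1 -rn | head -n "$n"
}

# Usage during a spike (uncomment to run; output path is an example):
# { date; ps -eo pcpu,pid,user,args | top_cpu 5; } >> /tmp/cpu-snapshots.log
```

Running this from cron every minute around the times listed in the table would show whether the load comes from nagios itself or from another process.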