Nagios service constantly exited

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
safuanmansor
Posts: 59
Joined: Mon Jul 16, 2018 9:16 pm

Re: Nagios service constantly exited

Post by safuanmansor »

Hi benjaminsmith,

As suggested, we have increased the nproc value in /etc/security/limits.conf and also in /etc/security/limits.d/20-nproc.conf, based on the web post at https://www.thegeekdiary.com/how-to-set ... -rhel-567/

Currently the nagios service is no longer exiting, and we will continue to monitor it.
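
For anyone wanting to double-check which nproc entries are in place, a minimal sketch follows. The function name show_nproc_limits is made up for this example; it only assumes the usual "domain type item value" column layout of limits files.

```shell
# Print the nproc entries from one or more limits-style files,
# skipping commented lines. Columns are: domain, type, item, value.
# Usage: show_nproc_limits FILE...
show_nproc_limits() {
    awk '!/^[[:space:]]*#/ && $3 == "nproc" { print FILENAME ": " $1, $2, $4 }' "$@"
}
```

It could be called as show_nproc_limits /etc/security/limits.conf /etc/security/limits.d/20-nproc.conf. Note that the limits only apply to sessions and services started after the change.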
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Nagios service constantly exited

Post by benjaminsmith »

Hi Safuan,

That's good to hear. We'll keep this open for now; if you have any issues, please provide a fresh system profile for us to review.

Thanks,
Benjamin
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Be sure to check out our Knowledgebase for helpful articles and solutions!
safuanmansor
Posts: 59
Joined: Mon Jul 16, 2018 9:16 pm

Re: Nagios service constantly exited

Post by safuanmansor »

Hi Benjamin,
For the last hour we have been seeing graphs fail at random on Nagios.

Nagios has not restarted.
There is no nproc error.
Yet the graph failed.

We can see a long flat line in the perfdata, stuck at the same value, where it is supposed to be curved.
dr-db-dr-zone1-rib-rac1-online_banking_concurrent_users3 (1).jpg
We would appreciate your advice on this. The latest profile has been sent via PM.

Thanks
Safuan
safuanmansor
Posts: 59
Joined: Mon Jul 16, 2018 9:16 pm

Re: Nagios service constantly exited

Post by safuanmansor »

We also see behavior where the check events appear to restart, based on the active check statistics.
IMG-20211109-WA0000.jpg
The count climbs to 38k over a period of time, then slowly drops to less than 200 and climbs again. The graphs normally do not update while this is happening.
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Nagios service constantly exited

Post by benjaminsmith »

Hi,

This looks like a load issue. I noticed the service check timeout has been increased significantly over the default. Open up /usr/local/nagios/etc/nagios.cfg and change this back to 60 seconds. Service checks that cannot be completed within a reasonable time need to be stopped to avoid too many simultaneous processes. The file currently contains:
#service_check_timeout=60
service_check_timeout=600
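
To confirm which timeout value is actually active after editing, something like the sketch below may help. The function name active_service_check_timeout is hypothetical; it just prints the value of every service_check_timeout line that is not commented out.

```shell
# Print the value of every uncommented service_check_timeout line
# in the given Nagios config file.
# Usage: active_service_check_timeout /usr/local/nagios/etc/nagios.cfg
active_service_check_timeout() {
    awk -F= '/^[[:space:]]*service_check_timeout[[:space:]]*=/ {
                 gsub(/[[:space:]]/, "", $2)   # drop stray whitespace around the value
                 print $2
             }' "$1"
}
```

If this prints more than one value, the directive is defined more than once and the duplicates are worth cleaning up.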
The performance graphing defaults have been increased, but I would recommend reducing the max load threshold to 75. Open up /usr/local/nagios/etc/pnp/npcd.cfg and change the following to:

load_threshold = 75.0
Then restart Nagios Core and NPCD:

systemctl restart nagios
systemctl restart npcd
Let me know if that helps. I would recommend that your company start planning for an additional XI server and break this system up into multiple servers to help decrease the load.
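
As a rough way to see how the current load compares with the threshold above, here is a small sketch. It assumes a Linux host with /proc/loadavg, and the function name load_exceeds is made up for this example. As far as I understand it, NPCD holds off on processing perfdata while the load is above load_threshold, which would fit with graphs going flat under heavy load.

```shell
# Return success (exit status 0) if the 1-minute load average is above
# the given threshold. The optional second argument is the loadavg file
# to read (defaults to /proc/loadavg).
load_exceeds() {
    threshold=$1
    file=${2:-/proc/loadavg}
    awk -v t="$threshold" 'NR == 1 { found = ($1 > t) } END { exit !found }' "$file"
}
```

For example: if load_exceeds 75.0; then echo "load is above the npcd load_threshold"; fi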

Regards,
--Benjamin
safuanmansor
Posts: 59
Joined: Mon Jul 16, 2018 9:16 pm

Re: Nagios service constantly exited

Post by safuanmansor »

Hi benjaminsmith,

The suggestion to split the checks onto another XI server is being considered, but it will take some time as it needs to go through a long internal process.

We hit another error that we saw for the first time today.

[1636517282] NDO-3: ndo_return = 1 (Statement not prepared)
[1636517282] NDO-3: ndo_handle_comment(ndo-handlers.c:618): Unable to bind parameters
[1636517282] NDO-3: Query failed in ndo_empty_queue_comment
[1636517282] NDO-3: ndo_return = 1 (Statement not prepared)
[1636517282] NDO-3: ndo_handle_comment(ndo-handlers.c:634): Unable to bind parameters
[1636517282] NDO-3: Query failed in ndo_empty_queue_comment
[1636517282] NDO-3: ndo_return = 1 (Statement not prepared)
[1636517282] NDO-3: ndo_handle_comment(ndo-handlers.c:618): Unable to bind parameters
[1636517282] NDO-3: Query failed in ndo_empty_queue_comment
[1636517282] NDO-3: Ended event_handler thread
[1636517282] NDO-3: ndo_return = 1 (Statement not prepared)
[1636517282] NDO-3: ndo_handle_comment(ndo-handlers.c:634): Unable to bind parameters
[1636517282] NDO-3: Query failed in ndo_empty_queue_comment
[1636517282] NDO-3: ndo_return = 1 (Statement not prepared)
[1636517282] NDO-3: ndo_handle_contact_notification(ndo-handlers.c:1320): Unable to bind parameters
[1636517282] NDO-3: Query failed in ndo_empty_queue_notification (handle_contact_notification)
[1636517282] NDO-3: ndo_return = 1 (Statement not prepared)
[1636517282] NDO-3: ndo_handle_comment(ndo-handlers.c:618): Unable to bind parameters
[1636517282] NDO-3: Query failed in ndo_empty_queue_comment

The only solution I have seen on the forum is downgrading from ndo3 to ndo2db.
What is the downside of this downgrade? Would we lose any features of the latest version of Nagios?

Based on the article below:
https://support.nagios.com/kb/article/u ... i-885.html
there is a situation where we can upgrade from ndo2db to ndo3. So could reinstalling ndo3 be a solution instead of downgrading?

Thanks,
Safuan
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Nagios service constantly exited

Post by benjaminsmith »

Hi Safuan,

Let's try to stop everything, do a database repair, and restart. Please log in as root and run the following:

systemctl stop npcd
systemctl stop nagios
systemctl stop ndo2db
systemctl stop crond
pkill -9 -u nagios
echo "truncate table xi_events; truncate table xi_meta; truncate table xi_eventqueue;" | mysql -u root -pnagiosxi nagiosxi
mysqlcheck -f -r -u root -pnagiosxi --all-databases --use-frm
if grep --quiet pgsql /usr/local/nagiosxi/html/config.inc.php; then systemctl stop postgresql; fi;
systemctl restart mariadb
rm -f /usr/local/nagios/var/rw/nagios.cmd
rm -f /usr/local/nagios/var/nagios.lock
rm -f /var/run/nagios.lock
rm -f /usr/local/nagios/var/ndo.sock
rm -f /usr/local/nagios/var/ndo2db.lock
rm -f /var/lib/mrtg/mrtg_l
rm -f /usr/local/nagiosxi/var/*.lock
rm -f /usr/local/nagiosxi/tmp/*.lock
for i in `ipcs -q | grep nagios |awk '{print $2}'`; do ipcrm -q $i; done
pkill python
if grep --quiet pgsql /usr/local/nagiosxi/html/config.inc.php; then service postgresql start; fi;
systemctl restart httpd
systemctl start ndo2db
systemctl start nagios
systemctl start npcd
systemctl start crond
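
Once the sequence has finished, a quick check that everything came back up might look like the sketch below. The function name report_services and the SYSTEMCTL override variable are made up for this example (the override only exists so the function can be exercised with a stub); the underlying call is the standard systemctl is-active --quiet.

```shell
# Report whether each named service is active.
# Returns non-zero if any of them is not.
# SYSTEMCTL may be set to a stub command for testing; it defaults to systemctl.
report_services() {
    rc=0
    for svc in "$@"; do
        if "${SYSTEMCTL:-systemctl}" is-active --quiet "$svc"; then
            echo "$svc: active"
        else
            echo "$svc: NOT active"
            rc=1
        fi
    done
    return "$rc"
}
```

For the services touched above it could be called as report_services mariadb httpd ndo2db nagios npcd crond.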
Then check the nagios logs again. If that doesn't work we can try downgrading to ndo2db (it's relatively easy to downgrade and upgrade again at a later date).

# STANDARD DOWNGRADE OF NDO3

systemctl stop nagios
cd /tmp
rm -rf /tmp/nagiosxi
wget https://assets.nagios.com/downloads/nagiosxi/5/xi-5.6.14.tar.gz
tar zxf xi-5.6.14.tar.gz
cd /tmp/nagiosxi/subcomponents/ndoutils
./install
systemctl enable ndo2db
Then edit your /usr/local/nagios/etc/nagios.cfg and make sure this line is uncommented:

broker_module=/usr/local/nagios/bin/ndomod.o config_file=/usr/local/nagios/etc/ndomod.cfg
Make sure this line is commented out:

#broker_module=/usr/local/nagios/bin/ndo.so /usr/local/nagios/etc/ndo.cfg
Then start the nagios and ndo2db services:

systemctl start nagios
systemctl start ndo2db
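
To verify which broker module is enabled after the edit, a small sketch like this could be used. The function name active_broker_modules is hypothetical; it simply lists the broker_module lines that are not commented out.

```shell
# List the broker_module lines in a Nagios config file that are
# not commented out. Exits non-zero if there are none.
# Usage: active_broker_modules /usr/local/nagios/etc/nagios.cfg
active_broker_modules() {
    grep -E '^[[:space:]]*broker_module=' "$1"
}
```

After the downgrade, the output would be expected to show the ndomod.o line and not the ndo.so line.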
--Benjamin
safuanmansor
Posts: 59
Joined: Mon Jul 16, 2018 9:16 pm

Re: Nagios service constantly exited

Post by safuanmansor »

Hi benjaminsmith,

After applying the suggested setting to reduce service_check_timeout, Nagios seems to be stable without sudden exits, although NDO is still crashing according to the log files. The suggested downgrade to ndo2db is a proven workaround, as tested on our test server.

Is this the only solution for it at the moment?

Thanks,
Safuan
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Nagios service constantly exited

Post by benjaminsmith »

Hi Safuan,
"Is this the only solution for it at the moment?"
For now, I would recommend staying on ndo2db. We will be making some more updates to ndo3 in a coming release that should help resolve table migration errors, so let's try to upgrade again at that time. It's not very difficult to upgrade or downgrade the backend database application.

Let me know if that sounds alright for you.

Regards,
Benjamin
safuanmansor
Posts: 59
Joined: Mon Jul 16, 2018 9:16 pm

Re: Nagios service constantly exited

Post by safuanmansor »

Hi Ben,

Yeah, sounds right to me. Thanks for the support. You may close this thread.

Regards,
Safuan
Locked