Nagios Support Forum

Posted: **Tue Oct 26, 2021 10:14 am**

We are running Nagios XI 5.8.6 and the monitoring engine keeps dying randomly. I'm not finding any errors in the nagios log.
I've done some digging in the forum, but every similar thread seems to have been resolved on a case-by-case basis.

Posted: **Tue Oct 26, 2021 4:47 pm**

Hello @isadmin

Thanks for reaching out. We would like to take a look at the journal for the nagios service and at the System Profile so we can see what is going on. Please capture the journal with:

Code: Select all

journalctl -u nagios.service -o verbose > /tmp/journal.txt

To send us your System Profile:

Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Save the profile.zip file and share both '/tmp/journal.txt' and profile.zip in a private message.

Thanks,
Perry

Posted: **Wed Oct 27, 2021 10:32 am**

Perry
I sent the journal via PM, but the profile is over 20 MB and cannot be sent via PM. Is there another way I can get that file to you?

Posted: **Wed Oct 27, 2021 10:35 am**

Perry, on a separate note, we have to keep deleting /var/log/php-fpm/www-error.log because it keeps filling the drive.

Posted: **Wed Oct 27, 2021 11:30 am**

Hello @isadmin

Thanks for following up. Please use the split command to break the profile into smaller pieces and PM them separately.

Code: Select all

split -b 45M profile.zip  /tmp/systemprofile

Please send each resulting '/tmp/systemprofile' file (split appends an incrementing alphabetic suffix such as aa, ab, ac to the name) in a separate PM.
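
For anyone following along, here is a minimal sketch of how the split pieces can be put back together and checked on the receiving side. It uses a small throwaway file in a temporary directory as a stand-in for profile.zip, and a 45k chunk size instead of 45M so it runs quickly; the file names under the temp directory are only illustrative.

```shell
#!/bin/sh
# Sketch: split a file into chunks, then confirm the chunks reassemble
# to a byte-identical copy. A demo file stands in for profile.zip.
set -eu

workdir=$(mktemp -d)

# 100 KiB of random data as a stand-in for the real profile.zip
head -c 102400 /dev/urandom > "$workdir/profile.zip"

# Split into 45 KiB pieces; split appends aa, ab, ac, ... to the prefix
split -b 45k "$workdir/profile.zip" "$workdir/systemprofile"
ls "$workdir"/systemprofile*

# The receiver concatenates the pieces; the shell glob sorts them by suffix
cat "$workdir"/systemprofile* > "$workdir/rebuilt.zip"

# cmp exits 0 only when the two files are byte-identical
cmp "$workdir/profile.zip" "$workdir/rebuilt.zip" && echo "reassembled OK"
```

With the real file, the same `cat` of the pieces into a single .zip applies once all PM attachments have been downloaded into one directory.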

I will look into the '/var/log/php-fpm/www-error.log' issue and follow up.

Thanks,
Perry

Posted: **Wed Oct 27, 2021 11:59 am**

Perry, I have PM'd you all 3 files for the profile.
Thanks

Posted: **Wed Oct 27, 2021 3:09 pm**

Hello @isadmin

Thanks for sending over the System Profile. In review, we see issues where performance data processing goes into timeout and then reports a "sig error".

Here is a support article that walks you through the crucial points on how to optimize.

Change the "load_threshold" to 20 in the "/usr/local/nagios/etc/pnp/npcd.cfg" file:

Code: Select all

load_threshold = 20.0

and restart npcd:

Code: Select all

service npcd restart
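
If you prefer to script the edit rather than open the file in an editor, something along these lines should work. This is only a sketch: it operates on a temporary sample file rather than the real /usr/local/nagios/etc/pnp/npcd.cfg, and the starting values in the sample are assumed for illustration.

```shell
#!/bin/sh
# Sketch: set load_threshold in an npcd.cfg-style file, keeping a backup.
set -eu

# Temporary stand-in for /usr/local/nagios/etc/pnp/npcd.cfg
cfg=$(mktemp)
printf '%s\n' \
  '# sample npcd.cfg excerpt (values assumed for this demo)' \
  'load_threshold = 10.0' \
  'sleep_time = 15' > "$cfg"

# Keep a copy of the original so the change can be reverted
cp "$cfg" "$cfg.bak"

# Rewrite only the load_threshold line; every other line passes through
sed 's/^[[:space:]]*load_threshold[[:space:]]*=.*/load_threshold = 20.0/' \
  "$cfg.bak" > "$cfg"

# Show the resulting setting
grep '^load_threshold' "$cfg"
```

On the real system, point `cfg` at the npcd.cfg path above and restart npcd afterwards as described.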

I am not entirely clear why we see increased logging in '/var/log/php-fpm/www-error.log'. Perhaps an admin turned debug or verbose logging on. One option is to verify the logging configuration and then either disable the logging or implement log rotation on that file. This will point you to the config location:

Code: Select all

grep -Eir 'www-error' /etc/httpd/ /etc/php* -A 2

If changes are made, please restart the httpd (Apache) service.
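
If you go the log rotation route instead of disabling the log, a logrotate stanza along these lines is one possibility. This is a hedged sketch: the retention count and size limit are assumed values, and the stanza is written to a temporary file here rather than into /etc/logrotate.d/. Your php-fpm package may already ship its own rotation rule, so check for an existing one before adding another.

```shell
#!/bin/sh
# Sketch: write an example logrotate stanza for the php-fpm error log.
set -eu

# Temp file for the demo; a real stanza would live under /etc/logrotate.d/
stanza=$(mktemp)

cat > "$stanza" <<'EOF'
/var/log/php-fpm/www-error.log {
    daily
    rotate 7
    maxsize 100M
    compress
    missingok
    notifempty
    copytruncate
}
EOF

cat "$stanza"

# Optional dry run when logrotate is installed (prints actions, changes nothing):
#   logrotate -d "$stanza"
```

`copytruncate` is used so that the process writing the log does not need to be signalled to reopen it; the trade-off is that a few lines written during the copy can be lost.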

Thanks,
Perry

Posted: **Wed Oct 27, 2021 3:50 pm**

Thanks Perry, I made the changes and turned off logging in the cfg:
;php_admin_value[sendmail_path] = /usr/sbin/sendmail -t -i -f [email protected]
;php_flag[display_errors] = off
;php_admin_value[error_log] = /var/log/php-fpm/www-error.log
;php_admin_flag[log_errors] = on
;php_admin_value[memory_limit] = 128M
We will monitor and see. It usually crashes every few days or so.

Posted: **Thu Oct 28, 2021 8:26 am**

Hello @isadmin

Thanks for following up. Please let us know how things are looking in a couple of days.

Regards,
Perry

Monitoring Engines keeps dying randomly