Monitoring Engines keeps dying randomly
Posted: Tue Oct 26, 2021 10:14 am
by isadmin
We are running Nagios XI 5.8.6 and the monitoring engine keeps dying randomly. I'm not finding any errors in the nagios log.
I've done some digging in the forum, but every thread seems to be handled on a case-by-case basis.
Re: Monitoring Engines keeps dying randomly
Posted: Tue Oct 26, 2021 4:47 pm
by pbroste
Hello @isadmin,
Thanks for reaching out. We would like to take a look at the service journal and the System Profile so we can see what is going on.
Code:
journalctl -u nagios.service -o verbose > /tmp/journal.txt
To send us your system profile:
- Login to the Nagios XI GUI using a web browser.
- Click the "Admin" > "System Profile" Menu
- Click the "Download Profile" button
- Save the profile.zip file and share both '/tmp/journal.txt' and 'profile.zip' in a private message
Thanks,
Perry
Re: Monitoring Engines keeps dying randomly
Posted: Wed Oct 27, 2021 10:32 am
by isadmin
Perry
I sent the journal via PM, but the profile is over 20 MB and cannot be sent via PM. Is there another way I can get that file to you?
Re: Monitoring Engines keeps dying randomly
Posted: Wed Oct 27, 2021 10:35 am
by isadmin
Perry, on a separate note, we have to keep deleting /var/log/php-fpm/www-error.log because it keeps filling the drive.
Re: Monitoring Engines keeps dying randomly
Posted: Wed Oct 27, 2021 11:30 am
by pbroste
Hello @isadmin,
Thanks for following up. Please use the split command to break the profile into pieces and PM them separately.
Code:
split -b 45M profile.zip /tmp/systemprofile
Please send each 'systemprofile<incremented_alphabet>' file (split names the pieces systemprofileaa, systemprofileab, and so on) in a separate PM.
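For reference, here is a small sketch of how the pieces get named and how they can be joined back together on the receiving side. It runs against a throwaway file in a temporary directory rather than the real profile.zip, and the 1M chunk size is only for illustration.

```shell
#!/bin/sh
# Demo of split/reassemble on a throwaway file (a stand-in for profile.zip).
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Create a 2.5 MiB dummy file.
dd if=/dev/urandom of=profile.zip bs=1024 count=2560 2>/dev/null

# Split into 1 MiB pieces; split appends aa, ab, ac, ... to the prefix.
split -b 1M profile.zip systemprofile
ls systemprofile*

# Receiving side: concatenate the pieces in name order and compare checksums.
cat systemprofile* > rejoined.zip
cksum < profile.zip
cksum < rejoined.zip
```

If the two checksum lines match, the pieces were reassembled correctly. The temporary directory can be removed afterwards.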
I will look into the '/var/log/php-fpm/www-error.log' issue and follow up.
Thanks,
Perry
Re: Monitoring Engines keeps dying randomly
Posted: Wed Oct 27, 2021 11:59 am
by isadmin
Perry, I have PM'd you all 3 files for the profile.
Thanks
Re: Monitoring Engines keeps dying randomly
Posted: Wed Oct 27, 2021 3:09 pm
by pbroste
Hello @isadmin,
Thanks for sending over the System Profile. In review, we see issues where performance data is timing out, followed by a "sig error".
Here is a support article that walks you through the crucial points on how to optimize.
Change "load_threshold" to 20 in the "/usr/local/nagios/etc/pnp/npcd.cfg" file, and then restart npcd.
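A minimal sketch of the load_threshold edit, run against a scratch copy so nothing live is touched. The sample file contents and the `key = value` layout are assumptions about how npcd.cfg typically looks, so check your own file before adapting the `sed` expression. Restarting npcd is still needed after changing the real file.

```shell
#!/bin/sh
set -eu
# Scratch copy standing in for /usr/local/nagios/etc/pnp/npcd.cfg.
# The contents below are an assumed sample, not taken from a real system.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
log_level = 0
load_threshold = 0.0
sleep_time = 15
EOF

# Keep a backup, then rewrite only the load_threshold line from that backup.
cp "$cfg" "$cfg.bak"
sed 's/^[[:space:]]*load_threshold[[:space:]]*=.*/load_threshold = 20/' "$cfg.bak" > "$cfg"

# Show the resulting setting.
grep '^load_threshold' "$cfg"
```

Writing the new file from the backup keeps the original around in case the change needs to be reverted.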
I am not entirely clear why we see increased logging in '/var/log/php-fpm/www-error.log'. Perhaps an admin turned debug or verbose logging on. One option is to verify the logging configuration and then either disable the logging or set up log rotation for the file. This will point you to the config location:
Code:
grep -Eir 'www-error' /etc/httpd/ /etc/php* -A 2
If changes are made, please restart the httpd (Apache) service; if the change is in a php-fpm pool file, restart php-fpm as well.
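Before turning logging off, it can help to see which messages are actually filling the file. The sketch below counts repeated messages in a sample log. The sample lines are invented for illustration, and the leading `[timestamp]` prefix is an assumption about the log format, so adjust the `sed` pattern if your entries look different.

```shell
#!/bin/sh
set -eu
# Invented sample standing in for /var/log/php-fpm/www-error.log.
log=$(mktemp)
cat > "$log" <<'EOF'
[27-Oct-2021 10:30:01 UTC] PHP Warning:  example warning A in /path/a.php on line 10
[27-Oct-2021 10:30:02 UTC] PHP Warning:  example warning A in /path/a.php on line 10
[27-Oct-2021 10:30:03 UTC] PHP Notice:  example notice B in /path/b.php on line 5
[27-Oct-2021 10:30:04 UTC] PHP Warning:  example warning A in /path/a.php on line 10
EOF

# Strip the leading [timestamp], then count identical messages,
# most frequent first, and keep the top five.
top=$(sed 's/^\[[^]]*] //' "$log" | sort | uniq -c | sort -rn | head -n 5)
printf '%s\n' "$top"
```

Pointing `log` at the real file gives a quick view of the noisiest message, which is usually a better lead than the file size alone.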
Thanks,
Perry
Re: Monitoring Engines keeps dying randomly
Posted: Wed Oct 27, 2021 3:50 pm
by isadmin
Thanks Perry, I made the changes and turned off logging in the cfg:
;php_admin_value[sendmail_path] = /usr/sbin/sendmail -t -i -f [email protected]
;php_flag[display_errors] = off
;php_admin_value[error_log] = /var/log/php-fpm/www-error.log
;php_admin_flag[log_errors] = on
;php_admin_value[memory_limit] = 128M
We will monitor and see. It usually crashes every few days or so.
Re: Monitoring Engines keeps dying randomly
Posted: Thu Oct 28, 2021 8:26 am
by pbroste
Hello @isadmin,
Thanks for following up. Please let us know how things are looking in a couple of days.
Regards,
Perry