
Issues on scheduled reports

Posted: Wed May 23, 2018 7:06 am
by lvaillant
Hello,

I am trying to schedule monthly reports for some of our ecosystems (such as ERP, domain controllers, messaging servers...) to track availability.

When generating the reports directly ("last month" period) in the Web GUI, they display correctly after a long computation (around 6 to 8 minutes).
During this time, the browser is stuck and the server is slow to answer other requests... But it works.
When sending the reports by email, the behavior is the same (browser stuck).

Then I scheduled the same reports to receive them at night during the first days of each month.
I attached separate host and service CSV files, but no PDF file.

The cron entries are created as expected and launched.
I also found the related calls to availability.php in Apache logs (HTTP code 200) at the scheduled time.

But some reports are incomplete.

The incomplete ones contain CSV files with lines like these:

Code: Select all

host,service,ok %,warning %,unknown %,critical %
    <p><pre>SQL Error [nagiosxi] : MySQL server has gone away</pre></p>
    <p><pre>SQL Error [nagiosxi] : MySQL server has gone away</pre></p>
    ...
(Same string in PDF files if generated)

MariaDB:
  • wait_timeout = 300
  • max_allowed_packet = 1048576

Code: Select all

-------- Performance Metrics -----------------------------------------------------------------------
[--] Up for: 58d 0h 42m 14s (20B q [4K qps], 11M conn, TX: 12399G, RX: 9365G)
[--] Reads / Writes: 4% / 96%
[--] Binary logging is disabled
[--] Physical Memory     : 62.3G
[--] Max MySQL memory    : 3.2G
[--] Other process memory: 660.6M
[--] Total buffers: 912.0M global + 4.7M per thread (500 max threads)
[--] P_S Max memory usage: 0B
[--] Galera GCache Max memory usage: 0B
[OK] Maximum reached memory usage: 2.5G (4.00% of installed RAM)
[OK] Maximum possible memory usage: 3.2G (5.08% of installed RAM)
[OK] Overall possible memory usage with other process is compatible with memory available
[OK] Slow queries: 0% (353/20B)
[OK] Highest usage of available connections: 70% (352/500)
[OK] Aborted connections: 0.19%  (21331/11344995)
[OK] Query cache is disabled by default due to mutex contention on multiprocessor machines.
[!!] Sorts requiring temporary tables: 23% (5M temp sorts / 24M sorts)
[!!] Joins performed without indexes: 676053
[OK] Temporary tables created on disk: 19% (3M on disk / 17M total)
[OK] Thread cache hit rate: 73% (3M created / 11M connections)
[!!] Table cache hit rate: 0% (309 open / 141K opened)
[OK] Open file limit used: 17% (436/2K)
[OK] Table locks acquired immediately: 99% (20B immediate / 20B locks)
I don't see any message about ndo2db issues in system logs.

Do you have any ideas on how to debug this?
Thank you

Nagios XI 5.4.13 - 2251 Hosts, 13440 related services
Dedicated physical server / 2x20 Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz / 64GB RAM
RHEL 7.4 / MariaDB 5.5.56-2 /Apache HTTPd 2.4.6-67

Re: Issues on scheduled reports

Posted: Wed May 23, 2018 2:31 pm
by cdienger
Have the values in php.ini been increased per https://support.nagios.com/kb/article/n ... e-611.html ? If not, follow the recommendations in the article to do so. If they have already been increased, try doubling the current values and test again.

Re: Issues on scheduled reports

Posted: Mon May 28, 2018 2:59 am
by lvaillant
Hello.

Here are the current settings in /etc/php.ini:

Code: Select all

max_input_time = 60
max_execution_time = 90
memory_limit = 1024M
; max_input_vars = 1000
The main concern is that the reports are complete when computed "live", but not when they are scheduled.
I see no explicit error messages in the Nagios log files.

As the reports are generated in more than 5 minutes, does it mean I have to set up max_execution_time > 300? (~600)

As said in the official PHP documentation, max_execution_time=0 when running PHP from the cmd line (crontab).

Thank you.

Re: Issues on scheduled reports

Posted: Tue May 29, 2018 8:35 am
by scottwilkerson
lvaillant wrote:As the reports are generated in more than 5 minutes, does it mean I have to set up max_execution_time > 300? (~600)
This is likely correct
lvaillant wrote:As said in the official PHP documentation, max_execution_time=0 when running PHP from the cmd line (crontab).
This would be true, except that when scheduled reporting happens, it actually loads the page through Apache and then converts it into a PDF with a bit of magic. So while it is a PHP cron job that starts the process, that cron job is in fact just loading a URL, exactly as if it were done through the UI.
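
Since the scheduled job goes through Apache, the limit that matters is the one in the web server's php.ini. Here is a minimal sketch for reading the effective value of a directive from php.ini text. The helper name `effective_ini_value` is hypothetical, the sample reuses the settings posted earlier in this thread, and the sketch assumes that a later assignment overrides an earlier one.

```python
def effective_ini_value(ini_text, key):
    """Return the last uncommented value assigned to `key`, or None."""
    value = None
    for raw in ini_text.splitlines():
        line = raw.strip()
        # Lines starting with ';' are comments in php.ini; skip them,
        # blank lines, and anything without an assignment.
        if not line or line.startswith(";") or "=" not in line:
            continue
        name, _, rhs = line.partition("=")
        if name.strip() == key:
            # Drop any trailing inline comment after the value.
            value = rhs.split(";", 1)[0].strip()
    return value


# Settings as posted earlier in the thread.
sample = """\
max_input_time = 60
max_execution_time = 90
memory_limit = 1024M
; max_input_vars = 1000
"""

limit = int(effective_ini_value(sample, "max_execution_time"))
print(limit)            # 90
print(limit < 6 * 60)   # True: a 6-minute report runs longer than this limit
```

A commented-out directive such as `; max_input_vars = 1000` returns `None` here, meaning PHP falls back to its built-in default for it.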

Re: Issues on scheduled reports

Posted: Fri Jun 01, 2018 3:56 am
by lvaillant
Hi,

Thank you for the details.
I will try this solution early next week and keep you informed.

But I don't understand why the same reports are OK when generated directly through Apache.

Re: Issues on scheduled reports

Posted: Fri Jun 01, 2018 6:56 am
by lvaillant
Tested at noon...

Code: Select all

[root@hq-nagios-xi01 ~]# grep max_exec /etc/php.ini
;max_execution_time = 30
;max_execution_time = 90
max_execution_time = 600
HTTPd daemon has been restarted.

Nothing changed.

The Host_Availability.csv file is ok, but the Service_Availability.csv file does not contain the expected data:

Code: Select all

host,service,ok %,warning %,unknown %,critical %
    <p><pre>SQL Error [nagiosxi] : MySQL server has gone away</pre></p>
    <p><pre>SQL Error [nagiosxi] : MySQL server has gone away</pre></p>
    <p><pre>SQL Error [nagiosxi] : MySQL server has gone away</pre></p>
...
    <p><pre>SQL Error [nagiosxi] : MySQL server has gone away</pre></p>
,"AVERAGE",0,0,0,0
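
An incomplete report like the one above can be flagged automatically before it is trusted. The following is only a sketch: the function name `summarize_report` and the choice of "SQL Error" as the marker string are assumptions based on the lines shown in this thread, not part of Nagios XI.

```python
ERROR_MARKER = "SQL Error"  # assumed marker, taken from the lines shown above


def summarize_report(csv_text):
    """Count data rows and embedded error lines in an availability CSV."""
    lines = [ln for ln in csv_text.splitlines() if ln.strip()]
    body = lines[1:]  # skip the header row
    errors = sum(1 for ln in body if ERROR_MARKER in ln)
    return {"rows": len(body) - errors, "errors": errors, "complete": errors == 0}


sample = """\
host,service,ok %,warning %,unknown %,critical %
    <p><pre>SQL Error [nagiosxi] : MySQL server has gone away</pre></p>
    <p><pre>SQL Error [nagiosxi] : MySQL server has gone away</pre></p>
,"AVERAGE",0,0,0,0
"""

print(summarize_report(sample))
# {'rows': 1, 'errors': 2, 'complete': False}
```

The only non-error row in the sample is the trailing "AVERAGE" line, so a report with `complete` set to `False` and a single row contains no usable service data.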

Re: Issues on scheduled reports

Posted: Fri Jun 01, 2018 6:58 am
by scottwilkerson
Thanks, let us know how it turns out

Re: Issues on scheduled reports

Posted: Fri Jun 01, 2018 7:13 am
by lvaillant
See my previous message: still the same issue.

Re: Issues on scheduled reports

Posted: Fri Jun 01, 2018 7:15 am
by scottwilkerson

Code: Select all

MySQL server has gone away
This is strange. Do you have a limit on MySQL connection times, or did someone restart the MySQL service while you were running the test?

Re: Issues on scheduled reports

Posted: Fri Jun 01, 2018 7:22 am
by lvaillant
RHEL 7.4 / MariaDB 5.5.56-2 /Apache HTTPd 2.4.6-67

MariaDB:
  • wait_timeout = 300
  • connect_timeout = 10
  • max_allowed_packet = 1048576
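
As a rough screen, the time limits mentioned in this thread can be compared with the 6 to 8 minutes a report takes in the GUI. This is a sketch, not a diagnosis: `limits_below` is a hypothetical helper, and `wait_timeout` only counts idle time on a connection, so it matters only if the report holds a connection open without querying it for that long. `connect_timeout` is left out because it governs the connection handshake rather than a running report.

```python
def limits_below(duration_s, limits):
    """Return the names of limits (in seconds) shorter than the duration."""
    return sorted(name for name, seconds in limits.items() if seconds < duration_s)


limits = {
    "mariadb wait_timeout": 300,     # from the MariaDB settings listed above
    "php max_execution_time": 600,   # value set in /etc/php.ini earlier in the thread
}

for minutes in (6, 8):
    print(minutes, limits_below(minutes * 60, limits))
# 6 ['mariadb wait_timeout']
# 8 ['mariadb wait_timeout']
```

With `max_execution_time` raised to 600, `wait_timeout` is the only listed limit shorter than the observed report duration.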

Code: Select all

-------- Performance Metrics -----------------------------------------------------------------------
[--] Up for: 67d 2h 28m 12s (24B q [4K qps], 13M conn, TX: 14482G, RX: 10925G)
[--] Reads / Writes: 4% / 96%
[--] Binary logging is disabled
[--] Physical Memory     : 62.3G
[--] Max MySQL memory    : 3.2G
[--] Other process memory: 680.3M
[--] Total buffers: 912.0M global + 4.7M per thread (500 max threads)
[--] P_S Max memory usage: 0B
[--] Galera GCache Max memory usage: 0B
[OK] Maximum reached memory usage: 2.5G (4.00% of installed RAM)
[OK] Maximum possible memory usage: 3.2G (5.08% of installed RAM)
[OK] Overall possible memory usage with other process is compatible with memory available
[OK] Slow queries: 0% (450/24B)
[OK] Highest usage of available connections: 70% (352/500)
[OK] Aborted connections: 0.16%  (21331/13157342)
[OK] Query cache is disabled by default due to mutex contention on multiprocessor machines.
[!!] Sorts requiring temporary tables: 23% (6M temp sorts / 29M sorts)
[!!] Joins performed without indexes: 1126419
[OK] Temporary tables created on disk: 19% (3M on disk / 19M total)
[OK] Thread cache hit rate: 73% (3M created / 13M connections)
[!!] Table cache hit rate: 0% (159 open / 169K opened)
[OK] Open file limit used: 7% (191/2K)
[OK] Table locks acquired immediately: 99% (23B immediate / 23B locks)

But once again, it works when done directly in the web GUI.
And one of the reports was generated in only 1min and 11s (scheduled at 01:30PM, email sent at 01:36:11PM).