Issues on scheduled reports

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
lvaillant
Posts: 57
Joined: Mon Jun 06, 2016 2:47 am
Location: Paris, France

Issues on scheduled reports

Post by lvaillant »

Hello,

I am trying to schedule monthly reports for some of our ecosystems (such as ERP, domain controllers, messaging servers...) to track availability.

When I generate the reports ("last month" period) directly in the web GUI, they display correctly, but only after a long computation (around 6 to 8 minutes).
During this time the browser is stuck, and the server is slow to answer other requests... But it works.
When sending the reports by email, I see the same behavior (browser stuck).

Then I scheduled the same reports to run at night during the first days of each month.
I attach separate host and service CSV files, but no PDF file.

The cron entries are created as expected and launched.
I also found the related calls to availability.php in Apache logs (HTTP code 200) at the scheduled time.

But some reports are incomplete.

The incomplete ones contain CSV files with lines like these:

Code: Select all

host,service,ok %,warning %,unknown %,critical %
    <p><pre>SQL Error [nagiosxi] : MySQL server has gone away</pre></p>
    <p><pre>SQL Error [nagiosxi] : MySQL server has gone away</pre></p>
    ...
(Same string in PDF files if generated)

MariaDB:
  • wait_timeout = 300
  • max_allowed_packet = 1048576

Code: Select all

-------- Performance Metrics -----------------------------------------------------------------------
[--] Up for: 58d 0h 42m 14s (20B q [4K qps], 11M conn, TX: 12399G, RX: 9365G)
[--] Reads / Writes: 4% / 96%
[--] Binary logging is disabled
[--] Physical Memory     : 62.3G
[--] Max MySQL memory    : 3.2G
[--] Other process memory: 660.6M
[--] Total buffers: 912.0M global + 4.7M per thread (500 max threads)
[--] P_S Max memory usage: 0B
[--] Galera GCache Max memory usage: 0B
[OK] Maximum reached memory usage: 2.5G (4.00% of installed RAM)
[OK] Maximum possible memory usage: 3.2G (5.08% of installed RAM)
[OK] Overall possible memory usage with other process is compatible with memory available
[OK] Slow queries: 0% (353/20B)
[OK] Highest usage of available connections: 70% (352/500)
[OK] Aborted connections: 0.19%  (21331/11344995)
[OK] Query cache is disabled by default due to mutex contention on multiprocessor machines.
[!!] Sorts requiring temporary tables: 23% (5M temp sorts / 24M sorts)
[!!] Joins performed without indexes: 676053
[OK] Temporary tables created on disk: 19% (3M on disk / 17M total)
[OK] Thread cache hit rate: 73% (3M created / 11M connections)
[!!] Table cache hit rate: 0% (309 open / 141K opened)
[OK] Open file limit used: 17% (436/2K)
[OK] Table locks acquired immediately: 99% (20B immediate / 20B locks)
I don't see any message about ndo2db issues in system logs.

Do you have any ideas on how to debug this?
Thank you

Nagios XI 5.4.13 - 2251 Hosts, 13440 related services
Dedicated physical server / 2x20 Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz / 64GB RAM
RHEL 7.4 / MariaDB 5.5.56-2 /Apache HTTPd 2.4.6-67
Loïc VAILLANT
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Issues on scheduled reports

Post by cdienger »

Have the values in php.ini been increased per https://support.nagios.com/kb/article/n ... e-611.html ? If not, follow the recommendations in the article to do so. If they have already been increased, try doubling the current values and test again.

Re: Issues on scheduled reports

Post by lvaillant »

Hello.

Here are the current settings in /etc/php.ini:

Code: Select all

max_input_time = 60
max_execution_time = 90
memory_limit = 1024M
; max_input_vars = 1000
The main concern is that the reports are complete when I compute them "live", but not when they are scheduled.
I see no explicit error messages in the Nagios log files.

As the reports are generated in more than 5 minutes, does it mean I have to set up max_execution_time > 300? (~600)

As said in the official PHP documentation, max_execution_time=0 when running PHP from the cmd line (crontab).
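
To make the question concrete, here is a minimal sketch (not part of Nagios XI) that reads php.ini-style text, ignores commented-out directives, and checks whether max_execution_time would cover a report of a given length. The function names are hypothetical, the sample text is the settings block shown above, and the fallback of 30 seconds is PHP's default for this directive.

```python
def active_directives(ini_text: str) -> dict:
    """Return the uncommented `name = value` pairs from php.ini-style text."""
    settings = {}
    for raw in ini_text.splitlines():
        line = raw.strip()
        # Skip blank lines, ';' comments, and [section] headers.
        if not line or line.startswith(";") or line.startswith("["):
            continue
        name, sep, value = line.partition("=")
        if sep:
            settings[name.strip()] = value.strip()
    return settings


def covers_report(settings: dict, report_seconds: int) -> bool:
    """True if max_execution_time allows a request lasting report_seconds."""
    limit = int(settings.get("max_execution_time", "30"))
    # In PHP, a value of 0 means "no time limit".
    return limit == 0 or limit >= report_seconds


# The settings quoted in this post.
PHP_INI = """\
max_input_time = 60
max_execution_time = 90
memory_limit = 1024M
; max_input_vars = 1000
"""

settings = active_directives(PHP_INI)
print(settings["max_execution_time"])    # 90
print(covers_report(settings, 8 * 60))   # False: 90 s is less than 8 minutes
```

This compares against wall-clock time only. On Linux, PHP counts just the script's own execution time against max_execution_time, not time spent waiting on database queries, so the wall-clock figure is an upper bound on what the limit actually sees.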

Thank you.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Issues on scheduled reports

Post by scottwilkerson »

lvaillant wrote:As the reports are generated in more than 5 minutes, does it mean I have to set up max_execution_time > 300? (~600)
This is likely correct
lvaillant wrote:As said in the official PHP documentation, max_execution_time=0 when running PHP from the cmd line (crontab).
This would be true, except that when scheduled reporting runs, it actually loads the page through Apache and then converts it into a PDF with a bit of magic. So while a PHP cron job starts the process, that cron job is in fact just loading a URL, just as if it were done through the UI.
Former Nagios employee

Re: Issues on scheduled reports

Post by lvaillant »

Hi,

Thank you for the details.
I will try this solution early next week and keep you informed.

But I don't understand why the same reports are fine when generated directly through Apache.

Re: Issues on scheduled reports

Post by lvaillant »

Tested at noon...

Code: Select all

[root@hq-nagios-xi01 ~]# grep max_exec /etc/php.ini
;max_execution_time = 30
;max_execution_time = 90
max_execution_time = 600
HTTPd daemon has been restarted.

Nothing changed.

The Host_Availability.csv file is ok, but the Service_Availability.csv file does not contain the expected data:

Code: Select all

host,service,ok %,warning %,unknown %,critical %
    <p><pre>SQL Error [nagiosxi] : MySQL server has gone away</pre></p>
    <p><pre>SQL Error [nagiosxi] : MySQL server has gone away</pre></p>
    <p><pre>SQL Error [nagiosxi] : MySQL server has gone away</pre></p>
...
    <p><pre>SQL Error [nagiosxi] : MySQL server has gone away</pre></p>
,"AVERAGE",0,0,0,0
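
A quick way to spot an incomplete report without opening it by hand is to count the error lines. This is only a sketch under the assumption that every failed row contains the string "SQL Error", as in the excerpt above; the function names are hypothetical and the sample is a shortened copy of that excerpt.

```python
ERROR_MARKER = "SQL Error"


def count_sql_errors(lines) -> int:
    """Count the lines of a report that carry an SQL error instead of data."""
    return sum(1 for line in lines if ERROR_MARKER in line)


def is_complete(lines) -> bool:
    """A report is treated as complete when no line carries an SQL error."""
    return count_sql_errors(lines) == 0


# Shortened copy of the Service_Availability.csv excerpt above.
SAMPLE = """\
host,service,ok %,warning %,unknown %,critical %
    <p><pre>SQL Error [nagiosxi] : MySQL server has gone away</pre></p>
    <p><pre>SQL Error [nagiosxi] : MySQL server has gone away</pre></p>
,"AVERAGE",0,0,0,0
"""

sample_lines = SAMPLE.splitlines()
print(count_sql_errors(sample_lines))   # 2
print(is_complete(sample_lines))        # False
```

Both functions accept any iterable of lines, so an open file object (for example the attached Service_Availability.csv) can be passed in directly.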

Re: Issues on scheduled reports

Post by scottwilkerson »

Thanks, let us know how it turns out

Re: Issues on scheduled reports

Post by lvaillant »

See my previous message.
Still the same issue.

Re: Issues on scheduled reports

Post by scottwilkerson »

Code: Select all

MySQL server has gone away
This is strange. Do you have a limit on MySQL connection times, or did someone restart the MySQL service while you were running the test?

Re: Issues on scheduled reports

Post by lvaillant »

RHEL 7.4 / MariaDB 5.5.56-2 /Apache HTTPd 2.4.6-67

MariaDB:
  • wait_timeout = 300
  • connect_timeout = 10
  • max_allowed_packet = 1048576
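
For reference, wait_timeout is the number of seconds the server waits for activity on a non-interactive connection before closing it, so it only matters if a connection sits idle that long. The sketch below illustrates that rule with hypothetical query timestamps. Whether the report code really leaves its nagiosxi connection idle between queries is an assumption that nothing in this thread confirms, and the function name is made up for the example.

```python
def first_stale_use(query_times, wait_timeout):
    """Return the first query time that follows an idle gap longer than
    wait_timeout (all values in seconds), or None if there is no such gap."""
    for previous, current in zip(query_times, query_times[1:]):
        if current - previous > wait_timeout:
            return current
    return None


WAIT_TIMEOUT = 300  # seconds, from the MariaDB settings above

# Hypothetical timeline: one query at the start, the next one 6 minutes later.
print(first_stale_use([0, 360], WAIT_TIMEOUT))            # 360

# Hypothetical timeline with a query every 2 minutes: never idle long enough.
print(first_stale_use([0, 120, 240, 360], WAIT_TIMEOUT))  # None
```

In the first timeline the server would already have dropped the connection by the time of the second query, which is the situation that produces "MySQL server has gone away" on the client side.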

Code: Select all

-------- Performance Metrics -----------------------------------------------------------------------
[--] Up for: 67d 2h 28m 12s (24B q [4K qps], 13M conn, TX: 14482G, RX: 10925G)
[--] Reads / Writes: 4% / 96%
[--] Binary logging is disabled
[--] Physical Memory     : 62.3G
[--] Max MySQL memory    : 3.2G
[--] Other process memory: 680.3M
[--] Total buffers: 912.0M global + 4.7M per thread (500 max threads)
[--] P_S Max memory usage: 0B
[--] Galera GCache Max memory usage: 0B
[OK] Maximum reached memory usage: 2.5G (4.00% of installed RAM)
[OK] Maximum possible memory usage: 3.2G (5.08% of installed RAM)
[OK] Overall possible memory usage with other process is compatible with memory available
[OK] Slow queries: 0% (450/24B)
[OK] Highest usage of available connections: 70% (352/500)
[OK] Aborted connections: 0.16%  (21331/13157342)
[OK] Query cache is disabled by default due to mutex contention on multiprocessor machines.
[!!] Sorts requiring temporary tables: 23% (6M temp sorts / 29M sorts)
[!!] Joins performed without indexes: 1126419
[OK] Temporary tables created on disk: 19% (3M on disk / 19M total)
[OK] Thread cache hit rate: 73% (3M created / 13M connections)
[!!] Table cache hit rate: 0% (159 open / 169K opened)
[OK] Open file limit used: 7% (191/2K)
[OK] Table locks acquired immediately: 99% (23B immediate / 23B locks)

But once again, it works when done directly in the web GUI.
And one of the reports was generated in only 1min and 11s (scheduled at 01:30PM, email sent at 01:36:11PM).