Performance graph gaps
Posted: Fri Oct 09, 2020 10:27 am
Hello,
We are running into some issues with gaps in our performance graphs.
All services are showing green, but for hours at a time chunks of data are missing from our performance graphs, typically outside regular working hours. We have followed the instructions found in the documentation and made a few changes but still have not pinned down what is going on.
We took the following actions:
1. Upped the verbosity of both NPCD and perfdata
2. Confirmed the nagios account has not expired
3. Noted errors re: load threshold, adjusted the load_threshold of NPCD to 20, and restarted NPCD
Here are the spooled file counts. They don't reach the 20k number cited in the article.
Code:
$ ls /usr/local/nagios/var/spool/perfdata/ | wc -l
2
$ ls /usr/local/nagios/var/spool/xidpe/ | wc -l
4707
In perfdata.log, entries stop being written exactly when the missing data starts in the GUI.
From npcd.log, we are seeing the following for every check:
Code:
[10-09-2020 11:17:32] NPCD: ThreadCounter 0/5 File is 1599774829.perfdata.service-PID-15586
[10-09-2020 11:17:32] NPCD: File '1599774829.perfdata.service-PID-15586' is an already in process PNP file. Leaving it untouched.
[10-09-2020 11:17:32] NPCD: DEBUG: load 1.970000/20.000000
[10-09-2020 11:17:32] NPCD: ThreadCounter 0/5 File is 1600195788.perfdata.host-PID-20283
[10-09-2020 11:17:32] NPCD: File '1600195788.perfdata.host-PID-20283' is an already in process PNP file. Leaving it untouched.
Additionally, we saw some of the following errors in messages.log:
Code:
Oct 9 06:36:57 dltfanxi1 nagios: Warning: fork() in my_system_r() failed for command "/bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/xidpe/1602239817.perfdata.host" - errno: Cannot allocate memory
Oct 9 06:37:11 dltfanxi1 nagios: Warning: fork() in my_system_r() failed for command "/bin/mv /usr/local/nagios/var/service-perfdata /usr/local/nagios/var/spool/xidpe/1602239831.perfdata.service" - errno: Cannot allocate memory
Oct 9 06:37:12 dltfanxi1 nagios: Warning: fork() in my_system_r() failed for command "/bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/xidpe/1602239831.perfdata.host" - errno: Cannot allocate memory
Oct 9 06:37:27 dltfanxi1 nagios: Warning: fork() in my_system_r() failed for command "/bin/mv /usr/local/nagios/var/service-perfdata /usr/local/nagios/var/spool/xidpe/1602239847.perfdata.service" - errno: Cannot allocate memory
Oct 9 06:37:27 dltfanxi1 nagios: Warning: fork() in my_system_r() failed for command "/bin/mv /usr/local/nagios/var/host-perfdata /usr/local/nagios/var/spool/xidpe/1602239847.perfdata.host" - errno: Cannot allocate memory
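Since fork() failing with "Cannot allocate memory" is decided by the guest kernel, one way to look at memory from inside the VM is sketched below. This is only a sketch: the helper names meminfo_kb and commit_headroom_kb are made up for illustration, and whether commit accounting is actually what makes fork() fail here is an assumption, not something the logs confirm.

```shell
#!/bin/sh
# Print one field (in kB) from /proc/meminfo-style input on stdin.
# Hypothetical helper name; the field names are standard Linux ones.
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }'
}

# Print CommitLimit minus Committed_AS in kB.
# Note: CommitLimit is only enforced when vm.overcommit_memory is 2,
# so a negative value is a hint to investigate, not proof of the cause.
commit_headroom_kb() {
    awk '$1 == "CommitLimit:"  { limit = $2 }
         $1 == "Committed_AS:" { used = $2 }
         END { print limit - used }'
}

# Only query the live system when /proc/meminfo is readable (Linux).
if [ -r /proc/meminfo ]; then
    echo "MemAvailable kB:    $(meminfo_kb MemAvailable < /proc/meminfo)"
    echo "Commit headroom kB: $(commit_headroom_kb < /proc/meminfo)"
    echo "overcommit_memory:  $(cat /proc/sys/vm/overcommit_memory 2>/dev/null)"
fi
```

Running this around the times the gaps start (for example from cron, appending to a file) would show whether guest memory is tight when the fork() warnings appear.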
Despite these "Cannot allocate memory" errors, we have confirmed that memory on the ESXi host was not under stress. For reference, the VM has 4 CPUs and 8 GB RAM.
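One more note on the npcd.log excerpt: the two files it keeps skipping start with a number that looks like a Unix epoch. Assuming that reading is right, the sketch below estimates how old each spool file is. The helper names spool_epoch and spool_age_days are hypothetical, and the spool path is taken from the ls command earlier in this post.

```shell
#!/bin/sh
# Extract the leading prefix (everything before the first dot) of a spool
# file name. Treating it as a Unix epoch is an assumption.
spool_epoch() {
    printf '%s\n' "${1%%.*}"
}

# Whole days between the file's epoch prefix ($1 is the file name)
# and a reference epoch ($2).
spool_age_days() {
    echo $(( ($2 - $(spool_epoch "$1")) / 86400 ))
}

# Report the age of every file in the NPCD spool directory, if it exists.
spool_dir=/usr/local/nagios/var/spool/perfdata
if [ -d "$spool_dir" ]; then
    now=$(date +%s)
    for f in "$spool_dir"/*; do
        [ -e "$f" ] || continue
        name=${f##*/}
        # Skip names whose prefix is not purely numeric.
        case $(spool_epoch "$name") in
            ''|*[!0-9]*) continue ;;
        esac
        echo "$name: $(spool_age_days "$name" "$now") days old"
    done
fi
```

Measured against 1602239817 (the epoch embedded in the first messages.log line above), the two file names from npcd.log come out at 28 and 23 days old, which would suggest they are leftovers rather than files currently being processed.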
Any ideas on where we should be looking to resolve this?
Thanks,