Problem with graph generation
Posted: Fri Apr 13, 2018 9:07 am
by ctretelea
Hi all,
We are currently experiencing a problem with some alerts created on our Nagios XI server. The main problem is that the perfdata captured on the server is not generating graphs. The alerts are configured between a Windows Server 2016 client and the Linux Nagios XI server. The connection between the servers was reviewed and works fine, and the check was executed manually from the Core Config Manager: it reaches the server and generates the data.
We also ran the Nagios script for repairing the databases, but this did not solve the problem. We need your help to find what the problem could be, because this is happening right now in our Nagios production environment.
Thanks for any help
Re: Problem with graph generation
Posted: Fri Apr 13, 2018 10:35 am
by lmiltchev
Missing or blank graphs can be caused by many factors. Let's get some more information that can help us identify the issue.
- How did you create this service - was it created by a wizard, or is it from a custom plugin/script?
- Have you changed the check command after you've created the service?
- Have you waited long enough for the RRD and XML files to get created (usually 15-20 min)?
- Do you see the RRD and XML file in "/usr/local/nagios/share/perfdata/<your host>/" directory?
- Have you checked to see if npcd is running? Maybe, it stopped because of a high load on the system?
- Have you increased the verbosity of the "/usr/local/nagios/var/npcd.log" and the "/usr/local/nagios/var/perfdata.log" and checked both logs for clues?
- Have you checked if performance data files are not piling up in the "/usr/local/nagios/var/spool/perfdata" and "/usr/local/nagios/var/spool/xidpe" directories?
- Have you tried to implement a ramdisk?
I would recommend going through all of the steps, outlined in our "Nagios XI - Performance Graph Problems" KB article here:
https://support.nagios.com/kb/article/n ... ems-9.html
In the majority of cases, this is enough to identify the issue and take action on it.
Hope this helps. Thank you!
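For the RRD/XML question in the checklist above, here is a minimal shell sketch. It assumes that each graphed service ends up with one .rrd and one .xml file sharing the same base name in the host's perfdata directory, and it flags any service that has only one of the two. HOST_DIR is a hypothetical variable standing in for the "<your host>" directory. A service with neither file will not show up here.

```shell
#!/bin/sh
# Sketch only. Assumption: every graphed service has a matching
# <name>.rrd and <name>.xml in the host's perfdata directory.

# Print the base name of every service that has only one of the two files.
unpaired_perfdata() {
    dir=$1
    for f in "$dir"/*.rrd "$dir"/*.xml; do
        [ -e "$f" ] || continue            # skip glob patterns that matched nothing
        base=${f%.*}                       # strip the .rrd / .xml extension
        if [ ! -e "$base.rrd" ] || [ ! -e "$base.xml" ]; then
            basename "$base"
        fi
    done | sort -u
}

# HOST_DIR is a hypothetical placeholder, for example
# HOST_DIR="/usr/local/nagios/share/perfdata/<your host>"
if [ -n "${HOST_DIR:-}" ] && [ -d "$HOST_DIR" ]; then
    unpaired_perfdata "$HOST_DIR"
fi
```

An empty result only means that no pair is half-created; it does not prove that every service has graphs.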
Re: Problem with graph generation
Posted: Fri Apr 13, 2018 2:10 pm
by ctretelea
- How did you create this service - is it created by a wizard or it is from a custom plugin/script?
-by a wizard
- Have you changed the check command after you've created the service?
- no I changed
- Have you waited long enough for the RRD and XML files to get created (usually 15-20 min)?
-yes, 24 hours
- Have you checked to see if npcd is running? Maybe, it stopped because of a high load on the system?
-yes, it is running
- Have you increased the verbosity of the "/usr/local/nagios/var/npcd.log" and the "/usr/local/nagios/var/perfdata.log" and checked both logs for clues?
-no
- Have you tried to implement a ramdisk?
-yes
- Have you checked if performance data files are not piling up in the "/usr/local/nagios/var/spool/perfdata" and "/usr/local/nagios/var/spool/xidpe" directories?
for /usr/local/nagios/var/spool/perfdata - I have 13000 files, dating back to 19.02.2018
for /usr/local/nagios/var/spool/xidpe - just the files from the last run
- Do you see the RRD and XML file in "/usr/local/nagios/share/perfdata/<your host>/" directory?
-yes, but just the ping
[04-13-2018 13:40:26] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1523641203.perfdata.service'
[04-13-2018 13:40:55] NPCD: Caught Termination Signal - Astalavista... baby
[04-13-2018 13:40:55] NPCD: npcd Daemon (0.6.25) started with PID=5134
[04-13-2018 13:40:55] NPCD: Please have a look at 'npcd -V' to get license information
[04-13-2018 13:40:55] NPCD: HINT: load_threshold is enabled - ('10.000000')
[04-13-2018 13:40:55] NPCD: ERROR: Executed command exits with return code '13'
[04-13-2018 13:40:55] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1523641248.perfdata.service'
[04-13-2018 13:45:14] NPCD: ERROR: Executed command exits with return code '13'
[04-13-2018 13:45:14] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1523641503.perfdata.host'
[04-13-2018 13:45:14] NPCD: ERROR: Executed command exits with return code '13'
[04-13-2018 13:45:14] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1523641504.perfdata.service'
[04-13-2018 13:46:00] NPCD: ERROR: Executed command exits with return code '13'
[04-13-2018 13:46:00] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1523641548.perfdata.service'
[04-13-2018 13:50:18] NPCD: ERROR: Executed command exits with return code '13'
[04-13-2018 13:50:18] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1523641803.perfdata.service'
[04-13-2018 13:50:18] NPCD: ERROR: Executed command exits with return code '13'
[04-13-2018 13:50:18] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1523641803.perfdata.host'
Code:
DATATYPE::SERVICEPERFDATA TIMET::1523645709 HOSTNAME::******** SERVICEDESC::Drive C: Disk Usage SERVICEPERFDATA::'C:\ Used Space'=25.23Gb;40.00;47.50;0.00;50.00 SERVICECHECKCOMMAND::check_xi_service_nsclient!xxxxx!USEDDISKSPACE!-l C -w 80 -c 95 HOSTSTATE::UP HOSTSTATETYPE::HARD SERVICESTATE::OK SERVICESTATETYPE::HARD SERVICEOUTPUT::C:\ - total: 50.00 Gb - used: 25.23 Gb (50%) - free 24.77 Gb (50%) LONGSERVICEOUTPUT::
Re: Problem with graph generation
Posted: Fri Apr 13, 2018 3:40 pm
by lmiltchev
I have a similar check, but unlike yours, the 'C:\ Used Space' is not spread on two lines. I wonder if you have an extra carriage return, which produces invalid perfdata... Can you show us the actual check, run from the command line, along with the output of it?
Are you having issues with any other hosts/services, or is this the only perfdata problem that you are having?
Just to rule out some other potential issues, run the following commands and show the output:
Code:
grep ramdisk /usr/local/nagios/etc/nagios.cfg /usr/local/nrdp/server/config.inc.php /usr/local/nagiosxi/html/config.inc.php /usr/local/nagios/etc/pnp/npcd.cfg
ls /var/nagiosramdisk/spool/xidpe | wc -l
ls /var/nagiosramdisk/spool/perfdata/ | wc -l
ls /var/nagiosramdisk/spool/checkresults/ | wc -l
chage -l nagios
Also, show us the process-host-perfdata-file-bulk and process-service-perfdata-file-bulk command definitions.
Re: Problem with graph generation
Posted: Mon Apr 16, 2018 9:05 am
by ctretelea
Here are the answers to your questions:
Are you having issues with any other hosts/services or this is the only perfdata problem that you are having?
- Just for one host.
Here are the results of your commands:
grep:
grep ramdisk /usr/local/nagios/etc/nagios.cfg /usr/local/nrdp/server/config.inc.php /usr/local/nagiosxi/html/config.inc.php /usr/local/nagios/etc/pnp/npcd.cfg
/usr/local/nagios/etc/nagios.cfg:service_perfdata_file=/var/nagiosramdisk/service-perfdata
/usr/local/nagios/etc/nagios.cfg:host_perfdata_file=/var/nagiosramdisk/host-perfdata
/usr/local/nagios/etc/nagios.cfg:check_result_path=/var/nagiosramdisk/spool/checkresults
/usr/local/nagios/etc/nagios.cfg:object_cache_file=/var/nagiosramdisk/objects.cache
/usr/local/nagios/etc/nagios.cfg:status_file=/var/nagiosramdisk/status.dat
/usr/local/nagios/etc/nagios.cfg:temp_path=/var/nagiosramdisk/tmp
/usr/local/nrdp/server/config.inc.php:$cfg["check_results_dir"]="/var/nagiosramdisk/spool/checkresults";
/usr/local/nagiosxi/html/config.inc.php:$cfg['xidpe_dir'] = '/var/nagiosramdisk/spool/xidpe/';
/usr/local/nagiosxi/html/config.inc.php:$cfg['perfdata_spool'] = '/var/nagiosramdisk/spool/perfdata/';
/usr/local/nagios/etc/pnp/npcd.cfg:perfdata_spool_dir = /var/nagiosramdisk/spool/perfdata/
ls:
ls /var/nagiosramdisk/spool/xidpe | wc -l
0
ls /var/nagiosramdisk/spool/perfdata/ | wc -l
15847
ls /var/nagiosramdisk/spool/checkresults/ | wc -l
2
Chage:
chage -l nagios
Last password change : Feb 17, 2017
Password expires : never
Password inactive : never
Account expires : never
Minimum number of days between password change : 0
Maximum number of days between password change : 99999
Number of days of warning before password expires : 7
Here are the command definitions:
process-host-perfdata-file-bulk
define command {
command_name process-host-perfdata-file-bulk
command_line /bin/mv /var/nagiosramdisk/host-perfdata /var/nagiosramdisk/spool/xidpe/$TIMET$.perfdata.host
}
process-service-perfdata-file-bulk
define command {
command_name process-service-perfdata-file-bulk
command_line /bin/mv /var/nagiosramdisk/service-perfdata /var/nagiosramdisk/spool/xidpe/$TIMET$.perfdata.service
}
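The 15847 files counted in the perfdata spool above suggest a backlog. A small sketch for seeing how far back it goes, assuming the spool files keep the "<epoch>.perfdata.service" / "<epoch>.perfdata.host" naming visible in the NPCD log; spool_backlog is a made-up helper name.

```shell
#!/bin/sh
# Sketch only: spool_backlog is a hypothetical helper.
# Assumption: spool files are named <epoch>.perfdata.host or <epoch>.perfdata.service.

# Print the number of spool files and the name of the oldest one.
spool_backlog() {
    dir=$1
    count=$(find "$dir" -maxdepth 1 -type f -name '*.perfdata.*' | wc -l | tr -d ' ')
    oldest=$(find "$dir" -maxdepth 1 -type f -name '*.perfdata.*' -exec basename {} \; \
        | sort -n | head -n 1)
    echo "files=$count oldest=${oldest:-none}"
}

# Spool path taken from the grep output above; skipped if it does not exist.
if [ -d /var/nagiosramdisk/spool/perfdata ]; then
    spool_backlog /var/nagiosramdisk/spool/perfdata
fi
```

The leading number of the oldest file name is a Unix timestamp, so it shows roughly when processing started to fail.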
Re: Problem with graph generation
Posted: Mon Apr 16, 2018 9:40 am
by lmiltchev
The ramdisk directives look correct - I don't see any issues there. Can you show us the actual check, run from the command line, along with the output of it?
Re: Problem with graph generation
Posted: Mon Apr 16, 2018 10:15 am
by ctretelea
Hi lmiltchev,
Here are the check command results:
Code:
$ /usr/local/nagios/libexec/check_nt -H 10.60.5.200 -s "xxxxx" -p 12489 -v USEDDISKSPACE -l C -w 80 -c 95
C:\ - total: 50.00 Gb - used: 25.23 Gb (50%) - free 24.77 Gb (50%) | 'C:\ Used Space'=25.23Gb;40.00;47.50;0.00;50.00
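To make the perfdata part of that output easier to read, here is a rough sketch that splits it into its fields. It assumes a single metric after the pipe in the usual 'label'=value;warn;crit;min;max layout; parse_perfdata is a made-up helper name. The thresholds line up with the check options: -w 80 and -c 95 on a 50.00 Gb drive give 40.00 and 47.50.

```shell
#!/bin/sh
# Sketch only: parse_perfdata is a hypothetical helper.
# Assumption: one metric after the pipe, laid out as label=value;warn;crit;min;max.
parse_perfdata() {
    line=$1
    case $line in
        *'|'*) ;;                 # perfdata present
        *) return 1 ;;            # no pipe, nothing to parse
    esac
    perf=${line#*|}               # everything after the first pipe
    perf=${perf# }                # drop one leading space, if any
    label=${perf%%=*}             # text before the first '='
    fields=${perf#*=}             # value;warn;crit;min;max
    IFS=';' read -r value warn crit min max <<EOF
$fields
EOF
    printf 'label=%s value=%s warn=%s crit=%s min=%s max=%s\n' \
        "$label" "$value" "$warn" "$crit" "$min" "$max"
}

# Sample line copied from the check output above.
parse_perfdata "C:\\ - total: 50.00 Gb - used: 25.23 Gb (50%) - free 24.77 Gb (50%) | 'C:\\ Used Space'=25.23Gb;40.00;47.50;0.00;50.00"
```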
Re: Problem with graph generation
Posted: Mon Apr 16, 2018 11:20 am
by lmiltchev
The output of your check looks exactly like mine. We would need to get your profile, and try to recreate the issue in-house. Can you PM me (or anyone on the Nagios Support team) your profile (profile.zip)?
Admin > System Profile > Download Profile
Also, PM me the name of the host we are troubleshooting.
Do you have any special characters in the hostname or/and the NSClient++ password?
Run the following commands, and show the output:
Code:
uptime
service npcd status
ls -lad /usr/local/nagios/share/perfdata/<your host>
ls -la /usr/local/nagios/share/perfdata/<your host>
Re: Problem with graph generation
Posted: Mon Apr 16, 2018 3:36 pm
by lmiltchev
Let's fix the permissions of the files/directories under "perfdata". Run the following command to change the ownership to nagios.nagios (some of the folders are owned by apache.apache, so nagios cannot write to them).
Code:
chown nagios:nagios /usr/local/nagios/share/perfdata/*
Let us know if this helped.
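To see which entries are affected before or after the chown, something like the following sketch can help. not_owned_by is a made-up helper name; it lists everything under a directory that is not owned by the given user. Note that chown with the * glob and no -R only touches the entries directly under perfdata, so anything listed from deeper levels would need a separate look.

```shell
#!/bin/sh
# Sketch only: not_owned_by is a hypothetical helper.
# Print every file or directory under DIR that is not owned by USER.
not_owned_by() {
    user=$1
    dir=$2
    find "$dir" ! -user "$user" -print
}

# Path taken from the thread; skipped if it does not exist on this machine.
if [ -d /usr/local/nagios/share/perfdata ]; then
    not_owned_by nagios /usr/local/nagios/share/perfdata
fi
```

No output means every entry under the directory already belongs to that user.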
Re: Problem with graph generation
Posted: Tue Apr 17, 2018 8:09 am
by ctretelea
Hi,
You were right: that folder was owned by the apache user and group.
The case is now solved.
Thanks.