Graphs are not showing

Posted: Mon Feb 13, 2017 4:22 am
by redesgtt
Our graphs have not been showing since last night.

I think all the system services are running:

Code:

[root@nagiosxi var]# service npcd status
NPCD running (pid 20934).

Code:

[root@nagiosxi var]# service ndo2db status
ndo2db (pid 24797) is running...

Code:

[root@nagiosxi var]# service nagios status
nagios (pid 21301) is running...

Logs and debug output:
[root@nagiosxi var]# tail -25 npcd.log

Code:

[02-13-2017 10:02:52] NPCD: WARN: MAX load reached: load 31.720000/10.000000 at i=1
[02-13-2017 10:03:07] NPCD: DEBUG: load 32.060000/10.000000
[02-13-2017 10:03:07] NPCD: WARN: MAX load reached: load 32.060000/10.000000 at i=1
[02-13-2017 10:03:22] NPCD: DEBUG: load 30.800000/10.000000
[02-13-2017 10:03:22] NPCD: WARN: MAX load reached: load 30.800000/10.000000 at i=1
[02-13-2017 10:03:37] NPCD: DEBUG: load 29.720000/10.000000
[02-13-2017 10:03:37] NPCD: WARN: MAX load reached: load 29.720000/10.000000 at i=1
[02-13-2017 10:03:52] NPCD: DEBUG: load 26.340000/10.000000
[02-13-2017 10:03:52] NPCD: WARN: MAX load reached: load 26.340000/10.000000 at i=1
[02-13-2017 10:04:07] NPCD: DEBUG: load 24.860000/10.000000
[02-13-2017 10:04:07] NPCD: WARN: MAX load reached: load 24.860000/10.000000 at i=1
[02-13-2017 10:04:22] NPCD: DEBUG: load 23.830000/10.000000
[02-13-2017 10:04:22] NPCD: WARN: MAX load reached: load 23.830000/10.000000 at i=1
[02-13-2017 10:04:37] NPCD: DEBUG: load 22.650000/10.000000
[02-13-2017 10:04:37] NPCD: WARN: MAX load reached: load 22.650000/10.000000 at i=1
[02-13-2017 10:04:52] NPCD: DEBUG: load 21.930000/10.000000
[02-13-2017 10:04:52] NPCD: WARN: MAX load reached: load 21.930000/10.000000 at i=1
[02-13-2017 10:05:07] NPCD: DEBUG: load 22.210000/10.000000
[02-13-2017 10:05:07] NPCD: WARN: MAX load reached: load 22.210000/10.000000 at i=1
[02-13-2017 10:05:22] NPCD: DEBUG: load 21.670000/10.000000
Output of service npcd restart:

Code:

NPCD Stopped.
DEBUG: Config File = /usr/local/nagios/etc/pnp/npcd.cfg
CONFIG_OPT_LOGTYPE = file
CONFIG_OPT_LOGFILE = /usr/local/nagios/var/npcd.log
CONFIG_OPT_LOGFILESIZE = 10485760
CONFIG_OPT_LOGLEVEL = -1
CONFIG_OPT_SCANDIR = /var/nagiosramdisk/spool/perfdata/
CONFIG_OPT_RUNCMD = /usr/local/nagios/libexec/process_perfdata.pl
CONFIG_OPT_RUNCMD_ARG = -b
CONFIG_OPT_MAXTHREADS = 5
CONFIG_OPT_LOAD = 10.0
CONFIG_OPT_USER = nagios
CONFIG_OPT_GROUP = nagios
CONFIG_OPT_PIDFILE = /usr/local/nagiosxi/var/subsys/npcd.pid
CONFIG_OPT_SLEEPTIME = 15
CONFIG_OPT_IDENTMYSELF = (null)
---------------------------
DEBUG: load_threshold is enabled - ('10.000000')
NPCD started.
The problem:
perfdata.log stopped being written last night at 00:18:

Code:

[root@nagiosxi var]# tail -25 perfdata.log
2017-02-13 00:18:03 [31747] [2] Template is check_bw_snmp.php
2017-02-13 00:18:03 [31747] [2] No Custom Template found for check_bw_snmp (/usr/local/nagios/etc/pnp/check_commands/check_bw_snmp.cfg) 
2017-02-13 00:18:03 [31747] [2] Template is check_bw_snmp.php
2017-02-13 00:18:03 [31747] [2] No Custom Template found for check_bw_snmp (/usr/local/nagios/etc/pnp/check_commands/check_bw_snmp.cfg) 
2017-02-13 00:18:03 [31747] [2] Template is check_bw_snmp.php
2017-02-13 00:18:03 [31747] [2] No Custom Template found for check_bw_snmp (/usr/local/nagios/etc/pnp/check_commands/check_bw_snmp.cfg) 
2017-02-13 00:18:03 [31747] [2] Template is check_bw_snmp.php
2017-02-13 00:18:03 [31747] [2] No Custom Template found for check_bw_snmp (/usr/local/nagios/etc/pnp/check_commands/check_bw_snmp.cfg) 
2017-02-13 00:18:03 [31747] [2] Template is check_bw_snmp.php
2017-02-13 00:18:03 [31747] [2] data2rrd called
2017-02-13 00:18:03 [31747] [2] RRDs::update /usr/local/nagios/share/perfdata/raytolaspalmas/Ancho_de_banda_PRINCIPAL.rrd 1486940941:0.00:0.00:0.8:0.96:2549831499:3216412633
2017-02-13 00:18:03 [31747] [2] /usr/local/nagios/share/perfdata/raytolaspalmas/Ancho_de_banda_PRINCIPAL.rrd updated
2017-02-13 00:18:03 [31747] [2] Processing Line 587
2017-02-13 00:18:03 [31747] [2] Datatype set to 'SERVICEPERFDATA' 
2017-02-13 00:18:03 [31747] [1] Found Performance Data for sai1Alicante / Temperatura_SAI (TEMP=19;29;30) 
2017-02-13 00:18:03 [31747] [2] No Custom Template found for check_snmp_sensorSAI (/usr/local/nagios/etc/pnp/check_commands/check_snmp_sensorSAI.cfg) 
2017-02-13 00:18:03 [31747] [2] Template is check_snmp_sensorSAI.php
2017-02-13 00:18:03 [31747] [2] data2rrd called
2017-02-13 00:18:03 [31747] [2] RRDs::update /usr/local/nagios/share/perfdata/sai1Alicante/Temperatura_SAI.rrd 1486940941:19
2017-02-13 00:18:04 [31747] [2] /usr/local/nagios/share/perfdata/sai1Alicante/Temperatura_SAI.rrd updated
2017-02-13 00:18:04 [31747] [2] Processing Line 588
2017-02-13 00:18:04 [31747] [2] No Perfdata. Skipping line 588
2017-02-13 00:18:04 [31747] [1] 588 lines processed
2017-02-13 00:18:04 [31747] [1] /var/nagiosramdisk/spool/perfdata//1486940941.perfdata.service-PID-31747 deleted
2017-02-13 00:18:04 [31747] [1] PNP exiting (runtime 15.685863s) ...
Of course, logging is enabled, at level 2 (debug):
[root@nagiosxi xidpe]# cat /usr/local/nagios/etc/pnp/process_perfdata.cfg

Code:

#
# Config File for process_perfdata.pl
#
# $Id: process_perfdata.cfg-sample.in 520 2008-09-16 12:50:10Z pitchfork $
#
# process_perfdata.pl Timout 
#
TIMEOUT = 20
#
# Use RRDs Perl Module
#
USE_RRDs = 1 
#
# 
#
RRDPATH = /usr/local/nagios/share/perfdata
#
#
#
RRDTOOL = /usr/bin/rrdtool
#
#
#
CFG_DIR = /usr/local/nagios/etc/pnp
#
#
#
RRD_HEARTBEAT = 8460 
#
#
#
RRA_CFG = /usr/local/nagios/etc/pnp/rra.cfg
#
#
#
RRA_STEP = 60
#
#
#
LOG_FILE = /usr/local/nagios/var/perfdata.log
#
# Loglevel 0=silent 1=normal 2=debug
#
LOG_LEVEL = 2
#
# XML encoding
# The supported encodings are ISO-8859-1, UTF-8 and US-ASCII.
# http://www.php.net/xml-parser-create
XML_ENC = UTF-8
#
# EXPERIMENTAL rrdcached Support
# Use only with rrdtool svn revision 1511+
#
# RRD_DAEMON_OPTS = unix:/tmp/rrdcached.sock
All graphs stopped updating last night at around 00:00:
Image

But in the service status detail, Performance Data is still receiving data:
Image

These are our PNP settings and paths:
nagios.cfg

Code:

# PNP settings - bulk mode with NCPD
process_performance_data=1
# service performance data
service_perfdata_file=/var/nagiosramdisk/service-perfdata
service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tSERVICESTATE::$SERVICESTATE$\tSERVICESTATETYPE::$SERVICESTATETYPE$\tSERVICEOUTPUT::$SERVICEOUTPUT$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=process-service-perfdata-file-bulk
# host performance data
host_perfdata_file=/var/nagiosramdisk/host-perfdata
host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tHOSTOUTPUT::$HOSTOUTPUT$
host_perfdata_file_mode=a
host_perfdata_file_processing_interval=15
host_perfdata_file_processing_command=process-host-perfdata-file-bulk
commands.cfg

Code:

define command {
       command_name                             process-service-perfdata-file-bulk
       command_line                             /bin/mv /var/nagiosramdisk/service-perfdata /var/nagiosramdisk/spool/xidpe/$TIMET$.perfdata.service
}
define command {
       command_name                             process-host-perfdata-file-bulk
       command_line                             /bin/mv /var/nagiosramdisk/host-perfdata /var/nagiosramdisk/spool/xidpe/$TIMET$.perfdata.host
}
These commands are running, because the TIMET value 1486977438 in the file is current:
GMT/UTC: Mon, 13 Feb 2017 09:17:18 GMT
Local time: 13/2/2017 10:17:18
(this is correct)
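
The same check can be repeated from the shell. This is only a minimal sketch: it assumes GNU date (for the `-d @epoch` form), and the helper name `epoch_to_utc` is made up for illustration.

```shell
#!/bin/sh
# Convert a Unix timestamp (a Nagios TIMET value) to a readable UTC time.
# Assumes GNU date; LC_ALL=C keeps day and month names in English.
epoch_to_utc() {
    LC_ALL=C date -u -d "@$1" '+%a, %d %b %Y %H:%M:%S GMT'
}

# TIMET taken from the service-perfdata lines in this post.
epoch_to_utc 1486977438
# prints: Mon, 13 Feb 2017 09:17:18 GMT
```

If the converted time is close to the current time, Nagios itself is still writing fresh performance data.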

[root@nagiosxi xidpe]# tail 25 /var/nagiosramdisk/service-perfdata

Code:

==> /var/nagiosramdisk/service-perfdata <==
DATATYPE::SERVICEPERFDATA       TIMET::1486977438       HOSTNAME::raytoavila    SERVICEDESC::Ancho_de_banda     SERVICEPERFDATA::inUsage=0.13%;80;90 outUsage=1.40%;80;90 inBandwidth=10.08Kbs outBandwidth=10.88Kbs inAbsolut=2679554260 outAbsolut=4069659442     SERVICECHECKCOMMAND::check_bw_snmp!80!90!!!!!!  HOSTSTATE::UP   HOSTSTATETYPE::HARD     SERVICESTATE::OK    SERVICESTATETYPE::HARD  SERVICEOUTPUT::Average IN: 10.08Kbs (0.13%), Average OUT: 10.88Kbs (1.40%)<br>Total RX: 20443.38 Mbits, Total TX: 31049.04 Mbits
DATATYPE::SERVICEPERFDATA       TIMET::1486977439       HOSTNAME::swaytosagunto SERVICEDESC::Uso_de_Memoria_Cisco       SERVICEPERFDATA::Memoria=60;75;90       SERVICECHECKCOMMAND::check_mem_snmp.router!75!90!1!2!!!!    HOSTSTATE::UP   HOSTSTATETYPE::SOFT     SERVICESTATE::OK        SERVICESTATETYPE::HARD  SERVICEOUTPUT::OK (w75/c90) - usado 60%
DATATYPE::SERVICEPERFDATA       TIMET::1486977439       HOSTNAME::sSAGUNTO      SERVICEDESC::NSClient_Windows_TotalConnections  SERVICEPERFDATA::'connections'=17;60;80;        SERVICECHECKCOMMAND::check_nsclient_CurrentConnections!60!80!_Total!!!!!    HOSTSTATE::UP   HOSTSTATETYPE::SOFT     SERVICESTATE::OK        SERVICESTATETYPE::HARD  SERVICEOUTPUT::OK (w60/c80) - connections: 17
DATATYPE::SERVICEPERFDATA       TIMET::1486977439       HOSTNAME::SCBDPREPASE1  SERVICEDESC::NSClient_Windows_Disk_H    SERVICEPERFDATA::'DISKH:'=9;60;90;      SERVICECHECKCOMMAND::check_nsclient_disk!H!60!90!!!!!       HOSTSTATE::UP   HOSTSTATETYPE::HARD     SERVICESTATE::OK        SERVICESTATETYPE::HARD  SERVICEOUTPUT::OK (w60/c90) - H: 586 MB (9%)
DATATYPE::SERVICEPERFDATA       TIMET::1486977439       HOSTNAME::bdapat2       SERVICEDESC::NSClient_Windows_MEM       SERVICEPERFDATA::'MEM'=7;90;95;0;100    SERVICECHECKCOMMAND::check_nsclient_mem!90!95!!!!!! HOSTSTATE::UP   HOSTSTATETYPE::HARD     SERVICESTATE::OK        SERVICESTATETYPE::HARD  SERVICEOUTPUT::OK (w90/c95) - usado: 2539 MB (7%)
DATATYPE::SERVICEPERFDATA       TIMET::1486977439       HOSTNAME::SBURGOS       SERVICEDESC::NSClient_Windows_CPU       SERVICEPERFDATA::'CPU'=1;90;95; SERVICECHECKCOMMAND::check_nsclient_cpu_mcores!90!95!!!!!!  HOSTSTATE::UP   HOSTSTATETYPE::HARD     SERVICESTATE::OK        SERVICESTATETYPE::HARD  SERVICEOUTPUT::OK (w90/c95) - usado 1%
DATATYPE::SERVICEPERFDATA       TIMET::1486977439       HOSTNAME::rcallalicante SERVICEDESC::Latencia_ISP_PRINCIPAL     SERVICEPERFDATA::'RTT'=8;100;300;       SERVICECHECKCOMMAND::check_isp!100!300!5!2!!!!      HOSTSTATE::UP   HOSTSTATETYPE::HARD     SERVICESTATE::OK        SERVICESTATETYPE::HARD  SERVICEOUTPUT::OK - 172.29.0.57 (w100/c300) - 8ms
DATATYPE::SERVICEPERFDATA       TIMET::1486977439       HOSTNAME::INETNIW1      SERVICEDESC::NSClient_Windows_CurrentConnections_EXPLOTACION_   SERVICEPERFDATA::'connections'=0;60;80;     SERVICECHECKCOMMAND::check_nsclient_CurrentConnections!60!80!EXPLOTACION!!!!!   HOSTSTATE::UP   HOSTSTATETYPE::HARD     SERVICESTATE::OK        SERVICESTATETYPE::HARD  SERVICEOUTPUT::OK (w60/c80) - connections: 0
DATATYPE::SERVICEPERFDATA       TIMET::1486977439       HOSTNAME::SIT01 SERVICEDESC::NSClient_Windows_CurrentConnections_EXPLOTACION_   SERVICEPERFDATA::'connections'=3;60;80; SERVICECHECKCOMMAND::check_nsclient_CurrentConnections!60!80!EXPLOTACION!!!!!       HOSTSTATE::UP   HOSTSTATETYPE::HARD     SERVICESTATE::OK        SERVICESTATETYPE::HARD  SERVICEOUTPUT::OK (w60/c80) - connections: 3
DATATYPE::SERVICEPERFDATA       TIMET::1486977439       HOSTNAME::rapat1        SERVICEDESC::Ancho_de_banda     SERVICEPERFDATA::inUsage=0.03%;80;90 outUsage=0.03%;80;90 inBandwidth=29.92Kbs outBandwidth=33.52Kbs inAbsolut=1055154082 outAbsolut=3498153070     SERVICECHECKCOMMAND::check_bw_snmp!80!90!!!!!!  HOSTSTATE::UP   HOSTSTATETYPE::HARD     SERVICESTATE::OK    SERVICESTATETYPE::HARD  SERVICEOUTPUT::Average IN: 29.92Kbs (0.03%), Average OUT: 33.52Kbs (0.03%)<br>Total RX: 8050.19 Mbits, Total TX: 26688.79 Mbits
I have tried resetting all components (Nagios Core, performance grapher, and database backend):
Image

Output of tail -f nagios.log while I ran "Apply Configuration":

Code:

[1486983766] Caught SIGTERM, shutting down...
[1486983768] Successfully shutdown... (PID=21301)
[1486983768] Event broker module 'NERD' deinitialized successfully.
[1486983769] livestatus: Socket thread has terminated
[1486983769] Event broker module '/usr/local/lib/mk-livestatus/livestatus.o' deinitialized successfully.
[1486983769] ndomod: Shutdown complete.
[1486983769] Event broker module '/usr/local/nagios/bin/ndomod.o' deinitialized successfully.
[1486983771] Nagios 4.2.4 starting... (PID=15049)
[1486983771] Local time is Mon Feb 13 12:02:51 CET 2017
[1486983771] LOG VERSION: 2.0
[1486983771] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
[1486983771] qh: core query handler registered
[1486983771] nerd: Channel hostchecks registered successfully
[1486983771] nerd: Channel servicechecks registered successfully
[1486983771] nerd: Channel opathchecks registered successfully
[1486983771] nerd: Fully initialized and ready to rock!
[1486983771] wproc: Successfully registered manager as @wproc with query handler
[1486983771] wproc: Registry request: name=Core Worker 15057;pid=15057
[1486983771] wproc: Registry request: name=Core Worker 15054;pid=15054
[1486983771] wproc: Registry request: name=Core Worker 15056;pid=15056
[1486983771] wproc: Registry request: name=Core Worker 15052;pid=15052
[1486983771] wproc: Registry request: name=Core Worker 15055;pid=15055
[1486983771] wproc: Registry request: name=Core Worker 15053;pid=15053
[1486983771] livestatus: Livestatus 1.2.8 by Mathias Kettner. Socket: '/usr/local/nagios/rw/mklive'
[1486983771] livestatus: Please visit us at http://mathias-kettner.de/
[1486983771] livestatus: Hint: please try out OMD - the Open Monitoring Distribution
[1486983771] livestatus: Please visit OMD at http://omdistro.org
[1486983781] livestatus: Finished initialization. Further log messages go to /usr/local/nagios/var/livestatus.log
[1486983781] Event broker module '/usr/local/lib/mk-livestatus/livestatus.o' initialized successfully.
[1486983781] ndomod: NDOMOD 2.1.2 (11-14-2016) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
[1486983781] ndomod: Successfully connected to data sink.  0 queued items to flush.
[1486983781] ndomod registered for process data
[1486983781] ndomod registered for log data'
[1486983781] ndomod registered for system command data'
[1486983781] ndomod registered for event handler data'
[1486983781] ndomod registered for notification data'
[1486983781] ndomod registered for comment data'
[1486983781] ndomod registered for downtime data'
[1486983781] ndomod registered for flapping data'
[1486983781] ndomod registered for program status data'
[1486983781] ndomod registered for host status data'
[1486983781] ndomod registered for service status data'
[1486983781] ndomod registered for adaptive program data'
[1486983781] ndomod registered for adaptive host data'
[1486983781] ndomod registered for adaptive service data'
[1486983781] ndomod registered for external command data'
[1486983781] ndomod registered for aggregated status data'
[1486983781] ndomod registered for retention data'
[1486983781] ndomod registered for contact data'
[1486983781] ndomod registered for contact notification data'
[1486983781] ndomod registered for acknowledgement data'
[1486983781] ndomod registered for state change data'
[1486983781] ndomod registered for contact status data'
[1486983781] ndomod registered for adaptive contact data'
[1486983781] Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
[1486983785] Successfully launched command file worker with pid 15303


I think there is a problem with perfdata processing. The only log that is not being updated is perfdata.log, as shown above.
But I don't know what to do.
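
One way to confirm that the processing side has stalled is to count the files waiting in the spool directory and check how long ago perfdata.log was last written. This is a rough sketch, not an official Nagios XI check: the helper names `count_spool_files` and `file_age_seconds` are invented for this example, `stat -c %Y` assumes GNU coreutils, and the paths are the ones shown in this post.

```shell
#!/bin/sh
# Count regular files sitting directly in a spool directory.
count_spool_files() {
    find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Seconds since a file was last modified (GNU stat assumed).
file_age_seconds() {
    now=$(date +%s)
    mtime=$(stat -c %Y "$1") || return 1
    echo $((now - mtime))
}

# Paths as they appear in npcd.cfg and process_perfdata.cfg in this thread.
spool=/var/nagiosramdisk/spool/perfdata
log=/usr/local/nagios/var/perfdata.log
if [ -d "$spool" ] && [ -f "$log" ]; then
    echo "files waiting in spool: $(count_spool_files "$spool")"
    echo "seconds since perfdata.log was written: $(file_age_seconds "$log")"
fi
```

A file count that keeps growing, together with a log that has been silent for hours, would suggest that files are arriving but are not being handed to process_perfdata.pl.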


An off-topic question: how can I make my posts visible only to registered users?

Regards.

Re: Graphs are not showing

Posted: Mon Feb 13, 2017 12:06 pm
by rkennedy

Code:

[02-13-2017 10:04:52] NPCD: WARN: MAX load reached: load 21.930000/10.000000 at i=1
[02-13-2017 10:05:07] NPCD: DEBUG: load 22.210000/10.000000
[02-13-2017 10:05:07] NPCD: WARN: MAX load reached: load 22.210000/10.000000 at i=1
[02-13-2017 10:05:22] NPCD: DEBUG: load 21.670000/10.000000
This is generally an indicator to me of what was going on. Can you PM a profile to me and dwhitfield for us to review? I'd like to see what's going on with your system. (Admin -> System Profile -> Download Profile)
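
To put a number on those warnings, the DEBUG lines can be summarized with awk. This is only a sketch: `npcd_peak_load` is a hypothetical helper name, and it assumes the log lines have the same format as the ones quoted above.

```shell
#!/bin/sh
# Read npcd.log lines on stdin and print the highest load NPCD reported.
# Only "NPCD: DEBUG: load X/Y" lines are used; X is the measured load
# and Y is the configured threshold.
npcd_peak_load() {
    LC_ALL=C awk '
        /NPCD: DEBUG: load / {
            split($NF, parts, "/")        # last field looks like 32.060000/10.000000
            if (parts[1] + 0 > max) max = parts[1] + 0
        }
        END { printf "%.2f\n", max }'
}

# Example use, with the log path from the npcd restart output in this thread:
#   tail -500 /usr/local/nagios/var/npcd.log | npcd_peak_load
```

A peak far above the threshold, as in the excerpt above, would fit the picture of NPCD sitting idle while the spool fills up.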

Re: Graphs are not showing

Posted: Tue Feb 14, 2017 2:39 am
by redesgtt
In the end we restarted the server and the issue was resolved. But this is not the first time it has happened. I hope it does not happen again, because the only solution I have found so far is to restart the server.

How can I send you my profile? This forum thread is public. I would prefer that only the forum administrators be able to see it.

Thanks.

By the way, after restarting the server, npcd.log now shows:

Code:

os/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1487057882.perfdata.host
[02-14-2017 08:38:21] NPCD: Processing file '1487057882.perfdata.host'
[02-14-2017 08:38:21] NPCD: DEBUG: load 3.980000/10.000000
[02-14-2017 08:38:21] NPCD: ThreadCounter 1/5 File is 1487057882.perfdata.service
[02-14-2017 08:38:21] NPCD: Regular File: 1487057882.perfdata.service
[02-14-2017 08:38:21] NPCD: A thread was started on thread_counter = 1
[02-14-2017 08:38:21] NPCD: Have to wait: Filecounter = 2 - thread_counter = 2
[02-14-2017 08:38:21] NPCD: Processing file 1487057882.perfdata.service with ID 140310110881536 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1487057882.perfdata.service
[02-14-2017 08:38:21] NPCD: Processing file '1487057882.perfdata.service'
[02-14-2017 08:38:23] NPCD: No more files to process... waiting for 15 seconds
[02-14-2017 08:38:38] NPCD: Found 4 files in /var/nagiosramdisk/spool/perfdata/
[02-14-2017 08:38:38] NPCD: DEBUG: load 3.350000/10.000000
[02-14-2017 08:38:38] NPCD: ThreadCounter 0/5 File is .
[02-14-2017 08:38:38] NPCD: DEBUG: load 3.350000/10.000000
[02-14-2017 08:38:38] NPCD: ThreadCounter 0/5 File is ..
[02-14-2017 08:38:38] NPCD: DEBUG: load 3.350000/10.000000
[02-14-2017 08:38:38] NPCD: ThreadCounter 0/5 File is 1487057897.perfdata.host
[02-14-2017 08:38:38] NPCD: Regular File: 1487057897.perfdata.host
[02-14-2017 08:38:38] NPCD: A thread was started on thread_counter = 0
The graphs are showing again.

Re: Graphs are not showing

Posted: Tue Feb 14, 2017 11:17 am
by rkennedy
It would appear to be performance-related, or something locking up. We would need to see a profile taken when everything has halted again. Feel free to PM it over to us if you'd like to avoid posting it on the public forum.
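
When it halts again, a quick first check is whether the system load is above NPCD's load_threshold (10.0 in the restart output earlier in this thread). As far as the npcd.cfg comments describe it, NPCD does not start new worker threads while the load is above that value. The sketch below assumes Linux (`/proc/loadavg`), and `load_exceeds` is an illustrative helper name, not part of PNP.

```shell
#!/bin/sh
# Succeeds (exit status 0) when LOAD is greater than THRESHOLD.
# awk does the comparison because plain sh cannot compare decimals.
load_exceeds() {
    awk -v l="$1" -v t="$2" 'BEGIN { exit !(l + 0 > t + 0) }'
}

# Threshold as reported by "load_threshold is enabled - ('10.000000')".
threshold=10.0
# The first field of /proc/loadavg is the 1-minute load average.
current=$(cut -d ' ' -f 1 /proc/loadavg 2>/dev/null)
current=${current:-0}

if load_exceeds "$current" "$threshold"; then
    echo "load $current is above $threshold: NPCD is probably not processing spool files"
else
    echo "load $current is at or below $threshold"
fi
```

If the load is high, the output of `top` or `ps` at that moment is worth saving alongside the profile, since it shows what is driving the load.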

Re: Graphs are not showing

Posted: Wed Mar 01, 2017 2:27 pm
by tmcdonald
Just checking in since we have not heard from you in a while. Did @rkennedy's post clear things up or has the issue otherwise been resolved?

Re: Graphs are not showing

Posted: Tue Apr 18, 2017 1:54 am
by redesgtt
tmcdonald wrote: Just checking in since we have not heard from you in a while. Did @rkennedy's post clear things up or has the issue otherwise been resolved?
Yes. Please close this thread.

Thanks a lot for your support.