Re: Elasticsearch-health-monitoring plugin
Posted: Tue Nov 17, 2015 7:09 pm
by jolson
Box293 wrote:jolson wrote:My perfdata actually has five semicolons as well - I don't think that is the problem.
I believe newer versions of rrdtool are particular about this. I think it starts when you implement rrdcached, as that installs a newer version of rrdtool than the one shipped with XI.
I'm pretty sure that if @willemdh increases the logging verbosity, he will see the data being discarded in the logs:
http://support.nagios.com/wiki/index.ph ... leshooting
Code: Select all
tail -f /usr/local/nagios/var/perfdata.log > /tmp/perfdata.txt
tail -f /usr/local/nagios/var/npcd.log > /tmp/npcd.txt
Interesting! Thanks for the knowledge as always.
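For reference on the semicolon count discussed above: the Nagios plugin guidelines describe a perfdata item as label=value[UOM];warn;crit;min;max, which is at most four semicolons per item. A rough Python sketch for spotting items with more than that could look like the following. The function names are made up for illustration, and it assumes labels contain no spaces or quotes, which holds for the Elasticsearch output in this thread but not for every plugin.

```python
def semicolon_counts(perfdata: str) -> dict:
    """Map each perfdata label to the number of semicolons in its item."""
    counts = {}
    # Assumption: items are separated by whitespace and labels are unquoted.
    for token in perfdata.split():
        label, sep, rest = token.partition("=")
        if sep:
            counts[label] = rest.count(";")
    return counts


def suspicious_labels(perfdata: str, max_semicolons: int = 4) -> list:
    """Return labels whose item has more semicolons than the format allows."""
    return [
        label
        for label, count in semicolon_counts(perfdata).items()
        if count > max_semicolons
    ]


sample = "cluster_nodes=2;;;;; heap_used=42%;;;;; rta=0.385ms;3000.000;5000.000;0;"
print(suspicious_labels(sample))  # ['cluster_nodes', 'heap_used']
```

Whether a given rrdtool or PNP version actually rejects the fifth semicolon is a separate question; this only flags items that deviate from the documented format.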
Re: Elasticsearch-health-monitoring plugin
Posted: Wed Nov 18, 2015 2:45 am
by WillemDH
Hmm, I'm not sure I ever installed rrdcached.
Found this procedure:
https://www.google.be/url?sa=t&rct=j&q= ... 3241,d.d2s
Code: Select all
ls /etc/sysconfig/rrdcached
ls: cannot access /etc/sysconfig/rrdcached: No such file or directory
Also, RRD_DAEMON_OPTS is commented out in /usr/local/nagios/etc/pnp/process_perfdata.cfg
Code: Select all
# RRD_DAEMON_OPTS = unix:/tmp/rrdcached.sock
Please also note that I'm using a ramdisk. Tailed the perfdata and npcd logfiles after setting the debug level, but nothing recent appeared in them, even after a few checks. Gonna try recreating the service.
EDIT: Recreated the service, again as a copy of a service where perfdata is working fine. Same issue, no graphs.
EDIT 2: OK, I'll post the interesting stuff from the npcd and perfdata logs. Maybe you guys know better what it means.
Code: Select all
cat npcd.txt
[11-18-2015 09:07:31] NPCD: A thread was started on thread_counter = 0
[11-18-2015 09:07:31] NPCD: Processing file 1447834033.perfdata.host with ID 139688456226560 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1447834033.perfdata.host
[11-18-2015 09:07:31] NPCD: DEBUG: load 1.340000/20.000000
[11-18-2015 09:07:31] NPCD: Processing file '1447834033.perfdata.host'
[11-18-2015 09:07:31] NPCD: ThreadCounter 1/5 File is 1447834033.perfdata.service
[11-18-2015 09:07:31] NPCD: Regular File: 1447834033.perfdata.service
[11-18-2015 09:07:31] NPCD: A thread was started on thread_counter = 1
[11-18-2015 09:07:31] NPCD: Processing file 1447834033.perfdata.service with ID 139688375482112 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1447834033.perfdata.service
[11-18-2015 09:07:31] NPCD: Have to wait: Filecounter = 2 - thread_counter = 2
[11-18-2015 09:07:31] NPCD: Processing file '1447834033.perfdata.service'
[11-18-2015 09:07:34] NPCD: No more files to process... waiting for 15 seconds
[11-18-2015 09:07:49] NPCD: Found 4 files in /var/nagiosramdisk/spool/perfdata/
[11-18-2015 09:07:49] NPCD: DEBUG: load 1.430000/20.000000
[11-18-2015 09:07:49] NPCD: ThreadCounter 0/5 File is .
[11-18-2015 09:07:49] NPCD: DEBUG: load 1.430000/20.000000
[11-18-2015 09:07:49] NPCD: ThreadCounter 0/5 File is ..
[11-18-2015 09:07:49] NPCD: DEBUG: load 1.430000/20.000000
[11-18-2015 09:07:49] NPCD: ThreadCounter 0/5 File is 1447834048.perfdata.host
[11-18-2015 09:07:49] NPCD: Regular File: 1447834048.perfdata.host
[11-18-2015 09:07:49] NPCD: A thread was started on thread_counter = 0
[11-18-2015 09:07:49] NPCD: Processing file 1447834048.perfdata.host with ID 139688456226560 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1447834048.perfdata.host
[11-18-2015 09:07:49] NPCD: DEBUG: load 1.430000/20.000000
[11-18-2015 09:07:49] NPCD: ThreadCounter 1/5 File is 1447834049.perfdata.service
[11-18-2015 09:07:49] NPCD: Processing file '1447834048.perfdata.host'
[11-18-2015 09:07:49] NPCD: Regular File: 1447834049.perfdata.service
[11-18-2015 09:07:49] NPCD: A thread was started on thread_counter = 1
[11-18-2015 09:07:49] NPCD: Processing file 1447834049.perfdata.service with ID 139688375482112 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1447834049.perfdata.service
[11-18-2015 09:07:49] NPCD: Processing file '1447834049.perfdata.service'
[11-18-2015 09:07:49] NPCD: Have to wait: Filecounter = 2 - thread_counter = 2
[11-18-2015 09:07:53] NPCD: No more files to process... waiting for 15 seconds
[11-18-2015 09:08:08] NPCD: Found 6 files in /var/nagiosramdisk/spool/perfdata/
[11-18-2015 09:08:08] NPCD: DEBUG: load 1.450000/20.000000
[11-18-2015 09:08:08] NPCD: ThreadCounter 0/5 File is .
[11-18-2015 09:08:08] NPCD: DEBUG: load 1.450000/20.000000
[11-18-2015 09:08:08] NPCD: ThreadCounter 0/5 File is ..
[11-18-2015 09:08:08] NPCD: DEBUG: load 1.450000/20.000000
[11-18-2015 09:08:08] NPCD: ThreadCounter 0/5 File is 1447834063.perfdata.host
[11-18-2015 09:08:08] NPCD: Regular File: 1447834063.perfdata.host
[11-18-2015 09:08:08] NPCD: A thread was started on thread_counter = 0
[11-18-2015 09:08:08] NPCD: Processing file 1447834063.perfdata.host with ID 139688456226560 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1447834063.perfdata.host
[11-18-2015 09:08:08] NPCD: Processing file '1447834063.perfdata.host'
[11-18-2015 09:08:08] NPCD: DEBUG: load 1.450000/20.000000
[11-18-2015 09:08:08] NPCD: ThreadCounter 1/5 File is 1447834063.perfdata.service
[11-18-2015 09:08:08] NPCD: Regular File: 1447834063.perfdata.service
[11-18-2015 09:08:08] NPCD: A thread was started on thread_counter = 1
[11-18-2015 09:08:08] NPCD: Processing file 1447834063.perfdata.service with ID 139688375482112 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1447834063.perfdata.service
[11-18-2015 09:08:08] NPCD: Processing file '1447834063.perfdata.service'
[11-18-2015 09:08:08] NPCD: DEBUG: load 1.450000/20.000000
[11-18-2015 09:08:08] NPCD: ThreadCounter 2/5 File is 1447834079.perfdata.host
[11-18-2015 09:08:08] NPCD: Regular File: 1447834079.perfdata.host
[11-18-2015 09:08:08] NPCD: A thread was started on thread_counter = 2
[11-18-2015 09:08:08] NPCD: Processing file 1447834079.perfdata.host with ID 139688364992256 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1447834079.perfdata.host
[11-18-2015 09:08:08] NPCD: DEBUG: load 1.450000/20.000000
[11-18-2015 09:08:08] NPCD: Processing file '1447834079.perfdata.host'
[11-18-2015 09:08:08] NPCD: ThreadCounter 3/5 File is 1447834079.perfdata.service
[11-18-2015 09:08:08] NPCD: Regular File: 1447834079.perfdata.service
[11-18-2015 09:08:08] NPCD: A thread was started on thread_counter = 3
[11-18-2015 09:08:08] NPCD: Processing file 1447834079.perfdata.service with ID 139688354502400 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1447834079.perfdata.service
[11-18-2015 09:08:08] NPCD: Processing file '1447834079.perfdata.service'
[11-18-2015 09:08:08] NPCD: Have to wait: Filecounter = 4 - thread_counter = 4
[11-18-2015 09:08:11] NPCD: No more files to process... waiting for 15 seconds
[11-18-2015 09:08:26] NPCD: Found 4 files in /var/nagiosramdisk/spool/perfdata/
[11-18-2015 09:08:26] NPCD: DEBUG: load 1.310000/20.000000
[11-18-2015 09:08:26] NPCD: ThreadCounter 0/5 File is .
[11-18-2015 09:08:26] NPCD: DEBUG: load 1.310000/20.000000
[11-18-2015 09:08:26] NPCD: ThreadCounter 0/5 File is ..
[11-18-2015 09:08:26] NPCD: DEBUG: load 1.310000/20.000000
[11-18-2015 09:08:26] NPCD: ThreadCounter 0/5 File is 1447834093.perfdata.host
[11-18-2015 09:08:26] NPCD: Regular File: 1447834093.perfdata.host
[11-18-2015 09:08:26] NPCD: A thread was started on thread_counter = 0
[11-18-2015 09:08:26] NPCD: Processing file 1447834093.perfdata.host with ID 139688456226560 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1447834093.perfdata.host
[11-18-2015 09:08:26] NPCD: DEBUG: load 1.310000/20.000000
[11-18-2015 09:08:26] NPCD: Processing file '1447834093.perfdata.host'
[11-18-2015 09:08:26] NPCD: ThreadCounter 1/5 File is 1447834093.perfdata.service
[11-18-2015 09:08:26] NPCD: Regular File: 1447834093.perfdata.service
[11-18-2015 09:08:26] NPCD: A thread was started on thread_counter = 1
[11-18-2015 09:08:26] NPCD: Processing file 1447834093.perfdata.service with ID 139688375482112 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1447834093.perfdata.service
[11-18-2015 09:08:26] NPCD: Have to wait: Filecounter = 2 - thread_counter = 2
[11-18-2015 09:08:26] NPCD: Processing file '1447834093.perfdata.service'
[11-18-2015 09:08:30] NPCD: No more files to process... waiting for 15 seconds
[11-18-2015 09:08:45] NPCD: Found 4 files in /var/nagiosramdisk/spool/perfdata/
[11-18-2015 09:08:45] NPCD: DEBUG: load 1.020000/20.000000
[11-18-2015 09:08:45] NPCD: ThreadCounter 0/5 File is .
[11-18-2015 09:08:45] NPCD: DEBUG: load 1.020000/20.000000
[11-18-2015 09:08:45] NPCD: ThreadCounter 0/5 File is ..
[11-18-2015 09:08:45] NPCD: DEBUG: load 1.020000/20.000000
[11-18-2015 09:08:45] NPCD: ThreadCounter 0/5 File is 1447834108.perfdata.host
[11-18-2015 09:08:45] NPCD: Regular File: 1447834108.perfdata.host
[11-18-2015 09:08:45] NPCD: A thread was started on thread_counter = 0
[11-18-2015 09:08:45] NPCD: Processing file 1447834108.perfdata.host with ID 139688456226560 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1447834108.perfdata.host
[11-18-2015 09:08:45] NPCD: DEBUG: load 1.020000/20.000000
[11-18-2015 09:08:45] NPCD: Processing file '1447834108.perfdata.host'
[11-18-2015 09:08:45] NPCD: ThreadCounter 1/5 File is 1447834108.perfdata.service
[11-18-2015 09:08:45] NPCD: Regular File: 1447834108.perfdata.service
[11-18-2015 09:08:45] NPCD: A thread was started on thread_counter = 1
[11-18-2015 09:08:45] NPCD: Processing file 1447834108.perfdata.service with ID 139688375482112 - going to exec /usr/local/nagios/libexec/process_perfdata.pl -n -b /var/nagiosramdisk/spool/perfdata//1447834108.perfdata.service
[11-18-2015 09:08:45] NPCD: Processing file '1447834108.perfdata.service'
[11-18-2015 09:08:45] NPCD: Have to wait: Filecounter = 2 - thread_counter = 2
Code: Select all
cat perfdata.txt | grep "nagioslogserver01"
2015-11-18 09:08:08 [64602] [1] Found Performance Data for nagioslogserver01.gentgrp.gent.be / _HOST_ (rta=0.385ms;3000.000;5000.000;0; pl=0%;80;100;; rtmax=0.982ms;;;; rtmin=0.214ms;;;;)
2015-11-18 09:08:08 [64602] [2] RRDs::update /usr/local/nagios/share/perfdata/nagioslogserver01.gentgrp.gent.be/_HOST_.rrd 1447834058:0.385:0:0.982:0.214
2015-11-18 09:08:08 [64602] [2] /usr/local/nagios/share/perfdata/nagioslogserver01.gentgrp.gent.be/_HOST_.rrd updated
2015-11-18 09:08:08 [64608] [1] Found Performance Data for nagioslogserver01.gentgrp.gent.be / INF_Elasticsearch (cluster_nodes=2;;;;; cluster_master_eligible_nodes=2;;;;; cluster_data_nodes=2;;;;; cluster_active_shards=342;;;;; cluster_relocating_shards=0;;;;; cluster_initialising_shards=0;;;;; cluster_unassigned_shards=0;;;;; cluster_total_shards=342;;;;; cluster_total_indices=35;;;;; cluster_closed_indices=0;;;;; storesize=454154063698B;;;;; documents=489811417c;;;;; index_ops=37211961c;;;;; index_time=21123931ms;;;;; flush_ops=669c;;;;; flush_time=80921ms;;;;; throttle_time=8421121ms;;;;; index_ops=37211961c;;;;; index_time=21123931ms;;;;; delete_ops=0c;;;;; delete_time=0ms;;;;; get_ops=26644c;;;;; get_time=6913ms;;;;; exists_ops=26591c;;;;; exists_time=6909ms;;;;; missing_ops=53c;;;;; missing_time=4ms;;;;; query_ops=64166c;;;;; query_time=269146ms;;;;; fetch_ops=63899c;;;;; fetch_time=10188ms;;;;; merge_ops=33592c;;;;; refresh_ops=171103c;;;;; refresh_time=6128534ms;;;;; gc_old_count=5c;;;;; gc_young_count=23285c;;;;; heap_used=42%;;;;;)
2015-11-18 09:08:09 [64608] [1] Found Performance Data for nagioslogserver01.gentgrp.gent.be / HEA_Elasticsearch (cluster_nodes=2;;;;; cluster_master_eligible_nodes=2;;;;; cluster_data_nodes=2;;;;; cluster_active_shards=342;;;;; cluster_relocating_shards=0;;;;; cluster_initialising_shards=0;;;;; cluster_unassigned_shards=0;;;;; cluster_total_shards=342;;;;; cluster_total_indices=35;;;;; cluster_closed_indices=0;;;;; storesize=454154063698B;;;;; documents=489813216c;;;;; index_ops=37215700c;;;;; index_time=21125926ms;;;;; flush_ops=669c;;;;; flush_time=80921ms;;;;; throttle_time=8421121ms;;;;; index_ops=37215700c;;;;; index_time=21125926ms;;;;; delete_ops=0c;;;;; delete_time=0ms;;;;; get_ops=26648c;;;;; get_time=6914ms;;;;; exists_ops=26595c;;;;; exists_time=6910ms;;;;; missing_ops=53c;;;;; missing_time=4ms;;;;; query_ops=64170c;;;;; query_time=269149ms;;;;; fetch_ops=63903c;;;;; fetch_time=10189ms;;;;; merge_ops=33593c;;;;; refresh_ops=171109c;;;;; refresh_time=6128781ms;;;;; gc_old_count=5c;;;;; gc_young_count=23286c;;;;; heap_used=43%;;;;;)
2015-11-18 09:08:09 [64604] [1] Found Performance Data for nagioslogserver01.gentgrp.gent.be / HEA_Elasticsearch (cluster_nodes=2;;;;; cluster_master_eligible_nodes=2;;;;; cluster_data_nodes=2;;;;; cluster_active_shards=342;;;;; cluster_relocating_shards=0;;;;; cluster_initialising_shards=0;;;;; cluster_unassigned_shards=0;;;;; cluster_total_shards=342;;;;; cluster_total_indices=35;;;;; cluster_closed_indices=0;;;;; storesize=454151730978B;;;;; documents=489809464c;;;;; index_ops=37209965c;;;;; index_time=21122944ms;;;;; flush_ops=669c;;;;; flush_time=80921ms;;;;; throttle_time=8420594ms;;;;; index_ops=37209965c;;;;; index_time=21122944ms;;;;; delete_ops=0c;;;;; delete_time=0ms;;;;; get_ops=26644c;;;;; get_time=6913ms;;;;; exists_ops=26591c;;;;; exists_time=6909ms;;;;; missing_ops=53c;;;;; missing_time=4ms;;;;; query_ops=64164c;;;;; query_time=269145ms;;;;; fetch_ops=63897c;;;;; fetch_time=10188ms;;;;; merge_ops=33590c;;;;; refresh_ops=171098c;;;;; refresh_time=6128294ms;;;;; gc_old_count=5c;;;;; gc_young_count=23284c;;;;; heap_used=42%;;;;;)
2015-11-18 09:08:27 [65263] [1] Found Performance Data for nagioslogserver01.gentgrp.gent.be / DRV_Root (/=69317MB;154232;162800;0;171369)
2015-11-18 09:08:27 [65263] [2] RRDs::update /usr/local/nagios/share/perfdata/nagioslogserver01.gentgrp.gent.be/DRV_Root.rrd 1447834081:69317
2015-11-18 09:08:27 [65263] [2] /usr/local/nagios/share/perfdata/nagioslogserver01.gentgrp.gent.be/DRV_Root.rrd updated
2015-11-18 09:08:27 [65263] [1] Found Performance Data for nagioslogserver01.gentgrp.gent.be / HEA_Elasticsearch (cluster_nodes=2;;;;; cluster_master_eligible_nodes=2;;;;; cluster_data_nodes=2;;;;; cluster_active_shards=342;;;;; cluster_relocating_shards=0;;;;; cluster_initialising_shards=0;;;;; cluster_unassigned_shards=0;;;;; cluster_total_shards=342;;;;; cluster_total_indices=35;;;;; cluster_closed_indices=0;;;;; storesize=454162234602B;;;;; documents=489819233c;;;;; index_ops=37220120c;;;;; index_time=21128434ms;;;;; flush_ops=669c;;;;; flush_time=80921ms;;;;; throttle_time=8421392ms;;;;; index_ops=37220120c;;;;; index_time=21128434ms;;;;; delete_ops=0c;;;;; delete_time=0ms;;;;; get_ops=26653c;;;;; get_time=6915ms;;;;; exists_ops=26600c;;;;; exists_time=6911ms;;;;; missing_ops=53c;;;;; missing_time=4ms;;;;; query_ops=64178c;;;;; query_time=269152ms;;;;; fetch_ops=63911c;;;;; fetch_time=10193ms;;;;; merge_ops=33597c;;;;; refresh_ops=171125c;;;;; refresh_time=6129498ms;;;;; gc_old_count=5c;;;;; gc_young_count=23289c;;;;; heap_used=41%;;;;;)
HEA_Elasticsearch is the first service and INF_Elasticsearch is the second service I created for testing. So the perfdata is effectively found (Found Performance Data for nagioslogserver01.gentgrp.gent.be / HEA_Elasticsearch) but how can I see if it is discarded?
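One thing visible in the perfdata log excerpt above: the _HOST_ and DRV_Root entries are each followed by an RRDs::update line and an "updated" line, while the Elasticsearch entries are not. A small Python sketch along these lines could list the host/service pairs that were found but never updated. The regular expressions are guesses based only on the log lines in this post, the function name is hypothetical, and it assumes the .rrd file name matches the service description exactly.

```python
import re

# Patterns modelled on the log lines quoted in this thread (an assumption,
# not a documented log format).
FOUND = re.compile(r"Found Performance Data for (\S+) / (.+?) \(")
UPDATE = re.compile(r"RRDs::update (\S+\.rrd) ")


def services_without_update(lines):
    """Return (host, service) pairs that were found but never RRD-updated."""
    found = []       # pairs in the order first seen
    updated = set()  # pairs derived from the .rrd path
    for line in lines:
        match = FOUND.search(line)
        if match:
            key = (match.group(1), match.group(2))
            if key not in found:
                found.append(key)
            continue
        match = UPDATE.search(line)
        if match:
            parts = match.group(1).split("/")
            # .../perfdata/<host>/<service>.rrd
            updated.add((parts[-2], parts[-1][: -len(".rrd")]))
    return [key for key in found if key not in updated]


if __name__ == "__main__":
    # /tmp/perfdata.txt is the capture file used earlier in this thread.
    with open("/tmp/perfdata.txt") as handle:
        for host, service in services_without_update(handle):
            print(f"{host} / {service}: no RRDs::update seen")
```

A missing update line does not say why the data was dropped, only that processing stopped before the RRD was written.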
Re: Elasticsearch-health-monitoring plugin
Posted: Wed Nov 18, 2015 5:50 pm
by Box293
Willem,
Can you use my Performance Data Tool to view the RRD file and check whether the data exists in it?
Re: Elasticsearch-health-monitoring plugin
Posted: Thu Nov 19, 2015 10:16 am
by WillemDH
I should have looked there in the first place. The RRD files don't exist. I thought that if I saw the perfdata in the advanced tab, it was already in the RRD. Apparently not.
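A quick way to check this from a script, as a sketch only: the RRDs::update lines in the log earlier in the thread suggest the layout /usr/local/nagios/share/perfdata/<host>/<service>.rrd, so the Python below just tests for those files. The helper names are made up, and the assumption that the file name equals the service description may not hold for services with special characters.

```python
from pathlib import Path

# Base directory taken from the RRDs::update lines in the log above.
PERFDATA_DIR = Path("/usr/local/nagios/share/perfdata")


def rrd_path(host, service, base=PERFDATA_DIR):
    """Build the expected RRD path for a host/service pair."""
    return Path(base) / host / f"{service}.rrd"


def missing_rrds(host, services, base=PERFDATA_DIR):
    """Return the services for which no RRD file exists."""
    return [s for s in services if not rrd_path(host, s, base).is_file()]


if __name__ == "__main__":
    host = "nagioslogserver01.gentgrp.gent.be"
    services = ["_HOST_", "DRV_Root", "HEA_Elasticsearch", "INF_Elasticsearch"]
    for service in missing_rrds(host, services):
        print(f"missing: {rrd_path(host, service)}")
```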
Re: Elasticsearch-health-monitoring plugin
Posted: Thu Nov 19, 2015 5:37 pm
by Box293
So I suggest editing the plugin and removing the extra ;
I see two different plugins were referenced here, so I don't know which one has the code that needs to be changed.
Re: Elasticsearch-health-monitoring plugin
Posted: Thu Nov 19, 2015 5:57 pm
by WillemDH
The one from Jesse.
Re: Elasticsearch-health-monitoring plugin
Posted: Thu Nov 19, 2015 6:06 pm
by Box293
So looking at the plugin:
Code: Select all
from nagioscheck import PerformanceMetric, Status
It uses a library to do the performance data stuff.
https://github.com/saj/pynagioscheck
Specifically the nagioscheck.py file:
https://github.com/saj/pynagioscheck/bl ... oscheck.py
Line 295 needs the last ; removed:
Current:
Should be:
If this fixes the problem we'll report this as an issue on the pynagioscheck GitHub project.
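To illustrate the idea of the change (this is not the code from nagioscheck.py, just a hypothetical stand-alone Python function): a perfdata item built as label=value[UOM];warn;crit;min;max ends up with exactly four semicolons and nothing trailing after the max field.

```python
def format_perfdata(label, value, uom="", warn=None, crit=None,
                    minimum=None, maximum=None):
    """Render one perfdata item with exactly four semicolons.

    Hypothetical helper for illustration; empty thresholds stay empty.
    """
    fields = [warn, crit, minimum, maximum]
    rendered = ["" if field is None else str(field) for field in fields]
    # One semicolon after the value, then three between the four fields.
    return f"{label}={value}{uom};" + ";".join(rendered)


print(format_perfdata("cluster_nodes", 2))           # cluster_nodes=2;;;;
print(format_perfdata("heap_used", 42, "%", 80, 90))  # heap_used=42%;80;90;;
```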
Re: Elasticsearch-health-monitoring plugin
Posted: Mon Nov 30, 2015 5:25 am
by WillemDH
Nice one Troy. Updating nagioscheck.py
with
did the trick. Perfdata is working as expected now. Weird that this did work for Jesse though. Will you create the GitHub issue on
https://github.com/saj/pynagioscheck?
Thanks all!
Re: Elasticsearch-health-monitoring plugin
Posted: Mon Nov 30, 2015 2:28 pm
by bwallace
Box293, thanks for the suggestions. Willem, would you say this thread is ready to be locked?
Re: Elasticsearch-health-monitoring plugin
Posted: Mon Nov 30, 2015 6:51 pm
by Box293
WillemDH wrote:Nice one Troy. Updating nagioscheck.py
with
did the trick. Perfdata is working as expected now. Weird that this did work for Jesse though. Will you create the GitHub issue on
https://github.com/saj/pynagioscheck?
Thanks all!
Sweet, glad my instincts were spot on with this.
Issue created on GitHub:
https://github.com/saj/pynagioscheck/issues/2