
No more windows disk metrics after move to mod_gearman

Posted: Mon Feb 16, 2015 1:05 pm
by vAJ
Noticed something very odd after I enabled mod_gearman on my last remaining instance that needed it.

All of my Windows disk metrics are gone. I had several reports set up for disk usage using the metrics function against several service groups. These are all broken now, showing "no matching data to display".

This is a major blow to our operations staff that use these reports to proactively control hundreds of systems on our distributed platform. Need to see about getting this functionality back ASAP.

-Andrew

Re: No more windows disk metrics after move to mod_gearman

Posted: Mon Feb 16, 2015 2:21 pm
by lmiltchev
Which versions of Nagios XI and Mod Gearman are you using? Did you follow this document to integrate Mod Gearman with XI?

Can you show us a screenshot of the error that you are getting and the actual command run from the CLI along with the output of it?

Re: No more windows disk metrics after move to mod_gearman

Posted: Mon Feb 16, 2015 3:43 pm
by vAJ
This particular instance is still on 2014R2.0. Gearman 1.5, installed by the book.

I have a service group for each of the drive letter service checks for these large groups of servers. c:, d:, l:, p:, etc. Take D: for example, the report for this would help the operations staff see which web servers' data drives were inching towards their thresholds and they could go proactively look for problems before they began alerting.
d_report_no-data.JPG

Re: No more windows disk metrics after move to mod_gearman

Posted: Mon Feb 16, 2015 3:59 pm
by vAJ
I checked my other production instance and test instance (both running 2014R2.6) with gearman, and they both also have no metrics for Windows servers running NSClient. Double checking to see if the same problem exists for Windows servers with NCPA.

Re: No more windows disk metrics after move to mod_gearman

Posted: Mon Feb 16, 2015 5:33 pm
by abrist
Are the related performance graphs still populated?

Re: No more windows disk metrics after move to mod_gearman

Posted: Mon Feb 16, 2015 5:45 pm
by vAJ
Oh yeah, the actual service checks work great. Graphs look good in every method of access (graphexplorer, quick 24hr popup from host/service summary, and from perf graphs tab on host detail).

It's only the metrics function that doesn't like Windows perf stats. All of my Linux servers with NRPE still show up in the disk, mem, and CPU metrics reports.

Re: No more windows disk metrics after move to mod_gearman

Posted: Mon Feb 16, 2015 6:05 pm
by abrist
Did you change the service description of the disk checks when you moved to gearman?

Re: No more windows disk metrics after move to mod_gearman

Posted: Mon Feb 16, 2015 6:17 pm
by vAJ
Nope. Nothing changed when I added gearman. If I run the metrics report for disk usage with no service group filter, it serves up all of the handful of Linux servers I have at that site. This instance has over 1k Windows VMs and maybe 100 Linux servers.
linux_disk_usage.JPG

Re: No more windows disk metrics after move to mod_gearman

Posted: Mon Feb 16, 2015 6:41 pm
by Box293
vAJ wrote: I have a service group for each of the drive letter service checks for these large groups of servers. c:, d:, l:, p:, etc. Take D: for example, the report for this would help the operations staff see which web servers' data drives were inching towards their thresholds and they could go proactively look for problems before they began alerting.

Can you run this on Nagios XI in an ssh session:

tail -f /var/log/httpd/error_log
Now go and run that report and post back any errors that appear in the ssh session.
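
If the log is noisy, it may help to narrow the output to PHP messages only. The following is a minimal sketch, not part of Nagios XI: the helper name php_log_lines is hypothetical, the log path is the one from the command above, and the commented tail/grep options assume GNU tools.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Nagios XI): print only the PHP
# notices, warnings, and errors found in an Apache error log.
php_log_lines() {
    grep -E 'PHP (Notice|Warning|Fatal error|Parse error)' "$1"
}

# Show PHP messages already in the log (path taken from the command above):
#   php_log_lines /var/log/httpd/error_log
#
# Or follow only lines written from now on, filtering them live
# (assumes GNU tail and GNU grep):
#   tail -n 0 -f /var/log/httpd/error_log | grep --line-buffered 'PHP'
```

Starting the follow with -n 0 skips the existing backlog, so anything printed should have been logged after the report was run.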

Re: No more windows disk metrics after move to mod_gearman

Posted: Tue Feb 17, 2015 10:23 am
by vAJ

[Mon Feb 16 23:06:43 2015] [error] [client ::1] PHP Notice:  Undefined index: language in /usr/local/nagiosxi/html/includes/components/ccm/includes/common_functions.inc.php on line 710
[Mon Feb 16 23:06:43 2015] [error] [client ::1] PHP Notice:  Undefined index: language in /usr/local/nagiosxi/html/includes/components/ccm/includes/common_functions.inc.php on line 711
[Mon Feb 16 23:06:46 2015] [notice] caught SIGTERM, shutting down
[Mon Feb 16 23:06:47 2015] [notice] suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Mon Feb 16 23:06:47 2015] [notice] Digest: generating secret for digest authentication ...
[Mon Feb 16 23:06:47 2015] [notice] Digest: done
[Mon Feb 16 23:06:48 2015] [notice] Apache/2.2.15 (Unix) DAV/2 PHP/5.3.3 mod_ssl/2.2.15 OpenSSL/1.0.1e-fips configured -- resuming normal operations
[Tue Feb 17 14:37:21 2015] [error] [client 10.20.44.36] PHP Warning:  ldap_start_tls(): Unable to start TLS: Operations error in /usr/local/nagiosxi/html/includes/components/active_directory/adLDAP/adLDAP.php on line 372, referer: https://nagios.vdc.volusion.com/nagiosxi/login.php?logout
[Tue Feb 17 15:21:23 2015] [error] [client 10.20.40.53] PHP Warning:  ldap_start_tls(): Unable to start TLS: Operations error in /usr/local/nagiosxi/html/includes/components/active_directory/adLDAP/adLDAP.php on line 372, referer: https://nagios.vdc.volusion.com/nagiosxi/login.php?redirect=/nagiosxi/index.php%3f&noauth=1
Pulled this right after hitting the metrics function.