How can I monitor a GPU on Windows?

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
dimsum
Posts: 153
Joined: Thu Aug 15, 2013 6:05 pm

How can I monitor a GPU on Windows?

Post by dimsum »

Hi There,

I am trying to monitor a GPU on a Windows machine, but I am a bit confused about how to get values from the remote host. Is it a passive check (SNMP trap?) or an active check (nsc?)?

I read this link:

https://exchange.nagios.org/directory/P ... 1503417619

and this

https://www.thomas-krenn.com/en/wiki/GP ... ing_Plugin

Please advise on a solution.

Thank you.
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: How can I monitor a GPU on Windows?

Post by mcapra »

dimsum wrote: Is it a passive check (SNMP trap?) or an active check (nsc?)?
The documentation you linked has it set up as an active check that is executed using the NRPE agent. You can reference the "Configuring the check_gpu_sensor Plugin" section for details regarding setting the check up in Nagios.

There's a lot more work involved prior to actually configuring the check in Nagios, but that's mostly non-Nagios related environment configuration/setup on the Windows machine.
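
For readers unfamiliar with what an active check has to produce, here is a minimal sketch of the usual Nagios plugin contract: one status line (optionally followed by performance data after a pipe) and an exit code of 0, 1, 2, or 3 for OK, WARNING, CRITICAL, or UNKNOWN. This is not the check_gpu_sensor code. The function names, the "temp" label, and the thresholds are hypothetical and only there for illustration.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Map a reading to a Nagios state; higher values are treated as worse."""
    if value is None:
        return UNKNOWN
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def format_result(label, value, warn, crit, unit=""):
    """Return (exit_code, status_line) in the usual 'text | perfdata' layout."""
    state = evaluate(value, warn, crit)
    if value is None:
        return state, f"GPU {STATE_NAMES[state]} - {label} not available"
    perfdata = f"{label}={value}{unit};{warn};{crit}"
    return state, f"GPU {STATE_NAMES[state]} - {label} is {value}{unit} | {perfdata}"


# Hypothetical reading and thresholds, for illustration only.
code, line = format_result("temp", 71, warn=80, crit=90)
print(line)
```

A real plugin would finish by exiting with that code (for example via sys.exit(code)), since the agent reports the exit status back to Nagios along with the printed line.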
Former Nagios employee
https://www.mcapra.com/
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: How can I monitor a GPU on Windows?

Post by dwhitfield »

dimsum wrote: I am trying to monitor a GPU on a Windows machine.
What exactly are you trying to monitor? Temperature, fan speed, load, something else? The Exchange is definitely the place to start looking, but if you don't find what you need there, you might want to take a look at GitHub.
dimsum
Posts: 153
Joined: Thu Aug 15, 2013 6:05 pm

Re: How can I monitor a GPU on Windows?

Post by dimsum »

Hi,

Basically, I am focused on GPU temperature, fan speed, and load. I tried running the check with the command below and got this message:

./check_gpu_sensor.pl -H 10.255.1.1

/usr/bin/perl: symbol lookup error: /usr/local/lib64/perl5/auto/nvidia/ml/bindings/bindings.so: undefined symbol: nvmlInit

How can I fix the error?

Thanks.
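
One way to narrow down an "undefined symbol" error like the one above is to check whether the NVML shared library can be loaded on that host at all and whether it exports the symbol. The path in the error suggests the command ran on a Linux machine, so the sketch below assumes a Linux library name (libnvidia-ml.so.1), which may differ on your system. The helper name probe_library is made up for this example.

```python
import ctypes


def probe_library(lib_name, symbols):
    """Try to load a shared library and report which symbols it exports.

    Returns None if the library cannot be loaded at all, otherwise a dict
    mapping each symbol name to True or False.
    """
    try:
        lib = ctypes.CDLL(lib_name)
    except OSError:
        return None
    return {name: hasattr(lib, name) for name in symbols}


# Assumed library file name on Linux; it may differ on your system.
result = probe_library("libnvidia-ml.so.1", ["nvmlInit", "nvmlShutdown"])
if result is None:
    print("NVML library could not be loaded")
else:
    for name, found in result.items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If the library cannot be loaded, the likely cause is that the NVIDIA driver (which ships NVML) is not installed on the machine where the plugin runs, rather than a problem in the Perl module itself.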
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: How can I monitor a GPU on Windows?

Post by dwhitfield »

You may need to install the nvidia perl package http://search.cpan.org/~nvbinding/nvidi ... idia/ml.pm but ultimately, you're likely to find more knowledgeable folks about that error on the nvidia forums...possibly perl forums. We can certainly continue to try to help, but we don't have access to every type of GPU here, so we may not be able to test.

What GPU is this that you are trying to monitor?
dimsum
Posts: 153
Joined: Thu Aug 15, 2013 6:05 pm

Re: How can I monitor a GPU on Windows?

Post by dimsum »

Hi,

I have checked that the modules are installed in Perl, but when I run the check I get the same error.

IO::CaptureOutput
List::Compare
Nagios::Monitoring::Plugin
Perl
nvidia::ml::bindings

My site uses GPUs in a cluster to compute graphics work (e.g. animation, video, 3D), and the users need to know the temperature, clock, and load. The machines run in a Windows group. For routine checks they currently open a remote desktop session and look at the status in the NVIDIA monitor, GPU-Z, or something like that.

Thank you.
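
As an alternative to the Perl plugin, the readings mentioned here can also be collected from nvidia-smi, which is installed with the NVIDIA driver. This is a rough sketch and not what the linked plugin does: it asks nvidia-smi for temperature, fan speed, and GPU utilization in CSV form and parses the result. It assumes nvidia-smi is on the PATH of the machine being checked; the sample numbers are made up.

```python
import subprocess

# Query fields understood by nvidia-smi's --query-gpu option.
FIELDS = ["temperature.gpu", "fan.speed", "utilization.gpu"]


def parse_gpu_csv(text):
    """Parse 'csv,noheader,nounits' output into one dict per GPU.

    Values that are not plain integers (for example "[N/A]") become None.
    """
    gpus = []
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.split(",")]
        if len(cells) != len(FIELDS):
            continue  # skip lines that do not match the expected layout
        row = {}
        for field, cell in zip(FIELDS, cells):
            row[field] = int(cell) if cell.isdigit() else None
        gpus.append(row)
    return gpus


def query_gpus():
    """Run nvidia-smi on the local machine and return the parsed readings."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
    ]
    output = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(output)


# Sample text in the shape nvidia-smi prints; the numbers are made up.
sample = "45, 30, 12\n61, [N/A], 97\n"
for index, gpu in enumerate(parse_gpu_csv(sample)):
    print(index, gpu)
```

A script along these lines would have to run on the Windows host itself through whatever agent is in use there, with its output turned into a status line and exit code for Nagios.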
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: How can I monitor a GPU on Windows?

Post by dwhitfield »

dimsum wrote: nvidia::ml::bindings
What command are you using for this? I was able to install nvidia::ml, so I wonder if that's what you need rather than nvidia::ml::bindings.

Any cpan output could also be useful.