Monitoring Esxi host on nagios

Post by **Box293** » Wed Sep 30, 2015 1:35 pm

Can you confirm that the testing on the Nagios box was done as the nagios user (after "su nagios")?

vinothg wrote: In my vCenter inventory the ESXi host is listed by FQDN only, and in my service definition I have configured the FQDN only, but I am getting an error.

What is the error?

Post by **vinothg** » Thu Oct 08, 2015 10:10 pm

Hi,

Sorry, that was my mistake. I had made some errors in the configuration; I have corrected them and it is working now. Please help me with the queries below.

1) We want to set warning and critical alerts when our ESXi host CPU and memory usage reach 70% and 80%. How do we set this in the service definition?

2) I have set up the Nagios monitoring server as a VM on one of the ESXi hosts. I believe that if that ESXi host or that particular VM goes down, we won't be able to get alerts. Is there any way to cluster it, or any alternative?

Please provide your valuable comments.

Thank you.

Post by **jdalrymple** » Fri Oct 09, 2015 1:48 pm

vinothg wrote: 1) We want to set warning and critical alerts when our ESXi host CPU and memory usage reach 70% and 80%. How do we set this in the service definition?

https://exchange.nagios.org/components/ ... 0&cf_id=29
This should be as easy as something like ./box293_check_vmware.pl --server <ESXi IP> --check Host_CPU_Usage -w 80 -c 90
Mind you, you'll have to create the service so that it uses check_by_ssh to your vMA; however, we'll assume you have that figured out at this point?

vinothg wrote: 2) I have set up the Nagios monitoring server as a VM on one of the ESXi hosts. I believe that if that ESXi host or that particular VM goes down, we won't be able to get alerts. Is there any way to cluster it, or any alternative?

If VMware HA is available in your environment, it is probably the simplest option to handle the scenario you described.

We don't have any official way for you to handle this, but there are options. With Nagios XI we offer a component that stores system backups on a remote server; you can then restore onto that remote server and bring it online using an event handler. There is also this new option from LINBIT: https://www.nagios.com/news/2015/10/pre ... nagios-xi/ - while it is meant for XI, the constructs in that option work for Core as well.

Here is our datasheet on distributing a Nagios XI installation; maybe you can translate some of the ideas in it to work in your environment:

https://assets.nagios.com/downloads/nag ... ios-XI.pdf

Post by **vinothg** » Sat Oct 10, 2015 12:12 am

jdalrymple wrote: This should be as easy as something like ./box293_check_vmware.pl --server <ESXi IP> --check Host_CPU_Usage -w 80 -c 90
Mind you, you'll have to create the service so that it uses check_by_ssh to your vMA; however, we'll assume you have that figured out at this point?

Thanks for your assistance. Yes, I have created the check_by_ssh service to the vMA. I have configured the above in my service definition, but I am getting the error below.

current status: critical
status: return code of 9 is out of bounds

Please advise how to proceed here.

Thank you.

Post by **vinothg** » Sun Oct 11, 2015 12:35 am

Please assist me with the above queries.

Thank you.

Post by **vinothg** » Sun Oct 11, 2015 12:42 pm

Hi,

After configuring the ESXi host in the service definition, I am getting the error below. Could you please assist me with this?

Service: Host cpu usage
Status: Critical
Status information: critical: host has an uptime of 0 seconds, cannot collect data!

But this host is up.

Thanks in advance.

Post by **Box293** » Sun Oct 11, 2015 11:28 pm

vinothg wrote: 1) We want to set warning and critical alerts when our ESXi host CPU and memory usage reach 70% and 80%. How do we set this in the service definition?

Please refer to the manual, it has clear examples that explain how to do this.

Code:

box293_check_vmware.pl --check Host_CPU_Usage --server 192.168.1.211 --host 192.168.1.210 --warning cpu_used:2 --critical cpu_used:10

WARNING: Host CPU {Free: 12.5 GHz} {Used: 2.7 GHz (WARNING >= 2)} {Total: 15.2GHz}|'CPU Free'=12.5GHz 'CPU Used'=2.7GHz;2;10 'CPU Total'=15.2GHz [Host_CPU_Usage]
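Since the sample output above reports the cpu_used threshold in GHz (WARNING >= 2), a 70% / 80% target would first have to be converted into GHz. The following is only a sketch of that arithmetic, assuming the thresholds are absolute GHz values and using the 15.2 GHz total from the sample output. The helper name pct_to_ghz is hypothetical and not part of the plugin, and whether the plugin accepts fractional values should be checked against the manual.

```shell
# Hypothetical helper: convert a percentage of total CPU capacity into GHz.
# Assumption: the cpu_used threshold is an absolute GHz value, as the sample output suggests.
pct_to_ghz() {
    total_ghz=$1
    percent=$2
    awk -v t="$total_ghz" -v p="$percent" 'BEGIN { printf "%.2f\n", t * p / 100 }'
}

total=15.2                       # 'CPU Total' taken from the sample output
warn=$(pct_to_ghz "$total" 70)   # 70% of the total
crit=$(pct_to_ghz "$total" 80)   # 80% of the total
echo "--warning cpu_used:$warn --critical cpu_used:$crit"
```

The total differs per host, so the GHz values would need to be worked out for each ESXi host being monitored.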

vinothg wrote:
jdalrymple wrote: This should be as easy as something like ./box293_check_vmware.pl --server <ESXi IP> --check Host_CPU_Usage -w 80 -c 90
Mind you, you'll have to create the service so that it uses check_by_ssh to your vMA; however, we'll assume you have that figured out at this point?

Thanks for your assistance. Yes, I have created the check_by_ssh service to the vMA. I have configured the above in my service definition, but I am getting the error below.

current status: critical
status: return code of 9 is out of bounds

Please advise how to proceed here.

Thank you.

This error was caused because -w and -c are not valid arguments in my plugin.
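For context, Nagios only accepts plugin exit codes 0 to 3, so any other code, such as the 9 presumably returned here when the arguments were rejected, is reported as "out of bounds". A small illustrative sketch of that mapping follows; the function name nagios_state is hypothetical and is not part of Nagios or the plugin.

```shell
# Map a plugin exit code to the state Nagios would report.
# 0-3 are the standard plugin return codes; anything else is flagged as out of bounds.
nagios_state() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "out of bounds" ;;
    esac
}

nagios_state 9
```

Running the plugin by hand on the vMA and then checking `echo $?` shows the raw exit code it returned.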

vinothg wrote: Hi,

After configuring the ESXi host in the service definition, I am getting the error below. Could you please assist me with this?

Service: Host cpu usage
Status: Critical
Status information: critical: host has an uptime of 0 seconds, cannot collect data!

But this host is up.

Thanks in advance.

This is strange; however, we can diagnose what is going on here.
Can you please SSH to the vMA appliance?
Run the box293_check_vmware.pl command with all the arguments you are using AND add the --debug argument at the end:

command --debug

This will create the file /home/vi-admin/box293_check_vmware_debug_log.txt
Please email/PM me that file and I'll investigate some more.

Post by **vinothg** » Mon Oct 12, 2015 12:34 pm

Hi,

After configuring the ESXi host in the service definition, I am getting the error below. Could you please assist me with this?

Service: Host cpu usage
Status: Critical
Status information: critical: host has an uptime of 0 seconds, cannot collect data!

But this host is up.

Thanks in advance.

I have figured out the above issue. It was caused by the vMA sizing; I increased its memory and CPU allocation, and it is now collecting data.
Thanks for your support.

But I am still unable to figure out the issue below. I have checked the documentation but am still not sure what I am missing. If you don't mind, could you please assist me with it?

vinothg wrote:
1) We want to set warning and critical alerts when our ESXi host CPU and memory usage reach 70% and 80%. How do we set this in the service definition?

box293_check_vmware.pl --check Host_CPU_Usage --server 192.168.1.211 --host 192.168.1.210 --warning cpu_used:2 --critical cpu_used:10

WARNING: Host CPU {Free: 12.5 GHz} {Used: 2.7 GHz (WARNING >= 2)} {Total: 15.2GHz}|'CPU Free'=12.5GHz 'CPU Used'=2.7GHz;2;10 'CPU Total'=15.2GHz [Host_CPU_Usage]

I am getting this error:

Error: Service check command 'box293_check_vmware.pl --check Host_CPU_Usage --server 192.168.7.86 --host 192.168.7.12 --warning cpu_used:2 --critical cpu_used:5' specified in service 'Host Cpu Usage' for host 'hostname' not defined anywhere!

Post by **tmcdonald** » Mon Oct 12, 2015 5:00 pm

What do you have configured as the command for the Host Cpu Usage service? Most likely you just need to either separate out the arguments properly, or add in the command to commands.cfg.

Post by **vinothg** » Mon Oct 12, 2015 7:08 pm

Hi,

This is my commands.cfg configuration:

define command {
    command_name    box293_check_vmware
    command_line    $USER1$/check_by_ssh -E 1 -t 90 -l vi-admin -H 192.168.5.63 -C "nice -n19 ~/box293_check_vmware.pl --server $ARG1$ --check $ARG2$ \"$ARG3$\" \"$ARG4$\" \"$ARG5$\" \"$ARG6$\" \"$ARG7$\" \"$ARG8$\""
}
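The "not defined anywhere" error suggests the service's check_command holds the whole plugin command line rather than the command_name. With a definition like the one above, Nagios takes the text before the first "!" as the command name and the remaining "!"-separated fields as $ARG1$ to $ARG8$. The sketch below imitates that split in shell. The check_command value is an assumption built from the addresses in the error message, with one token per argument because each $ARGn$ is quoted individually, and the function name expand_args is hypothetical.

```shell
# Assumed check_command value for the service (not taken from the poster's config).
check_command='box293_check_vmware!192.168.7.86!Host_CPU_Usage!--host!192.168.7.12!--warning!cpu_used:2!--critical!cpu_used:5'

# Hypothetical helper: print each "!"-separated field after the command name as ARGn=value.
expand_args() {
    old_ifs=$IFS
    IFS='!'
    set -f              # disable globbing while the string is split
    set -- $1           # split the check_command on "!"
    set +f
    IFS=$old_ifs
    shift               # drop the command name (box293_check_vmware)
    i=1
    for arg in "$@"; do
        printf 'ARG%d=%s\n' "$i" "$arg"
        i=$((i + 1))
    done
}

expand_args "$check_command"
```

In the service definition this would correspond to a check_command line set to that same "!"-separated string, which fills exactly the eight arguments the command definition provides.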
