check_esx3.pl output truncated in Nagios/ CPU Load
XI 2012R2.9 Enterprise Edition
RHEL 6.5
Offloaded DB
Firefox 30.0
Output from command line in XI server, not truncated
[nagios@nagiosprodxi1 cfgprep]$ /usr/local/nagios/libexec/check_esx3.pl -H 10.10.10.10 -u rrrrr -p ppppppp -l VMFS -w 70% -c 90%
ESX3 CRITICAL - storages : datastore1=280270.00 MB (99.80%), ADT1-TransportNode01-1-VNX7500=146616.00 MB (7.16%), ADT1-TransportNode01-2-VNX7500=224456.00 MB (14.62%), ADT1-TransportNode02-VNX7500=445628.00 MB (21.76%), ADT1-TransportNode03-VNX7500=445626.00 MB (21.76%), ADT1-ControlNode-VNX7500=384112.00 MB (18.76%) | datastore1=99.80%;70;90 ADT1-TransportNode01-1-VNX7500=7.16%;70;90 ADT1-TransportNode01-2-VNX7500=14.62%;70;90 ADT1-TransportNode02-VNX7500=21.76%;70;90 ADT1-TransportNode03-VNX7500=21.76%;70;90 ADT1-ControlNode-VNX7500=18.76%;70;90
The output is truncated in the XI screen. Please advise how to fix this issue. Thanks.
Another problem is the very high CPU load: when check_esx3.pl runs, it hogs the CPU.
5 x Nagios 5.6.9 Enterprise Edition
RHEL 6 & 7
rrdcached & ramdisk optimisation
-
slansing
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: check_esx3.pl output truncated in Nagios/ CPU Load
You can use '-h' on that plugin to get the full help output to sculpt your command:
Code: Select all
/usr/local/nagios/libexec/check_esx3.pl -h
It looks like just passing VMFS gives you all of the datastore info. There should be a way to narrow that down; I'll have to look around a bit. Something else you may want to take a look at is Box293's vMA hook-in VMware monitoring plugin:
http://exchange.nagios.org/directory/Pl ... re/details
The esx3 plugin does take quite a bit of resources, depending on how long you are querying the system and for how much data. What type of hardware does your XI server have under it? You mentioned that you had a lot of users, which will also increase the load depending on what they may be doing (running reports, etc.).
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
- Contact:
Re: check_esx3.pl output truncated in Nagios/ CPU Load
To fix the truncated output we need to increase the size of the tables in Nagios XI.
Execute the following commands at the CLI:
Code: Select all
echo "use nagios;alter table nagios_servicestatus modify output varchar(65535) not null;alter table nagios_servicestatus modify long_output varchar(65535) not null;alter table nagios_servicestatus modify perfdata varchar(65535) not null;" | mysql -pnagiosxi
echo "use nagios;alter table nagios_hoststatus modify output varchar(65535) not null;alter table nagios_servicestatus modify long_output varchar(65535) not null;alter table nagios_servicestatus modify perfdata varchar(65535) not null;" | mysql -pnagiosxi
echo "use nagios;alter table nagios_servicechecks modify output varchar(65535) not null;alter table nagios_servicestatus modify long_output varchar(65535) not null;alter table nagios_servicestatus modify perfdata varchar(65535) not null;" | mysql -pnagiosxi
echo "use nagios;alter table nagios_hostchecks modify output varchar(65535) not null;alter table nagios_servicestatus modify long_output varchar(65535) not null;alter table nagios_servicestatus modify perfdata varchar(65535) not null;" | mysql -pnagiosxi
The next time the check is executed the output will not be truncated in the XI interface.
As for the high CPU load, I recommend what slansing said and implement the box293_check_vmware plugin / solution as it offloads the plugin to a vMA appliance.
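To judge whether a particular check is long enough to be affected by truncation, it can help to measure the plugin output itself. The following is only a minimal sketch: measure_output is a hypothetical helper name (it is not part of check_esx3.pl or Nagios XI), and it assumes the usual plugin convention that everything after the first "|" is performance data.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of check_esx3.pl or Nagios XI).
# Splits one line of plugin output at the first "|" and prints the
# length of the status text and of the performance data.
measure_output() {
    local line="$1"
    local text="${line%%|*}"      # everything before the first "|"
    local perfdata=""
    if [[ "$line" == *"|"* ]]; then
        perfdata="${line#*|}"     # everything after the first "|"
    fi
    # Trim trailing whitespace from the text and leading whitespace from the perfdata.
    text="${text%"${text##*[![:space:]]}"}"
    perfdata="${perfdata#"${perfdata%%[![:space:]]*}"}"
    printf 'text=%d perfdata=%d\n' "${#text}" "${#perfdata}"
}
```

For example, measure_output "$(/usr/local/nagios/libexec/check_esx3.pl -H 10.10.10.10 -u rrrrr -p ppppppp -l VMFS -w 70% -c 90%)" prints both lengths, which can then be compared with the column sizes in the nagios database.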
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Re: check_esx3.pl output truncated in Nagios/ CPU Load
slansing wrote:It looks like just passing VMFS gives you all of the datastore info. There should be a way to narrow that down; I'll have to look around a bit. ... What type of hardware does your XI server have under it? You mentioned that you had a lot of users, which will also increase the load depending on what they may be doing (running reports, etc.).
I can hardcode the datastores, but that needs the change management process to be perfect end to end (E2E), which it is not.
My DB server is offloaded. XI is using 8 cores and 8GB ram and storage is on Raid 10.
Most of the time it is 70 - 80% idle when the check_esx3 is not running.
The users are only 2 currently. It is not fully deployed yet.
I will take a look at the plugin you recommended. Thanks
Re: check_esx3.pl output truncated in Nagios/ CPU Load
Box293 wrote:To fix the truncated output we need to increase the size of the tables in Nagios XI. ... As for the high CPU load, I recommend what slansing said and implement the box293_check_vmware plugin / solution as it offloads the plugin to a vMA appliance.
Thanks for the assistance. I will check out the plugin.
BTW what is a vMA appliance? Sorry not well versed in this.
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
- Contact:
Re: check_esx3.pl output truncated in Nagios/ CPU Load
rajasegar wrote:BTW what is a vMA appliance?
vMA = VMware vSphere Management Assistant.
This is a free VM provided by VMware to manage ESXi and ESX servers (and vCenter). ESXi is a locked-down OS, so VMware created vMA to allow you to do a lot of stuff to ESXi servers remotely.
The plugin simply leverages the VMware SDK, which runs on Perl, so using the vMA as a base to run the plugin on makes it a lot easier, as it already has everything installed.
rajasegar wrote:Most of the time it is 70 - 80% idle when the check_esx3 is not running.
When you run certain checks with your existing check_esx3 plugin, the SDK is obtaining a heck of a lot of information from vCenter as it is querying all of the datastores, which means it uses a lot more CPU and memory to get the job done. box293_check_vmware was written to be as efficient as possible, only obtaining the information it needs using the VMware SDK. However, even with this efficient design, it still needs to be offloaded to vMA as there is too much CPU and memory overhead, which can at times cripple a Nagios server.
Re: check_esx3.pl output truncated in Nagios/ CPU Load
Box293 wrote:However even with this efficient design, it still needs to be offloaded to vMA as there is too much CPU and memory overhead, which can at times cripple a Nagios server.
1) Can your plugin be run on the Nagios server? It will take a very long time to go through the security people, set up the VM, etc.
2) Will querying an individual datastore reduce the load?
Thanks.
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
- Contact:
Re: check_esx3.pl output truncated in Nagios/ CPU Load
rajasegar wrote:1) Can your plugin be run on the Nagios server?
It's possible, however there will be no support. I specifically designed this plugin to run from a vMA appliance. I've killed Nagios XI servers doing a small amount of VMware monitoring directly from the XI server, so I don't want to be responsible for creating something that could bring down a Nagios server.
rajasegar wrote:It will take a very long time to go through the Security people, setup VM etc etc.
Let me put a scenario to you. All the air conditioners in the server room stop working. Temperature goes up. Nagios happens to be overloaded with VMware checks and the temperature monitoring checks do not happen and alerts do not get sent out. In a matter of 15 minutes all of your equipment shuts down due to overheating.
How are you going to explain that to your boss when you knew there was a solution available that could have prevented this?
rajasegar wrote:2) Will querying individual datastore reduce the load?
It's possible, however you would need to go and read the code in check_esx3.
Honestly, it's not about one particular check, but when multiple VMware checks are running at the same time, they have a chain-reaction type of effect. This is why offloading it makes sense.
Re: check_esx3.pl output truncated in Nagios/ CPU Load
Box293 wrote:Honestly, it's not about one particular check, but when multiple VMware checks are running at the same time, they have a chain-reaction type of effect. This is why offloading it makes sense.
Let me see if I can do something about concurrency with dependencies, and I will check out the monitoring locally in Dev.
The vMA setup, as mentioned, will take some time.
Thanks
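As a stopgap until the checks can be offloaded, one way to keep several VMware checks from running at the same moment is to serialize them behind a lock. This is only a sketch under stated assumptions: run_serialized and the lock file path are hypothetical names, it relies on the flock utility from util-linux being installed, and it does not make any individual check cheaper; it only stops them from piling up.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper (not part of Nagios XI or check_esx3.pl).
# Runs a command while holding an exclusive lock, so only one wrapped
# check runs at a time; the others wait up to 30 seconds for their turn.
run_serialized() {
    local lockfile="$1"
    shift
    (
        # File descriptor 9 is opened on the lock file by the redirection below.
        if ! flock -w 30 9; then
            echo "UNKNOWN - timed out waiting for ${lockfile}"
            exit 3    # 3 is the plugin exit code for UNKNOWN
        fi
        "$@"
    ) 9>"$lockfile"
}

# Example call (lock file path is an assumption):
# run_serialized /tmp/check_esx3.lock /usr/local/nagios/libexec/check_esx3.pl -H 10.10.10.10 -u rrrrr -p ppppppp -l VMFS -w 70% -c 90%
```

The 30-second wait is an arbitrary value; it has to stay below the service check timeout configured in Nagios, otherwise waiting checks will time out instead of running.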
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: check_esx3.pl output truncated in Nagios/ CPU Load
Box293 wrote:To fix the truncated output we need to increase the size of the tables in Nagios XI. Execute the following commands at the CLI: ...
As an FYI, this code does have a slight bug in it: on the second, third and fourth lines, the long_output and perfdata changes are applied to nagios_servicestatus again instead of to the table named at the start of the line. It should read:
Code: Select all
echo "use nagios;alter table nagios_servicestatus modify output varchar(65535) not null;alter table nagios_servicestatus modify long_output varchar(65535) not null;alter table nagios_servicestatus modify perfdata varchar(65535) not null;" | mysql -pnagiosxi
echo "use nagios;alter table nagios_hoststatus modify output varchar(65535) not null;alter table nagios_hoststatus modify long_output varchar(65535) not null;alter table nagios_hoststatus modify perfdata varchar(65535) not null;" | mysql -pnagiosxi
echo "use nagios;alter table nagios_servicechecks modify output varchar(65535) not null;alter table nagios_servicechecks modify long_output varchar(65535) not null;alter table nagios_servicechecks modify perfdata varchar(65535) not null;" | mysql -pnagiosxi
echo "use nagios;alter table nagios_hostchecks modify output varchar(65535) not null;alter table nagios_hostchecks modify long_output varchar(65535) not null;alter table nagios_hostchecks modify perfdata varchar(65535) not null;" | mysql -pnagiosxi
Or, equivalently, with a single ALTER TABLE statement per table:
Code: Select all
echo "
alter table nagios_servicestatus modify output varchar(65535) not null,modify long_output varchar(65535) not null,modify perfdata varchar(65535) not null;
alter table nagios_hoststatus modify output varchar(65535) not null, modify long_output varchar(65535) not null,modify perfdata varchar(65535) not null;
alter table nagios_servicechecks modify output varchar(65535) not null,modify long_output varchar(65535) not null,modify perfdata varchar(65535) not null;
alter table nagios_hostchecks modify output varchar(65535) not null,modify long_output varchar(65535) not null,modify perfdata varchar(65535) not null;
" | mysql -pnagiosxi nagios
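Since the bug came from repeating nearly identical lines by hand, the statements can also be generated in a loop. This is a minimal sketch, assuming the same four tables and three columns as in the commands above; build_alter_sql is a hypothetical helper name, and the function only prints SQL without touching the database.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints one ALTER TABLE statement per table,
# widening the output, long_output and perfdata columns.
build_alter_sql() {
    local table column clauses
    for table in nagios_servicestatus nagios_hoststatus nagios_servicechecks nagios_hostchecks; do
        clauses=""
        for column in output long_output perfdata; do
            # Prefix a comma separator for every clause after the first.
            clauses+="${clauses:+, }modify ${column} varchar(65535) not null"
        done
        printf 'alter table %s %s;\n' "$table" "$clauses"
    done
}
```

The printed statements can be reviewed first and then piped to the same client call used above, for example: build_alter_sql | mysql -pnagiosxi nagios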