Box293 Plugin Snapshot Monitor Not Working

Posted: Sat Aug 06, 2016 4:06 pm
by kwhogster
Nagios Core 4.1.1 on Ubuntu Server 16.04.01
VMA Host
ESXi 6.0 Host

I set up the Box293 plugin to monitor guest snapshots:

#       Guest Snapshots
define service {
        host_name TGCS014
        service_description Guest Snapshots
        check_command box293_check_vmware_test!10.2.8.10!Guest_Snapshot!--guest!TGCS014!--warning!snapshot_age:5!--critical!snapshot_age:10
        initial_state u
        max_check_attempts 3
        check_interval 10
        retry_interval 7
        active_checks_enabled 1
        check_period 24x7
        register 1
}
define service {
        host_name TGCS015
        service_description Guest Snapshots
        check_command box293_check_vmware_test!10.2.8.10!Guest_Snapshot!--guest!TGCS015!--warning!snapshot_age:5!--critical!snapshot_age:10
        initial_state u
        max_check_attempts 3
        check_interval 10
        retry_interval 7
        active_checks_enabled 1
        check_period 24x7
        register 1
}

The first host now shows this: CRITICAL: ['TGCS014' (Notes: TGCS014) (Age: 108 (CRITICAL >= 10))]

The second host shows this: OK: No snapshots found

Using vSphere, I took a snapshot of the first host, TGCS014, and in the datastore all the files (.vmdk, -delta.vmdk, .vmsp and .vmsn) were created with current dates.

I cannot figure out how Box293 knows which datastore the snapshots are in.

It is strange that the first host reports differently from the second. I have 10 other hosts that report the same as the second host.

Any ideas?

Re: Box293 Plugin Snapshot Monitor Not Working

Posted: Mon Aug 08, 2016 2:54 am
by Box293
The plugin talks to the VMware SDK API; that is how it gets the snapshot information. It doesn't look directly at the datastores.
kwhogster wrote:Now the first host shows this CRITICAL: ['TGCS014' (Notes: TGCS014) (Age: 108 (CRITICAL >= 10))]
TGCS014 has a snapshot that is 108 days old, which exceeds the critical threshold of 10 days you defined.
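To make the comparison concrete, here is a minimal Python sketch of how an age could be checked against the `snapshot_age:5` and `snapshot_age:10` thresholds from the service definition. This is not the plugin's actual code; the function names `parse_threshold` and `classify_snapshot_age` are hypothetical, and the `>=` comparison is assumed from the "CRITICAL >= 10" text in the plugin output.

```python
def parse_threshold(text):
    """Split a 'name:value' threshold such as 'snapshot_age:5' (hypothetical helper)."""
    name, _, value = text.partition(":")
    return name, int(value)


def classify_snapshot_age(age_days, warning_days, critical_days):
    """Return a Nagios-style state for a snapshot age in days.

    Assumes an age equal to a threshold already triggers it,
    as suggested by the 'CRITICAL >= 10' text in the output.
    """
    if age_days >= critical_days:
        return "CRITICAL"
    if age_days >= warning_days:
        return "WARNING"
    return "OK"


_, warning = parse_threshold("snapshot_age:5")
_, critical = parse_threshold("snapshot_age:10")
print(classify_snapshot_age(108, warning, critical))
```

With the values from this thread, an age of 108 days is well past both thresholds, so the check reports CRITICAL.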
kwhogster wrote:The second host show this OK: No snapshots found

It is strange that the first host is reporting differently than the second I have 10 other hosts that report the same as host 2
It should report any snapshots found, but only trigger the thresholds if they've been exceeded.

When you right-click TGCS015 in the vSphere client and select "Snapshots > Snapshot Manager", are there any snapshots listed?

Re: Box293 Plugin Snapshot Monitor Not Working

Posted: Mon Aug 08, 2016 8:12 pm
by kwhogster
Troy

See attached images

As you can see, TGCS014 has a snapshot with a current date of 8/4/2016. Today is 8/8/2016, so this is under the thresholds, correct?

As you can see, TGCS015 has no snapshots, and it correctly reports OK: No snapshots found

I do not understand how TGCS014 shows OK: ['TGCS014' (Notes: TGCS014) (Age: 110)]. The age of 110 is totally incorrect.

Thoughts?

Re: Box293 Plugin Snapshot Monitor Not Working

Posted: Mon Aug 08, 2016 8:51 pm
by Box293
kwhogster wrote:As you can see in TGCS014 I have a snapshot with a current date of 8/4/2016 today is 8/8/2016 this is under the criteria correct?
You are looking at the VM files. When you take a snapshot of a VM, all changes from that point on are saved in a snapshot file, so yes, the files you are looking at will have recent dates because they are constantly being updated.

The plugin is NOT reporting on when the files were last updated. The plugin is reporting on when the snapshot was created. In your case the snapshot was created 108 days ago.

In a production environment, keeping snapshots on a VM for a long time can cause performance issues. The purpose of the check in the plugin is to detect snapshots that have existed longer than a certain time period.
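The difference between the two dates can be sketched in a few lines of Python. This is only an illustration, not the plugin's code: `age_in_days` is a hypothetical helper, the check time is assumed to be the date of the first post, and the creation time is simply back-calculated from the reported age of 108 days because the real creation date is not shown in this thread.

```python
from datetime import datetime, timedelta


def age_in_days(timestamp, now):
    """Whole days elapsed between a timestamp and the time of the check."""
    return (now - timestamp).days


# Assumed values for illustration only.
now = datetime(2016, 8, 6)                  # date the check first reported Age: 108
created = now - timedelta(days=108)         # snapshot creation time (back-calculated)
files_modified = datetime(2016, 8, 4)       # date seen on the datastore files

print(age_in_days(created, now))            # age based on snapshot creation
print(age_in_days(files_modified, now))     # age based on file modification date
```

The first number is what the check cares about. The second only shows that the VM has been writing to the snapshot files recently.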

Re: Box293 Plugin Snapshot Monitor Not Working

Posted: Mon Aug 08, 2016 9:03 pm
by kwhogster
Troy

Should I remove that snapshot and see what happens?

I have a NAS device on order that should be here in a few days. I am going to make it a new datastore on the ESXi host and place all the snapshots there. I have ghettoVCB installed and ready to run.

Re: Box293 Plugin Snapshot Monitor Not Working

Posted: Mon Aug 08, 2016 9:16 pm
by Box293
Yes, you should commit the snapshot. Because it is so old, and depending on how much data has been written to it, the task will probably take a few hours to complete.

Re: Box293 Plugin Snapshot Monitor Not Working

Posted: Mon Aug 08, 2016 10:05 pm
by kwhogster
Troy

I think I might have found the issue.

I looked at the Snapshot Manager and it shows two more snapshots, but I do not know where they are. I do not see any in the datastore. Am I looking in the wrong place?

See my attached image.

Re: Box293 Plugin Snapshot Monitor Not Working

Posted: Mon Aug 08, 2016 10:10 pm
by Box293
It's reporting the OLDEST snapshot. You just need to click "Delete All" and it will consolidate all the snapshots.
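As a rough illustration of what "reporting the oldest" means when a VM has a chain of snapshots, here is a small Python sketch that walks a snapshot tree and picks the earliest creation time. The `Snapshot` class, the `oldest_snapshot` function, and the names and dates in the example are all hypothetical; this is not how the plugin is actually written.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Snapshot:
    """Hypothetical stand-in for one entry in a VM's snapshot tree."""
    name: str
    created: datetime
    children: List["Snapshot"] = field(default_factory=list)


def oldest_snapshot(roots: List[Snapshot]) -> Optional[Snapshot]:
    """Walk every snapshot in the tree and return the one created earliest."""
    oldest = None
    stack = list(roots)
    while stack:
        snap = stack.pop()
        if oldest is None or snap.created < oldest.created:
            oldest = snap
        stack.extend(snap.children)
    return oldest


# Made-up tree: one old snapshot with two newer ones chained beneath it.
tree = [
    Snapshot("first", datetime(2016, 4, 20), [
        Snapshot("second", datetime(2016, 6, 1), [
            Snapshot("third", datetime(2016, 8, 4)),
        ]),
    ]),
]
print(oldest_snapshot(tree).name)
```

Because the age comes from the earliest snapshot, taking a new snapshot does not reset it. The age only goes away once the old snapshots are deleted.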

Re: Box293 Plugin Snapshot Monitor Not Working

Posted: Mon Aug 08, 2016 10:28 pm
by kwhogster
Troy

I just watched a VMware video on snapshots.

I now see what to do.

I was deleting them one at a time, but the Snapshot Manager takes a long time to refresh and move the "You Are Here" marker to its new spot.

I will update later.

Re: Box293 Plugin Snapshot Monitor Not Working

Posted: Mon Aug 08, 2016 10:49 pm
by kwhogster
Troy

Delete All completed successfully; there are no snapshots now.

Nagios took a few minutes to update.

It now shows OK: No snapshots found

We can close this issue now.

Thanks for your help on this.