It Is What It Is

Sometimes it just looks different.

Virtual Machine Snapshot Errors

This is a continuation of another post covering a similar process:  http://jeffdickman.com/2010/08/vmware-snapshots-and-virtual-disk-descriptor-problems

======================

In the previous article, I explained how to recover a virtual disk descriptor file that has been lost.  I ran into this issue when I attempted to snapshot a server that had been running on an active snapshot for nearly 13 months.  When I powered it down, the snapshot failed and the server would not boot back up.  My troubleshooting showed that the virtual disk descriptor files had been deleted.  For a single virtual disk with no snapshots, this probably isn’t a big deal.  But I had 3 snapshots of this server, and a best-case recovery point of 13 months ago was unacceptable.  Onward to a solution.

As I state with almost everything, know and understand what you are working with and what you are doing.  VMware has great Knowledge Base documents covering all of their products.  These should be your first stop when looking for information.  Here are a few that I found.

Understanding Snapshots in VMware ESX (this article is great!)
Committing snapshots when there are no snapshot entries in the snapshot manager
No more space for the redo log error when attempting to start a virtual machine
Troubleshooting a datastore or VMFS volume that is full or near capacity

Keep in mind, there are links in each of these documents to others.  Follow the breadcrumbs!

So, how do we fix the issue?  If you read the links above, you should understand the basics of snapshots, including naming conventions and how they work.

I am going to preface everything from here forward by saying that this is what I did to fix my issue.  None of this may work for you.  Even worse, it could make your situation worse.  If in doubt, use that maintenance you pay for and open a ticket with VMware.

I’ll be using the same example server, WEBSERVER-US, for this.  For this part of the example, the server has had 3 snapshots and will not boot up.

The first step is knowing the order of the snapshots.  If you get this wrong, you will corrupt your data.  (You do have a backup, right?)  If the files have not been modified, you can probably use the file dates to determine what order to put them in.

Let’s get a detailed listing of the virtual disks that are in the server’s folder.

ls -l *.vmdk

Your output should look something like this:

-rw------- 1 root root 11687231211 Aug  1  1:10 WEBSERVER-US-000003-delta.vmdk
-rw------- 1 root root  3657651200 May  1 11:10 WEBSERVER-US-000002-delta.vmdk
-rw------- 1 root root    87871200 Feb  1 11:10 WEBSERVER-US-000001-delta.vmdk
-rw------- 1 root root 53687091200 Jan  1 11:10 WEBSERVER-US-flat.vmdk
-rw------- 1 root root         592 Aug  2 16:46 WEBSERVER-US.vmdk

The WEBSERVER-US.vmdk is the descriptor file created in the previous article.  You should notice the absence of the descriptor files for the delta virtual disks.
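Since the snapshot order matters so much, here is a minimal sketch of listing the delta files oldest-first by modification time.  The function name list_deltas_by_age is my own (hypothetical), and this is only trustworthy if nothing has touched the files since the snapshots were taken.  Compare the result against the 000001, 000002, 000003 numbering before you rely on it.

```shell
# List the delta disks in a folder, oldest first.
# ls -t sorts by modification time (newest first); -r reverses that order.
# $1 is the VM folder (hypothetical helper argument); defaults to the current directory.
list_deltas_by_age() {
    ( cd "${1:-.}" && ls -tr -- *-delta.vmdk )
}

# Example call, run from inside the server's folder:
# list_deltas_by_age .
```

If the order from the dates and the order from the file numbering disagree, stop and investigate before going any further.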

So we need to create descriptor files for the delta files.  I will say that it is helpful to have a copy of a descriptor file from an existing snapshot to reference.  For the benefit of this article, we’re going to assume you don’t have one.

A critical error I made was assuming that the descriptor file needed to reflect the actual size of the snapshot file.  WRONG! The descriptor needs to reflect the size of the original disk.  So for each of the descriptors you create, you will use the size of the “flat” virtual disk, in this case WEBSERVER-US-flat.vmdk.  Looking at my ls -l above, the size of that file is 53687091200 bytes.  The key difference between snapshots and flat files is that snapshot (delta) files are created as thin disks and grow on demand.
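To avoid copying the wrong number by hand, the size can be pulled straight from the listing.  This is a rough sketch: size_bytes is a hypothetical helper name, and it assumes the size is in the fifth column of ls -l, as it is in the output above.  The arithmetic at the end is a sanity check that the example value is a round 50 GiB.

```shell
# Print the size in bytes of a file, taken from the 5th column of `ls -l`.
size_bytes() {
    ls -l -- "$1" | awk '{print $5}'
}

# Example call against the flat disk from the listing:
# size_bytes WEBSERVER-US-flat.vmdk

# Sanity check on the example value: convert bytes to GiB.
flat_size=53687091200
echo "bytes: $flat_size"
echo "GiB:   $((flat_size / 1024 / 1024 / 1024))"   # prints 50
```

The number you pass to vmkfstools is the byte count of the flat file, not the size of any of the delta files.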

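To create your descriptor (and, as a side effect, a virtual disk), run the following command.  DO NOT NAME THE FILE THE SAME AS YOUR EXISTING VIRTUAL DISKS!  Note that -a expects an adapter type.  The lsilogic below is only an example on my part; use the type that matches your virtual machine’s SCSI controller (for example buslogic or lsilogic).

vmkfstools -c 53687091200 -d thin -a lsilogic temp1.vmdk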

The command will create two files: temp1-flat.vmdk and temp1.vmdk.  You can delete temp1-flat.vmdk to save space.
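Because naming the temporary disk the same as an existing one would be a disaster, a small guard before running vmkfstools can help.  This is a sketch under my own assumptions: check_temp_name is a hypothetical helper, and it only checks the folder you are working in.

```shell
# Refuse to proceed if the chosen temporary name, or its -flat companion,
# already exists in the folder.
check_temp_name() {
    base="${1%.vmdk}"    # strip a trailing .vmdk if present
    if [ -e "$base.vmdk" ] || [ -e "$base-flat.vmdk" ]; then
        echo "refusing: $base.vmdk or $base-flat.vmdk already exists" >&2
        return 1
    fi
    return 0
}

# Example call before creating the temporary disk:
# check_temp_name temp1.vmdk && echo "temp1.vmdk is safe to use"
```

Only run the vmkfstools command when the check passes.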

 
