
Windows Server 2012 – Storage Spaces and Data Deduplication Hands-on Review Part 5

This is Part 5, the continuation of my Storage Spaces and Data Deduplication review. Here’s an index of the test cases in this part and links to the other parts:

  • Part 1: Introduction and Lab Environment Preparation
    • Physical Disk Pull
    • Introduce the Pulled Disk Back into the System
    • Extend Thinly Provisioned Virtual Disk and Observe Behavior at Limits
    • Bonus: Detecting and Replacing Physical Disk Failures
    • Removing a Disk from the Storage Pool
  • Part 5 (You’re here)
    • Reclaim Unused Space on Thin Provisioned Disks
    • Bonus: Defragmentation Attempt and Observations
    • Understanding Hot-Spare Behavior
    • Evaluating and Enabling Data Deduplication

Reclaim Unused Space on Thin Provisioned Disks

I expect footprint reduction to be a very important scenario for migrations between systems, as large data sets move and get re-organized. Organizations should not have to provision twice the amount of physical disks just to be able to re-organize their storage spaces. Thankfully there is a way to do this, although I could not figure it out previously. Thanks to Nandu from the Storage Spaces team, I learnt a few things that I’ll show you now:

Here’s my test case:

  1. Have a virtual disk, thinly provisioned. Use 4 physical disks.
  2. Fill it up with actual data with 100% NTFS volume allocation within.
  3. Delete data until one of the disks can be removed from the virtual disk (this was the part that I previously thought was not reducing the allocation of thinly provisioned volume on its own)
  4. Remove the disk.
  5. Show that everything is healthy and functioning as expected.

We’re starting with the 4 disks that I mentioned. Here is how things look from the virtual disk’s perspective:

image

I’d like you to keep an eye on the “Allocated” column of the virtual disk “TestVirtualDisk1”, which is showing 265GB at the moment.

Because I just added the 4th disk to show you this process, I need to copy some data to it. I will create 3 additional single files, each of which is about 45GB in size. Yes, these are large single files, and they are in addition to the 265GB that already exists on the virtual disk. I give you all these details because file sizes and total allocations matter very much in this test case. For example, deleting a small file may not decrease the allocation, but deleting a larger one might. Keep reading to observe these variances.

Copy operation complete, here’s where we are:

image

…but more importantly the individual disk level allocations:

Remember our purpose: we want to remove Disk3 (92.5GB) from this virtual disk and have the virtual disk remain healthy. The scenario behind the removal is that the underlying NTFS file system utilization went down, and we want to re-purpose some physical disks elsewhere.

The first test I’m going to do is to hard-delete one of the 3 files I added and review how allocations and disk space change, if at all:

After this operation, let’s see if anything changed in Server Manager. As you can see, allocation has dropped from 383GB to 351GB.

image

Note that I deleted a file 45GB in size, yet the allocation dropped by only 383-351=32GB. It’s hard to reverse engineer the math here, but the space appears to be reclaimed in chunks.

Before I forget, let me include disk level distribution (although it’s very possible this would vary based on which file I have deleted – so don’t read too much into utilization of individual disks)

So let’s try something else. I’m going to delete an 8GB file and see what happens. The theory I’m testing is that if allocation trimming happens in chunks, there must be a threshold, and I wonder when I will hit it.

Let’s see what Server Manager shows now.

image

As suspected – no change!

At this point let me introduce two bullet points provided to me by Nandu:

There are two ways in which space reclamation occurs:

  1. The file system sends down TRIMs as soon as it has released the allocation. If the TRIM is for a large enough region (the slab granularity; in the Spaces case, it is 256MB), the space will be immediately released and will be reflected in the AllocatedSize property of the physical disks, and the FootprintOnPool and AllocatedSize properties of the virtual disk. You can observe this by deleting a single large file.
  2. The file system sends down TRIMs, but no single TRIM covers an entire slab, so the driver is unable to release the allocation and nothing changes.
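
To make the two cases concrete, here is a minimal Python sketch of the rule as described above. It is only a toy model, not the Storage Spaces driver: the function name `slabs_released_by_trim` and the byte offsets are hypothetical, and it assumes slabs are aligned to multiples of the slab size. The only figure taken from the text is the 256MB slab granularity.

```python
SLAB_BYTES = 256 * 1024 * 1024  # slab granularity quoted above (256MB)
MB = 1024 ** 2
GB = 1024 ** 3


def slabs_released_by_trim(offset, length, slab_bytes=SLAB_BYTES):
    """Count the whole slabs covered by one TRIM of [offset, offset + length).

    Toy model (assumption): a slab is released only if a single TRIM
    covers it entirely; partially covered slabs stay allocated.
    """
    if length <= 0:
        return 0
    end = offset + length
    first_boundary = -(-offset // slab_bytes)  # round offset up to a slab boundary
    last_boundary = end // slab_bytes          # round end down to a slab boundary
    return max(0, last_boundary - first_boundary)


# Case 1: one contiguous 45GB TRIM that happens to start mid-slab.
contiguous = slabs_released_by_trim(offset=100 * MB, length=45 * GB)

# Case 2: 8GB freed as 64 separate 128MB extents, none covering a full slab.
fragment_offsets = [i * 512 * MB + 64 * MB for i in range(64)]
fragmented = sum(slabs_released_by_trim(o, 128 * MB) for o in fragment_offsets)

print(contiguous, fragmented)
```

In this model the contiguous TRIM releases 179 slabs (a little under 45GB, because the partial slabs at each end stay allocated), while the fragmented 8GB releases none. That is at least consistent with a large delete giving back less than its full size and a smaller delete giving back nothing.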

Even though 8GB is far larger than a single 256MB slab, nothing was released, so I must be hitting condition 2: presumably no single TRIM covered an entire slab.

For that condition, we need to use “Optimize-Volume” PowerShell cmdlet with some special parameters. Like this:

In my case, however, it didn’t help. Allocation is still showing 351GB. Let me delete more files and try this again:

Now the Server Manager is showing…

image

337GB allocated. I have deleted exactly 38GB worth of files. 351GB (previous allocation) – 337GB = 14GB reclaimed. Why the difference?

At this point, we have about 13GB from the prior delete operation and 38-14=24GB from the last delete operation waiting to be reclaimed. Let’s see if Optimize-Volume works differently this time:

Looks like there are too few slabs for Optimize-Volume to care enough to do something about them.

Defragmentation Attempt and Observations

At this point I will attempt a good old defrag and observe – I have a strange feeling about this. I ran:

As you can see, we’re 9% into the defrag operation (it’s been about 30 minutes since it started).

Let’s see how my virtual disk allocation is doing (prepare for a surprise)

image

What?! It actually increased. I have not added any new files; I am just letting defrag do its thing. It went from 337GB to 345GB. Now the system owes us:

  • 13GB when I deleted a 45GB file and it reclaimed only 32GB of it.
  • 24GB when I deleted 38GB worth of files and it reclaimed only 14GB of it.
  • 8GB of “defrag” overhead only when 9% into defragmentation.
  • Total: 13+24+8 = 45GB system owes us (i.e. we should be able to reclaim)
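
The tally above is simple arithmetic, written out here as a small Python sketch. All figures are the ones reported in this post; the variable names are only for illustration.

```python
# (description, GB deleted, GB actually reclaimed) from the observations above
deletions = [
    ("45GB file", 45, 383 - 351),
    ("38GB of files", 38, 351 - 337),
]

# Space that was deleted at the file system level but is still allocated
owed = [(name, deleted - reclaimed) for name, deleted, reclaimed in deletions]

# Allocation growth observed 9% into the defrag run
defrag_growth = 345 - 337

total_owed = sum(gb for _, gb in owed) + defrag_growth

for name, gb in owed:
    print(f"{name}: {gb}GB still allocated")
print(f"defrag overhead: {defrag_growth}GB")
print(f"total: {total_owed}GB")
```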

Let’s highlight this observation: Defragmenting a volume created inside a “thinly provisioned virtual disk” could increase the allocation footprint of the virtual disk.

What I just said is really bizarre and I will research it to figure out what’s going on, or if the defrag-time bloat is temporary.

<waiting till defrag finishes. I’ll note allocation size and defrag % as I check the status occasionally>

Defrag percent    Allocated Size on Virtual Disk
~10%              346GB
~11%              347GB
~12%              349GB
~13%              352GB
~14%              367GB

>>I have interrupted the defrag because the allocation is climbing quickly while we’re still below 15% completion. At this rate, the footprint could reach 100% of the provisioned capacity. I’ll learn more about this and report back.
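
To put a rough number on “at this rate”, here is a short Python sketch over the readings noted above. It assumes the allocation was 337GB when the defrag started, treats the approximate percentages as exact, and uses a naive straight-line projection; real behaviour may well differ, and the provisioned size of the virtual disk is not part of the calculation.

```python
# (approximate defrag percent, allocated GB) as noted in this post
readings = [(0, 337), (9, 345), (10, 346), (11, 347), (12, 349), (13, 352), (14, 367)]

# Growth in GB between consecutive readings, keyed by the later percentage
deltas = [(p2, gb2 - gb1) for (_, gb1), (p2, gb2) in zip(readings, readings[1:])]

start_pct, start_gb = readings[0]
last_pct, last_gb = readings[-1]
avg_gb_per_pct = (last_gb - start_gb) / (last_pct - start_pct)

# Naive linear projection to 100% completion (an assumption, not a measurement)
projected_at_100 = start_gb + avg_gb_per_pct * 100

print(deltas)
print(round(avg_gb_per_pct, 2), round(projected_at_100))
```

The per-step growth is not steady: the last reading alone adds 15GB, so even the straight-line figure of roughly 551GB may be on the low side.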

Let me re-run the previous optimize-volume command and see what happens:

Not good. Meanwhile physical disks are all still active as if they are continuing to do defrag. I’ll wait a while and let them stabilize a bit.

<I decided to let the system do its thing for a few hours>

Disks have eventually idled and when I re-tried the above command, it behaved better, as in:

Slab Consolidation part is taking a short while, reached about 35% in 5 minutes. Let’s recap what we have so far while waiting:

  • Traditional defrag on a thin-provisioned volume significantly increasing the volume’s actual physical footprint. Pending further research but for now I am inclined to say it’s not a good idea to defrag a volume that sits on a thin provisioned virtual disk.
  • Optimize-volume command could error out if you run it immediately after interrupting a traditional defrag process.
  • Disks continue to be active even after you interrupt the defrag process, still causing optimize-volume to fail. I have not been able to measure exactly how long after interruption of defrag the disks reach idle state.
  • Slab consolidation process has some minimum slab count in its mind, and will refuse to reclaim space if the changes are too small.
  • Only after I reached a significant level of file deletions, optimize-volume decided to actually do slab consolidation.

Ok. While I was typing this up, slab consolidation finished. Here is how the final lines look:

Alright… As you can see, the system owed us 45GB of space from deleted files. It was also showing a slab with the potential to be purged, but declining to process it based on its assessment of “too few slabs”. The result is within my expectations: allowing for rounding errors, I believe I have been able to reclaim all the space I was expecting to reclaim.

Here’s how the Server Manager looks:

image

Next up is “Hot Spare”. Continue on Part 6.

8 replies »

  1. I have a large folder with many subfolders and files, with over 3.5 million files. about 193 GB. I selected the top folder and I hit delete. then I decided that I want to cancel the delete, then I emptied the recycling bin. Then the folder disappeared from the drive. but 193GB still unaccounted on the drive of Raid 10 7.28 TB. nothing will let me release these “hidden unknown files”.

    • Hmm.. I’m not sure about the interruption of the delete action, however, yes, deletions may not release free space immediately, depending on the circumstances.

      try issuing get-virtualdisk | fl, and also mention your OS version.

  2. I just finished installing it and testing the hard drives when I tried deleting the large folder. I have no virtual drives yet, only physical.
    the OS is Windows Storage 2012 R2 Workgroup Edition.

  3. Here’s the stat on the drive:

    7628270 MB total disk space.
    203696368 KB in 3571368 files.
    832208 KB in 30541 indexes.
    0 KB in bad sectors.
    3950687 KB in use by the system.
    65536 KB occupied by the log file.
    7424677 MB available on disk.

    4096 bytes in each allocation unit.
    1952837375 total allocation units on disk.
    1900717560 allocation units available on disk.

    As you can see, CHKDSK shows 3571368 files using 200 or so GB, but when you select all folders on the drive and check properties, you only see about 12.5 GB of files. So where are those files sitting and how can the space be released?

    • I’m not sure how knowledgeable you are on general file system operations. For example, have you considered all of the hidden and system files? Try enabling Folder Options such that it shows all of system and hidden files; then see if you can make sense of what’s going on. Dedupe related disk space issues are typically not visible to chkdsk; since chkdsk is seeing your files, I’m inclined to think they are there, perhaps as hidden or system.

  4. >Traditional defrag on a thin-provisioned volume significantly increasing the volume’s actual physical footprint.
    Combine what you (should) know about defrag and thin provisioning and it is obvious why…
    Defragmenting means moving files taking up parts of different slabs into a new slab, which of course has to be assigned first, so while defragmenting the footprint HAS to increase.

    >Optimize-volume command could error out if you run it immediately after interrupting a traditional defrag process.
    >Disks continue to be active even after you interrupt the defrag process, still causing optimize-volume to fail. I have not been able to measure exactly how long after interruption of defrag the disks reach idle state.
    Just closing the Powershell does NOT abort anything you have commanded the system to do, so the defrag process keeps running, but – missing the linked window – without a report. So trying to consolidate in this state HAS to fail…

    >Slab consolidation process has some minimum slab count in its mind, and will refuse to reclaim space if the changes are too small.
    Yes. And sadly no documentation has mentioned a way to force it even on 1 slab…

    >Only after I reached a significant level of file deletions, optimize-volume decided to actually do slab consolidation.
    And in the process will then release the partially used slabs previously freed up by defrag getting you the end result of the allocated space being (nearly) as big as actual volume usage…
