2012 deduplication savings

Now that Windows Server 2012 has been out for a while, I'm starting to see a bit more detail on certain features… in this case, file deduplication.

I applied file deduplication to a drive we call our “SOE” drive, which contains all the MS products, pre-prepared packages for SCCM, VHD (and VHDX) libraries and so on. Server Manager reported a dedup rate of 42%, but the folder properties showed 8.5GB on disk out of 357GB in the folder. Quite impressive… but also not 42%, so where's the gap?

So, when I ran out of NAS space, I purchased a stand-alone 3TB drive and put it in a server, for storing “less important” data that had already been compressed using a combination of 7-Zip and RAR.

As deduplication doesn't do much for already-compressed files, I started decompressing to see which is better: compression or deduplication. (I realise this depends on file types etc., but this data was a mix of ISOs, documents, executables and so on.)

I had used approx 2.2TB out of 2.72TB available while the files were compressed.

Once many of the folders were decompressed, I was, according to the properties of the folder, storing 2.16TB of data and only using 151MB on disk. That's quite impressive, but the properties of the disk told a different story: 70GB free.

So where has all the space gone?

Using one of my favourite tools, TreeSize by JAM Software, I had a quick look. Under “System Volume Information\Dedup” there was now 2.6TB of data. That's right, the dedup folder was larger than the original data itself!
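If you don't have TreeSize handy, the same check can be roughed out in a few lines of Python. This is only a minimal sketch: the function names are made up for illustration, the drive letter in the commented usage is hypothetical, and on a real server you would need to run it with enough rights to read “System Volume Information” (unreadable entries are simply skipped, so the total may come out low). It also sums logical file sizes, which for large container files should be close to, but not guaranteed equal to, the space actually allocated.

```python
import os
import shutil


def directory_size_bytes(root):
    """Sum the logical sizes of all files under root, skipping unreadable entries."""
    total = 0
    # onerror swallows permission errors on directories we cannot list
    for dirpath, _dirnames, filenames in os.walk(root, onerror=lambda err: None):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total += os.lstat(path).st_size
            except OSError:
                # File vanished or access denied: ignore and keep going
                continue
    return total


def volume_report(volume_root, dedup_folder):
    """Compare what the volume says is used/free with the size of the dedup folder."""
    usage = shutil.disk_usage(volume_root)
    return {
        "used_bytes": usage.used,
        "free_bytes": usage.free,
        "dedup_folder_bytes": directory_size_bytes(dedup_folder),
    }


# Hypothetical usage on a deduplicated volume mounted as D: (assumed drive letter):
# report = volume_report("D:\\", "D:\\System Volume Information\\Dedup")
# print({key: round(value / 1024**3, 1) for key, value in report.items()})  # in GB
```

The point of putting the two numbers side by side is that the volume's used space, not the folder's “size on disk”, is what the dedup folder counts against.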

Going back to my SOE drive, I discovered the dedup folder was 208GB, so my real “saving” on the SOE drive was only around the 148GB mark, not the 340-ish GB the folder properties showed. Still a good saving, but unfortunately nowhere near what that interface claims, and in line with the 42% presented within Server Manager.
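To make the arithmetic explicit, here is a small sketch using only the figures quoted in this post. The `saving` helper is a made-up name, and it follows the same simplification as above: the real saving is taken as the folder size minus the dedup folder size, ignoring the small amount the folder properties still report as on disk.

```python
def saving(logical, physical):
    """Return (absolute saving, saving as a fraction of the logical size)."""
    if logical <= 0:
        raise ValueError("logical size must be positive")
    saved = logical - physical
    return saved, saved / logical


# SOE drive, figures in GB
soe_apparent = saving(357, 8.5)   # what the folder properties imply
soe_real = saving(357, 208)       # counting the dedup folder instead

# 3TB drive, figures in TB
archive_real = saving(2.16, 2.6)  # negative: dedup cost space here

for label, (saved, rate) in [
    ("SOE apparent (GB)", soe_apparent),
    ("SOE real (GB)", soe_real),
    ("3TB drive real (TB)", archive_real),
]:
    print(f"{label}: saved {saved:.2f} ({rate:.1%})")
```

On these numbers the SOE drive comes out at about 149GB saved, or roughly 41.7%, which is where the 42% in Server Manager comes from, while the folder properties imply nearly 98%. The 3TB drive comes out negative, at roughly 20% more space used than the data itself.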

Anyhoo, to sum up: in order to see the real savings file dedup is providing, have a look in Server Manager, or have a look at the “System Volume Information\Dedup” folder on the volume. And understand that for some volumes (depending on file type) there might not only be no savings, dedup may actually increase your disk usage.

Dedup is something that is cool to get “for free” (I imagine most people reading this blog will have EAs with Software Assurance) and will be quite useful in a number of situations. But, as usual, the Microsoft sales machine is making it sound like the coolest thing that has ever happened (and claiming ludicrous savings: http://blogs.technet.com/b/filecab/archive/2012/05/21/introduction-to-data-deduplication-in-windows-server-2012.aspx). So, in the words of Public Enemy, don't believe the hype.