A while back I wrote a post about data deduplication in 2012…. generally a very good feature, but as that post discussed, I had a collection of .iso and compressed data where not only did dedup not save me anything, it actually used up more space in the dedup folder than the original data size (which I found a little odd).
Today I got around to doing something about this, and found:
- Disabling data deduplication (via the GUI or PowerShell) only stops further deduplication from occurring – data that has already been deduplicated remains deduplicated
- In order to “move” (re-hydrate?) the data back into the original files and out of the deduplication store, use the PowerShell command start-dedupjob -Volume <VolumeLetter> -Type Unoptimization
- You can check on where this is at by using get-dedupjob, or – as I prefer – using TreeSize, which shows the size on disk of specific files…. including the deduplication chunks
- At this stage I noticed the original files getting bigger, but the dedup store (and the chunks within it) had not decreased at all…. “maybe there’s another command for this?” I thought….
- There were two additional job types available, “GarbageCollection” and “Scrubbing”. Unfortunately, neither the PowerShell help nor the TechNet documentation actually states what either of these does! After a bit of searching, I found this page http://www.infotechguyz.com/WindowsServer2012/DedupandWindowsServer2012.html which specifies that GarbageCollection will find and remove unreferenced chunks, and Scrubbing will perform an integrity check…. so with this knowledge I then ran
- start-dedupjob -Volume <VolumeLetter> -Type GarbageCollection – only to find that this command can only be run when dedup is enabled!
- In order to get around this, I re-enabled dedup but excluded all folders on the drive; I also removed all the schedules/background optimisation settings…. then re-ran the command
- Initially the size of the dedup folder increased by approx 100MB (keep in mind the dedup folder at this stage was 2.2TB), but soon the get-dedupjob status seemed to stop at 50% and the size of the dedup folder started coming down quite quickly, in 1GB chunks (the chunks seem to have a max size of 1GB)
- Once this completed (it took a while) I disabled dedup again and all was good – a rough PowerShell sketch of the whole process is below
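For anyone wanting to follow the same steps end to end, here is a minimal PowerShell sketch of the process described above. The D: volume letter, the excluded folder names and the schedule handling are placeholders/assumptions for illustration – adjust them to suit your environment.

# 1. Re-hydrate the deduplicated data back into the original files
#    (dedup must still be enabled on the volume at this point)
Start-DedupJob -Volume "D:" -Type Unoptimization

# 2. Keep an eye on progress (watching the chunk store with TreeSize works too)
Get-DedupJob
Get-DedupStatus -Volume "D:"

# 3. Once re-hydration has finished, disable dedup on the volume
Disable-DedupVolume -Volume "D:"

# 4. GarbageCollection only runs while dedup is enabled, so re-enable it,
#    exclude the folders on the drive (folder names here are hypothetical)
#    and turn off the background/scheduled jobs
Enable-DedupVolume -Volume "D:"
Set-DedupVolume -Volume "D:" -ExcludeFolder "D:\Data","D:\ISOs"
Get-DedupSchedule | ForEach-Object { Set-DedupSchedule -Name $_.Name -Enabled $false }

# 5. Remove the now-unreferenced chunks from the chunk store
Start-DedupJob -Volume "D:" -Type GarbageCollection

# 6. When the garbage collection job completes, disable dedup again
Disable-DedupVolume -Volume "D:"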
Just to be clear, 2012 deduplication is still a good technology – and I use it elsewhere with great results – it’s just that every now and again you will run into a dataset which it just does not agree with…. and disabling it completely just isn’t intuitive…. (and yes, all this probably could have been avoided by running the dedup estimator tool – but then I wouldn’t have learnt stuff – so there’s no fun in that!) hence why I thought I would write the above…. hope it helps someone.
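For reference, the estimator I’m referring to is DDPEval.exe, which ends up in \Windows\System32 once the Data Deduplication feature is installed – a minimal example run against a hypothetical folder would look something like:

# Estimate potential dedup savings for a folder (the path is just an example)
C:\Windows\System32\DDPEval.exe E:\ISOs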
Thanks a lot for this post, I’ve just had to do this with a volume in 2012 R2 and would have been scratching my head without the Unoptimization command.
Reblogged this on The Lonely DBA and commented:
ok, it is always good to know how to turn off things. So here it is: how to reclaim space on deduplicated drives.
Is data still accessible during the rehydration? Would hate to start the job and cut our engineers off for a few hrs
It was for me…. however, mine was test data and the disk was not under any other load….
While the process doesn’t take the data offline, purely from a performance point of view I would suggest performing any “re-hydration” outside of business hours.
Thank you – been having a tough time figuring out the same issue you had
No worries mate, glad it helped
Hi, found this article after losing 2 TB worth to the System Information folder – still have 900 GB in the “chunk store”. Any ideas how I can recover this?
Should mention this is on a Win 8.1 machine running as just a media store. I also have 30GB worth in the dedup logs – can I safely remove these?
Hi James,
I’m not sure we are talking about the same thing here…. this article is about Server 2012 deduplication, a feature not available on Windows 8.1… while I have seen some hacks to get it working on Windows 8.1, not only is it not supported, but I wouldn’t “trust” it….
Next up, I’m not quite clear on what you mean by “losing 2 TB worth to the system information folder”?
Thanks, the GarbageCollection job type saved my life!