Scott over at EMC recently posted his thoughts about deduplication ratios and how they vary widely. I agree with his assessment that compression ratios, change rates and retention are key ingredients in deduplication ratios. However, he makes a global statement, “If you don’t know those three things, you simply cannot state a deduplication ratio with any level of honesty….It is impossible”, and uses this point to suggest that SEPATON’s Exchange guarantee program is “ridiculous”. Obviously the blogger, being an EMC employee, brings his own perspectives as do I, a SEPATON employee. Let’s dig into this a bit more.
As the original author mentioned, the key metrics for deduplication include compression, change rate and retention. Clearly these vary by data type; however, certain data types provide more consistent deduplication results. As you can imagine, these are applications that are backed up in full every night, have fixed data structures and have relatively low data change rates. Some examples include Exchange, Oracle, VMware and others.
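To make the interplay of those three metrics concrete, here is a rough back-of-the-envelope sketch. It is my own simplified model, not SEPATON's or anyone else's sizing formula: it assumes a full backup every night, a constant daily change rate, changed data that is all new and unique, and a single compression ratio applied to whatever unique data is stored. The function name and parameters are hypothetical and exist only for illustration.

```python
def estimated_dedup_ratio(retained_fulls: int,
                          daily_change_rate: float,
                          compression_ratio: float) -> float:
    """Estimate a deduplication ratio under a deliberately simple model.

    Assumptions (illustrative only):
      - one full backup per night, all of equal logical size
      - each night a fixed fraction of the data is new and unique
      - unique data is compressed at a constant ratio before storage
    """
    if retained_fulls < 1:
        raise ValueError("retained_fulls must be at least 1")
    if not 0.0 <= daily_change_rate <= 1.0:
        raise ValueError("daily_change_rate must be between 0 and 1")
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")

    # Logical data: every retained full, measured in units of one full backup.
    logical = float(retained_fulls)

    # Unique data: the first full, plus the changed fraction of each later full.
    unique = 1.0 + (retained_fulls - 1) * daily_change_rate

    # Physical data: unique data after compression.
    physical = unique / compression_ratio

    return logical / physical


if __name__ == "__main__":
    # 30 retained nightly fulls, 2% daily change, 2:1 compression.
    ratio = estimated_dedup_ratio(30, 0.02, 2.0)
    print(f"Estimated deduplication ratio: {ratio:.1f}:1")
```

Under this model the ratio works out to retained fulls times compression, divided by one plus the accumulated change. Longer retention and lower change rates push the ratio up, which is why nightly fulls of slowly changing data behave so predictably. Real systems will differ, so treat the numbers as directional only.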
In the case of the guarantee, we focused on Exchange. The original author suggested that every Exchange environment is so radically different that it is impossible to guarantee anything. I disagree. Suggesting that there is no commonality in protecting Exchange data is crazy, just like suggesting that every Exchange implementation is exactly the same. In practice, Exchange data protection strategies vary less than you might think. Most customers perform full backups of the information store nightly. They also have the option of mailbox-level backups, but the slow performance of this backup scheme limits its practicality. The biggest variance in Exchange backups relates to backup retention, and this is directly addressed in the guarantee program.
The SEPATON guarantee program focuses on Exchange because of the points mentioned above and our real-world customer experiences. Of course, there are some situations where the guarantee will not apply. However, our program is unique; we are willing to stand by our guarantee and will provide the end user with a free hardware upgrade if the ratio is not realized. Contrary to the blogger’s analysis, our experience has been very positive and we have had no problems achieving the stated ratio.
The original author and I have philosophical differences which relate to our perspectives. He believes that you can never guarantee any deduplication ratio since all data is so different. Perhaps this perspective is skewed by limitations in Quantum’s technology? At SEPATON, our real-world experience has shown that it is entirely reasonable to set or even guarantee deduplication ratios for specific data types. In the end the market will decide. The EMC blogger states that we are crazy, while our real-world experience suggests otherwise. You decide which is more credible.