Backup Deduplication Replication

Deduplication ratios and their impact on DR cost savings

There is an interesting blog discussion between Dipash Patel from CommVault and W. Curtis Preston from Backup Central and TruthinIT regarding the increasing or decreasing benefits of deduplication ratios. They take different perspectives on the benefits of increasing deduplication ratios and I will highlight their points and add an additional one to consider.

Patel argues that increasing deduplication ratios beyond 10:1 provides only a marginal benefit. He calculates that going from 10:1 to 20:1 yields only a 5% increase in capacity efficiency. He adds that vendors who suggest that a doubling in deduplication ratios will result in a doubling of cost savings are using a “sleight of hand.” He makes an interesting point, but I disagree with his core statement that increasing deduplication ratios beyond 10:1 provides only marginal savings.
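To make the arithmetic behind the 5% figure concrete, here is a small sketch. It assumes the usual reading of a ratio of N:1, namely that 1/N of the original data remains on disk; the function names are my own and purely illustrative. The same step from 10:1 to 20:1 can be read two ways: capacity savings rise by 5 percentage points, while the disk actually consumed is cut in half. Which of those two numbers matters more is what the discussion is about.

```python
def stored_fraction(ratio: float) -> float:
    """Fraction of the original data left on disk at a ratio of `ratio`:1."""
    return 1.0 / ratio


def capacity_savings(ratio: float) -> float:
    """Fraction of the original capacity that deduplication saves."""
    return 1.0 - stored_fraction(ratio)


if __name__ == "__main__":
    low, high = 10, 20

    # Reading 1: the change in savings, in percentage points.
    delta_savings = capacity_savings(high) - capacity_savings(low)

    # Reading 2: the change in disk actually consumed.
    disk_ratio = stored_fraction(high) / stored_fraction(low)

    print(f"Savings at {low}:1 = {capacity_savings(low):.0%}")
    print(f"Savings at {high}:1 = {capacity_savings(high):.0%}")
    print(f"Increase in savings = {delta_savings:.0%} (percentage points)")
    print(f"Disk needed at {high}:1 relative to {low}:1 = {disk_ratio:.0%}")
```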


Global Deduplication Explained

W. Curtis Preston recently authored an article explaining global deduplication. This is an important topic which frequently causes confusion. Curtis does a good job explaining the technology and what it means to end users, and I recommend the article.

A quick summary is that global deduplication means that a common deduplication repository is shared by multiple nodes in a system. In these environments, a customer can back up their data to any node on a system and it will be deduplicated against related data. This provides improved ease of use and scalability.
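The idea of a shared repository can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the class names, the fixed-size chunking and the use of SHA-256 fingerprints are not taken from Curtis's article or from any vendor's design. The only point is that two nodes writing into one repository store a common chunk once.

```python
import hashlib


class SharedRepository:
    """Hypothetical chunk store shared by every node in the system."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes

    def store(self, chunk: bytes) -> str:
        fingerprint = hashlib.sha256(chunk).hexdigest()
        # Keep the chunk only if this fingerprint has not been seen before.
        self.chunks.setdefault(fingerprint, chunk)
        return fingerprint


class Node:
    """Hypothetical backup node that writes into the shared repository."""

    def __init__(self, name: str, repository: SharedRepository):
        self.name = name
        self.repository = repository

    def backup(self, data: bytes, chunk_size: int = 4) -> list:
        # Fixed-size chunking keeps the sketch short; it is an assumption.
        return [
            self.repository.store(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)
        ]


if __name__ == "__main__":
    repo = SharedRepository()
    node_a = Node("node-a", repo)
    node_b = Node("node-b", repo)

    node_a.backup(b"AAAABBBBCCCC")
    node_b.backup(b"AAAABBBBDDDD")  # shares two chunks with node-a's backup

    print(f"Chunks written: 6, unique chunks stored: {len(repo.chunks)}")
```

If each node kept its own repository instead, the two shared chunks would be stored twice, once per node.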

Deduplication Restore

CommVault and Forward Referencing

I was recently reading this document from CommVault that highlights their deduplication technology and was surprised by their use of the term “forward referencing”. Forward referencing is a common term in deduplication with a generally agreed-upon definition. CommVault appears to have redefined the term and promoted their version as a feature. This is confusing and possibly misleading because a reader might not realize that the definition of “forward referencing” in this document is completely different from the one used everywhere else in the industry.

Deduplication Restore

Defragmentation, rehydration and deduplication

W. Curtis Preston recently blogged about The Rehydration Myth. In his post he discusses how restore performance on deduplicated data declines because of the method used to reassemble the fragmented deduplicated data on disk. He also addresses the ways various technologies attempt to overcome these issues, including disk caching, forward referencing (used by SEPATON’s DeltaStor technology) and built-in defrag. In this post I want to discuss the last option because it is a widely used approach for inline deduplication that has some little-known pitfalls.

Backup Deduplication Restore

W. Curtis Preston on physical tape

W. Curtis Preston recently wrote an article on the state of physical tape for SearchDataBackup. He talks about the technologies that backup software vendors have created to stream tape drives more effectively. As I posted before, if you cannot stream your tape drives, their performance will decline dramatically.

In enterprise environments, performance is the key driver of data protection. You must ensure that you can back up and recover massive amounts of data in prescribed windows, and tape’s inconsistent performance and complex manageability make it difficult to use as a primary backup target. This fact can also make tape a challenging solution in small environments.

The problem with tape drive streaming is a common one and Preston agrees that it is the key reason for adopting disk-based backup technologies. Our customers typically see a dramatic improvement in performance with SEPATON’s VTL solutions since they are no longer limited by the streaming requirements of tape.

Even with new disk and deduplication technologies, most customers are still using tape today and will do so into the future. However, tape will likely be used more for archiving than for secondary storage. Deduplication enables longer retention, but most customers are probably not going to retain more than a year of data online. Tape is a good medium for deep archive where you store data for years, but it is complex and costly as a target for enterprise backup.

Deduplication Restore

Restore Performance

Scott from EMC posted about the EMC DL3D 4000 today. He was responding to some questions by W. Curtis Preston regarding the product and GA. I am not going to go into detail about the post, but wanted to clarify one point. He says:

Restores from this [undeduplicated data] pool can be accomplished at up to 1,600 MB/s. Far faster than pretty much any other solution available today, from anybody. At 6 TB an hour, that is certainly much faster than any deduplication solution.
(Text in brackets added by me for clarification)

As recently discussed in this post, SEPATON restores data at up to 3,000 MB/sec (11.0 TB/hr) both with deduplication enabled and disabled. Scott insinuates that only EMC is capable of the performance he mentions and I wanted to clarify for the record that SEPATON is almost twice as fast as the fastest EMC system.
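The throughput figures above can be checked with a quick conversion. This is a sketch that assumes decimal units (1 TB = 1,000,000 MB); the function name is my own. Under that assumption, 1,600 MB/s works out to about 5.76 TB/hr and 3,000 MB/s to about 10.8 TB/hr, which match the quoted 6 and 11.0 TB/hr after rounding.

```python
SECONDS_PER_HOUR = 3600
MB_PER_TB = 1_000_000  # decimal units; an assumption for this sketch


def mb_per_s_to_tb_per_hr(mb_per_s: float) -> float:
    """Convert a throughput in MB/s to TB/hr."""
    return mb_per_s * SECONDS_PER_HOUR / MB_PER_TB


if __name__ == "__main__":
    emc_mb_s = 1600      # figure quoted from the EMC post
    sepaton_mb_s = 3000  # figure stated for SEPATON

    print(f"{emc_mb_s} MB/s = {mb_per_s_to_tb_per_hr(emc_mb_s)} TB/hr")
    print(f"{sepaton_mb_s} MB/s = {mb_per_s_to_tb_per_hr(sepaton_mb_s)} TB/hr")
    print(f"Ratio = {sepaton_mb_s / emc_mb_s}")
```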

General Marketing

W. Curtis Preston Now with TechTarget

About a week ago, Curtis posted on his blog that he is joining TechTarget as an Executive Editor, which essentially means that he will continue to present at various events. He is still an independent consultant and can keep working on his other projects, including his Mr. Backup Blog and BackupCentral.

In my opinion, this is a great outcome for both TechTarget and Curtis. The Backup/Deduplication schools will benefit from Curtis’s continued tenure as a featured speaker. He is an engaging presenter and provides a balanced perspective. It is also beneficial for Curtis because he is free to pursue his personal and business interests.

A big congratulations to both TechTarget and Curtis!

General recognition

W. Curtis Preston, the author of the Mr. Backup Blog, recently posted an article about the blogs that he frequents. I was honored that he recognized this blog along with blogs from other major vendors.

Curtis mentioned his frustration with the comment filtering policies on some blogs, and I wanted to clarify this blog’s policy. (A synopsis of the policy is contained in the disclaimer in the sidebar.) Comments are not moderated; whatever you post appears on the site instantly. I have little interest in censorship; however, I reserve the right to delete comments containing abuse or personal attacks. I hope I never have to use my power of deletion, but as Uncle Ben said to Peter Parker/Spider-Man:

With great power comes great responsibility.

Now back to regularly scheduled programming…