Keeping it Factual

I periodically peruse the blogosphere looking for interesting articles on storage, data protection and deduplication. As you can imagine, blog content varies from highly product centric (usually from vendors) to product agnostic (usually from analysts). I recently ran across a post over at the Data Domain blog, Dedupe Matters. This is a corporate blog where it appears that the content is carefully crafted by the PR team and is updated infrequently. Personally, I find canned blogs like this boring. That said, I wanted to respond to a post entitled “Keeping it Real” by Brian Biles, VP of Product Management. As usual, I will be quoting the original article.

A year or more later, Data Domain is scaling as promised, but the bolt-ons are struggling to meet expectations in robustness and economic impact.

When the author refers to Data Domain scalability, he is missing a major point. To this day, the company’s biggest and fastest single appliance scales to 380 MB/sec. Any large enterprise customer that looks at the solution will laugh, because it would require them to purchase and manage so many units. (Note that I exclude the DDX, which is essentially a screen-scraped GUI for multiple units; it is not a real appliance, since it still comprises 16 separate devices with separate performance metrics and deduplication domains.) I am sure that the author is referring to the scalability required in SMB/SME markets, which has been Data Domain’s historic strength.
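To see why the 380 MB/sec ceiling matters at enterprise scale, here is a back-of-the-envelope sketch. The throughput figure comes from the discussion above; the dataset size and backup window are illustrative assumptions of mine, not vendor numbers.

```python
# Back-of-the-envelope: how many standalone appliances would an
# enterprise backup need? Only the 380 MB/sec figure comes from the
# post; the dataset size and window below are assumed for illustration.

APPLIANCE_MBPS = 380   # single-appliance ingest rate (from the post)
DATASET_TB = 100       # hypothetical nightly backup set (assumption)
WINDOW_HOURS = 8       # hypothetical backup window (assumption)

dataset_mb = DATASET_TB * 1024 * 1024
required_mbps = dataset_mb / (WINDOW_HOURS * 3600)
appliances = -(-required_mbps // APPLIANCE_MBPS)  # ceiling division

print(f"Aggregate throughput needed: {required_mbps:.0f} MB/sec")
print(f"Separate appliances (and dedupe domains) needed: {int(appliances)}")
```

Under these assumed numbers, the math works out to roughly ten separate boxes, each with its own deduplication domain, which is the management burden the paragraph above is pointing at.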

Finally, Data Domain’s strategy for solving the performance bottleneck above will be a clustered solution of multiple boxes, which brings us to the ultimate irony: Data Domain needs a bolt-on solution to meet enterprise requirements! Their solutions are simply not designed for enterprise performance or capacity, so they have to resort to bolt-ons to overcome their own limitations.

A post-process system will fill up much faster than expected and/or get controller-bound and slow if it gets too far behind… Data Domain systems are easy: with inline dedupe, the only throughput is dedupe throughput.

He is making generic assumptions about the speed of post-process deduplication in the first sentence. This gross generalization certainly does not apply to SEPATON’s DeltaStor software, and I doubt that it applies to most other post-process solutions either. Perhaps he is really referring to an individual vendor’s solution (Quantum?). If so, why not just say so instead of making a bogus generalization?

The second point is a classic DD misdirection. They love to point out that with inline dedupe you only need to worry about one throughput measure. I respectfully disagree. What really matters is protecting the data. If it takes you 21 hours to back up and dedupe with a Data Domain solution, versus 2 hours to back up to an S2100-ES2 and 5 more to dedupe it, which approach is safer for your data? For an SMB with limited requirements, inline is simple; in an enterprise environment with large backups, however, there is little benefit to inline processing. Data Domain typically promotes how much better they think inline is, but what really matters is data protection and restoration. You must get your data protected as rapidly as possible to ensure a secure copy is available for immediate restore.
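The "time until a restorable copy exists" argument above can be sketched numerically. The throughput figures below are illustrative assumptions I chose so that the results mirror the 21-hour versus 2-hour example in the text; they are not measured numbers from either vendor.

```python
# Sketch of the time-to-protected-copy comparison. All numbers are
# illustrative assumptions chosen to mirror the 21-hour vs 2-hour
# example in the text, not measured vendor figures.

DATASET_TB = 28            # hypothetical backup set (assumption)
inline_ingest_mbps = 380   # inline: backup and dedupe happen together
raw_ingest_mbps = 4000     # post-process VTL raw ingest (assumption)

dataset_mb = DATASET_TB * 1024 * 1024
inline_hours = dataset_mb / inline_ingest_mbps / 3600
post_process_protected_hours = dataset_mb / raw_ingest_mbps / 3600

print(f"Inline: restorable copy after {inline_hours:.1f} h")
print(f"Post-process: restorable copy after "
      f"{post_process_protected_hours:.1f} h (dedupe runs afterward)")
```

The point of the sketch is that with post-process deduplication the data is already safely landed and restorable after the fast raw ingest; the later dedupe pass affects capacity efficiency, not the protection window.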

As expected, Data Domain’s post is highly centered on their products and their view of the world. Here I have highlighted some of the areas where their claims are misleading. As always, customers must do the research to ensure that they are choosing the right solution for their environment.

2 replies on “Keeping it Factual”

February 3, 2009

It seems to me that MDM is a technology-focused activity that should be replaced by a Business Architecture approach. The problem is that MDM is focused on the data store but not on the use of the data in the programs, the service interfaces, the business rules, the business content, and the user presentation. A single enterprise ontology of business objects that implements a well-defined taxonomy would then provide the data model for storage and archiving. Phew, yup, there are quite a few technology chasms to cross. Today a change in the MDM requires further changes in many other places that have to be synchronized in deployment. Anyone surprised that we don’t have agility? SOA is not going to change that. Alternatively, you could work with the Papyrus WebRepository and get all that for free.
