I recently blogged with my thoughts about EMC acquiring Data Domain, and wanted to follow up with a post discussing some key points about a NetApp/Data Domain merger. Since that last post there have been numerous developments, including EMC suggesting that it might up its offer, the inevitable threat of a class-action lawsuit, Data Domain endorsing the second NetApp offer, and the government initiating an antitrust review. In this context, I want to dissect some key points to consider regarding this acquisition.
Reuters indicates that EMC will up its bid for Data Domain to as much as $35 per share. As previously posted, Data Domain’s products would fit easily into EMC’s product line, replacing EMC’s current Quantum-based appliances. With this increased offer, EMC is increasing the pressure on NetApp and reaffirming its commitment to acquiring Data Domain.
What does this mean?
I was surprised when NetApp offered $1.5B for Data Domain and was even more surprised when EMC countered with an all cash offer of $1.8B. NetApp has since upped their offer to $1.9B of cash and stock. It is in the context of this uncertainty that I wanted to comment on a possible EMC/Data Domain acquisition.
What about EMC’s DL3D product line?
EMC sells target deduplication solutions (the DL3D product line) through a partnership with Quantum. These products compete directly with those from Data Domain and rely on similar technology. (Data Domain disclosed in its own IPO documents that it had licensed Quantum’s deduplication patents.) Even though EMC strengthened its commitment to Quantum by providing a $100 million loan back in March, the Data Domain announcement raises serious questions about EMC’s commitment to Quantum. If Quantum’s technology were really good, why bid almost $2B for a competing technology, especially when EMC could buy Quantum outright for less than half that amount?
Some have suggested that EMC is bidding on Data Domain because they want to hurt NetApp. This is certainly a possibility. However, EMC provided a very strong counter-offer and has to recognize that they may own Data Domain in the end.
NetApp’s initial bid for Data Domain came as a surprise to many. EMC’s counter was even more of a shock. These discussions have very important implications for data protection and deduplication. Two thoughts immediately come to mind:
It’s hard to do deduplication well.
EMC and NetApp say that they have robust deduplication solutions in their DL3D (Quantum technology) and NearStore VTL series products. Before these negotiations, you might have believed them. Now, they are both bidding aggressively on Data Domain. What does that say about their confidence in their own solutions? Remember, these are large companies with hundreds (thousands?) of engineers with storage experience. Why wouldn’t they just build their own deduplication technology? The simple answer is that developing really good, enterprise-class deduplication technology is difficult.
Scott from EMC posted about the EMC DL3D 4000 today. He was responding to some questions by W. Curtis Preston regarding the product and GA. I am not going to go into detail about the post, but wanted to clarify one point. He says:
Restores from this [undeduplicated data] pool can be accomplished at up to 1,600 MB/s. Far faster than pretty much any other solution available today, from anybody. At 6 TB an hour, that is certainly much faster than any deduplication solution.
(Text in brackets added by me for clarification)
As recently discussed in this post, SEPATON restores data at up to 3,000 MB/sec (11.0 TB/hr) with deduplication enabled or disabled. Scott insinuates that only EMC is capable of the performance he mentions, and I wanted to clarify for the record that SEPATON is almost twice as fast as the fastest EMC system.
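For the record, the throughput-to-hourly-rate conversions quoted above are straightforward arithmetic. A quick sanity check, assuming decimal units (1 TB = 1,000,000 MB):

```python
# Convert sustained throughput in MB/sec to TB/hr (decimal units assumed).
def mb_per_sec_to_tb_per_hr(mb_per_sec):
    return mb_per_sec * 3600 / 1_000_000  # 3,600 sec/hr; 1,000,000 MB/TB

print(mb_per_sec_to_tb_per_hr(1600))  # EMC's quoted restore rate: 5.76, i.e. ~6 TB/hr
print(mb_per_sec_to_tb_per_hr(3000))  # SEPATON's quoted rate: 10.8, i.e. ~11 TB/hr
```

The figures line up with the rounded numbers used in the posts above.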
Scott from EMC has challenged SEPATON’s advertised performance for backup, deduplication, and restore. As industry analyst W. Curtis Preston so succinctly put it, “do you really want to start a ‘we have better performance than you’ blog war with one of the products that has clustered dedupe?” However, I wanted to clarify the situation in this post.
Let me answer the questions specifically:
1. The performance data you refer to (the link three words into his post) is both four months old and actually no data at all.
SEPATON customers want to know how much data they can back up and deduplicate in a given day. That is what matters in real-life usage of the product. The answer is 25 TB per day per node. If a customer has five nodes and a twenty-four-hour day, that’s 125 TB of data backed up and deduplicated. This information has been true and accurate for four months and is still true today.
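The per-day math above can be sketched directly; the node count and per-node rate are the figures from the text:

```python
# Aggregate daily ingest for a multi-node system at 25 TB/day per node.
def daily_capacity_tb(nodes, tb_per_node_per_day=25):
    return nodes * tb_per_node_per_day

print(daily_capacity_tb(5))  # five nodes -> 125 TB backed up and deduplicated per day
```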
This article on Byteandswitch.com highlights enhancements to FalconStor’s SIR deduplication platform, but I have to wonder whether anyone cares. FalconStor was a big player in providing VTL software to OEMs, but their deduplication software has been largely ignored.
FalconStor had their heyday in VTL. They aggressively pursued OEM deals with large vendors including EMC, IBM, and Sun. EMC was the most successful with their EDL family of products. As the market moved to deduplication, you would think that FalconStor would be the default OEM supplier of deduplication software as well. You would be wrong.
Ironically, FalconStor’s VTL success was their downfall in deduplication. Their OEMs realized that they were all selling the same VTL software and did not want to repeat the situation with deduplication. EMC and IBM have already announced that they are using alternative deduplication providers.
I have fond memories from my childhood of Rube Goldberg contraptions. I was always amazed at how he would creatively use common elements to implement these crazy machines. By using everyday items for complicated contraptions, he made even the simplest process look incredibly complex and difficult. But that was the beauty of it: no one would ever use the devices in practice; it was the whimsical and complex nature of his drawings that made them so fun to look at.
Image courtesy of rubegoldberg.com
It is in the context of Rube Goldberg that I find myself thinking about the EMC DL3D 4000 virtual tape library. Like Goldberg, EMC has taken an approach to VTL and deduplication that revolves around adding complexity to what should be a relatively simple process. Unfortunately, I don’t think that customers will view the solution with the same whimsical and fun perspective that they did Goldberg’s machines.
You may think that this is just sour grapes from an EMC competitor, but I am not the only one questioning the approach. Many industry analysts and backup administrators are confused and left scratching their heads, just like this author. Why the confusion? Let me explain.
There is an interesting discussion on The Backup Blog related to deduplication and EMC’s DL3D. The conversation relates to performance, and the two participants are W. Curtis Preston, author of the Mr. Backup Blog, and The Backup Blog’s author, Scott from EMC. Here are some excerpts that I find particularly interesting, with my commentary included. (Note that I am directly quoting Scott below.)
VTL performance is 2,200 MB/sec native. We can actually do a fair bit better than that…. 1,600 MB/sec with hardware compression enabled (and most people do enable it for capacity benefits.)
The 2,200 MB/sec figure is not new; it is what EMC specifies on their datasheet. What is interesting is that performance declines with hardware compression enabled; the hardware compression card must be a performance bottleneck. Is a reduction in performance of roughly 27% meaningful? It depends on the environment, and it is certainly worth noting, especially for datacenters where backup and restore performance are the primary concern.
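The reduction is simple arithmetic on the two figures Scott himself quotes:

```python
# Throughput drop on the DL3D when hardware compression is enabled.
native = 2200      # MB/sec, per EMC's datasheet
compressed = 1600  # MB/sec, with hardware compression enabled

reduction = (native - compressed) / native
print(f"{reduction:.1%}")  # -> 27.3%
```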
Scott from EMC, author of The Backup Blog, responded to my previous post on DeltaStor. First, thank you for welcoming me to the world of blogdom. This blog is brand new, and it is always interesting to engage in educated debate.
I do not want this to go down a “mine is better than yours” route. That just becomes annoying and can lead to a fight that benefits no one. I am particularly concerned since Scott, judging by his picture on EMC’s site, looks much tougher than me! 🙂
The discussion really came down to a few points. For the sake of simplicity I will quote him directly.
So, putting hyperbole aside, the support situation (and just as importantly, the mandate to test every one of those configurations) is a pretty heavy burden.
DeltaStor takes a different approach to deduplication than hash-based solutions like EMC/Quantum and Data Domain. It requires SEPATON to do some additional testing for different applications and modules. The real question under debate is how much additional work. In his first post, Scott characterized this as entirely unmanageable. (My words, not his.) I continue to disagree with this assessment. Like most things, the Pareto Principle applies here (otherwise known as the 80-20 rule). Will we support every possible combination of every application? Maybe not. Will we support the applications and environments that our customers and prospects use? Absolutely.
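For readers unfamiliar with the hash-based approach mentioned above, here is a minimal, illustrative sketch of fixed-size-chunk, hash-indexed deduplication. To be clear, this is not SEPATON’s DeltaStor (which is content-aware) nor any vendor’s actual implementation; the chunk size and hash choice are assumptions for illustration only.

```python
import hashlib

CHUNK_SIZE = 4096  # illustrative; real products tune chunk size and boundaries

def dedupe(data, store):
    """Split data into fixed-size chunks; store each unique chunk once,
    keyed by its SHA-256 digest, and return the list of keys (the 'recipe')."""
    recipe = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        key = hashlib.sha256(chunk).hexdigest()
        store.setdefault(key, chunk)  # duplicate chunks are stored only once
        recipe.append(key)
    return recipe

def restore(recipe, store):
    # Reassemble the original stream by looking up each chunk by its hash.
    return b"".join(store[k] for k in recipe)

store = {}
data = b"A" * 8192 + b"B" * 4096  # two identical chunks plus one unique chunk
recipe = dedupe(data, store)
assert restore(recipe, store) == data
print(len(recipe), len(store))  # -> 3 2: three chunk references, two unique chunks
```

The application-specific testing debated above arises because real backup streams wrap the data in application formats (metadata headers, multiplexing), which a content-aware approach must parse per application; a pure hash-based scheme like this sketch is format-agnostic but can miss duplicates that are merely shifted or re-wrapped.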