Categories
Deduplication

Deduplication Strategy and Dell/Ocarina

This week, Dell acquired Ocarina, a provider of primary storage deduplication. The acquisition provides technology that Dell can integrate with existing storage platforms such as EqualLogic. However, Dell also sells deduplication technology from EMC/Data Domain, CommVault and Symantec. Dave West at CommVault suggests that these technologies are complementary, and I agree. However, the announcement raises a significant strategic question: which is the better deduplication strategy, “one size fits all” or “best of breed”?

Deduplication is an important technology in the datacenter, reducing both power footprint and cooling requirements. However, it typically brings a performance trade-off during read or write operations due to the additional processing required to re-hydrate or deduplicate data. The benefits of the technology are compelling, and we have seen multiple large companies promote different deduplication strategies. Their approaches fall into two broad categories: “best of breed” (BoB) or “one size fits all” (OSFA), and the choice of approach has a major impact. Let’s look at each strategy individually.
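To make the trade-off concrete, here is a minimal sketch of hash-based deduplication (my own illustration, not any vendor's implementation): writes pay for chunking, hashing, and index lookups, while reads pay for re-hydrating the original stream from the chunk store.

```python
import hashlib

CHUNK_SIZE = 4096

class DedupStore:
    """Toy fixed-size-chunk dedup store; illustrative only."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes, stored once

    def write(self, data: bytes) -> list:
        """Deduplicate on write: hash each chunk, store only new ones,
        and return the 'recipe' of fingerprints for later re-hydration."""
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            fp = hashlib.sha256(chunk).hexdigest()  # CPU cost on write
            self.chunks.setdefault(fp, chunk)
            recipe.append(fp)
        return recipe

    def read(self, recipe: list) -> bytes:
        """Re-hydrate on read: reassemble the stream chunk by chunk."""
        return b"".join(self.chunks[fp] for fp in recipe)

store = DedupStore()
backup = b"A" * 8192 + b"B" * 4096   # two identical "A" chunks, one "B"
recipe = store.write(backup)
assert store.read(recipe) == backup  # lossless round trip
assert len(store.chunks) == 2        # 3 logical chunks, 2 physical
```

The extra hashing and reassembly work in `write` and `read` is exactly where the read/write performance trade-off comes from.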

Categories
Deduplication

Storage pools and why they matter

Today SEPATON announced the addition of Storage Pools to our data protection platform. The technology marks a major step on the path to data protection lifecycle management, and I am excited about the new functionality and want to share some brief thoughts.

To summarize, storage pooling allows data to be segmented into discrete pools that do not share deduplication. Data sent to one pool will only be deduplicated against information in that pool and will not co-mingle with other data. Additionally, pools provide configuration flexibility by supporting different types of disks with different performance profiles. Pools also benefit from SEPATON’s DeltaScale architecture, which allows for dynamic capacity and performance scalability. Pools are a no-cost option with our latest software release, and customers can implement them in the way that best meets their business requirements. Some of the benefits include:
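The core idea can be sketched in a few lines (names and structures are mine, not SEPATON's API): each pool owns a private fingerprint index, so identical data written to two different pools is stored twice and never co-mingles.

```python
import hashlib

class Pool:
    """Illustrative pool with a private dedup index."""

    def __init__(self, name: str):
        self.name = name
        self.index = {}  # fingerprint -> data, private to this pool

    def write(self, data: bytes) -> str:
        fp = hashlib.sha256(data).hexdigest()
        self.index.setdefault(fp, data)  # dedup only within this pool
        return fp

finance = Pool("finance")
engineering = Pool("engineering")
payload = b"identical backup stream"
finance.write(payload)
engineering.write(payload)

# The same bytes are kept once per pool: isolation between tenants or
# departments, at the cost of some cross-pool capacity savings.
assert len(finance.index) == 1 and len(engineering.index) == 1
```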

Categories
Backup Deduplication Restore

Data Domain & GDA – Bolt-on to the rescue

One of the biggest challenges facing today’s datacenter managers is protecting the vast quantities of data being generated. As volumes have increased, customers have looked for larger and larger backup solutions. Multi-node global deduplication systems have become critical to enable companies to meet business requirements. EMC/Data Domain’s response to these challenges has been “add another box,” their answer to all capacity and performance scalability questions. It appears that Data Domain has acknowledged that this argument no longer resonates and has reverted to Plan B, bolt-on GDA.

The use of the term “bolt-on” stems from a previous blog post by EMC/Data Domain’s VP of Product Management, Brian Biles. In the entry, he characterizes other deduplication vendors as bolt-on solutions, and the obvious implication is that Data Domain is better because it is not a bolt-on. Few would agree with this assertion, but it is an interesting opinion and I will return to this later.

Categories
Backup Deduplication Replication

Deduplication ratios and their impact on DR cost savings

There is an interesting blog discussion between Dipash Patel from CommVault and W. Curtis Preston from Backup Central and TruthinIT regarding whether the benefits of deduplication grow or shrink as ratios increase. They take different perspectives, and I will highlight their points and add an additional one to consider.

Patel argues that increasing deduplication ratios beyond 10:1 provides only a marginal benefit. He calculates that going from 10:1 to 20:1 yields just a 5% increase in capacity efficiency, and he adds that vendors who suggest that a doubling in deduplication ratios will result in a doubling of cost savings are using a “sleight of hand.” He makes an interesting point, but I disagree with his core claim that increasing deduplication ratios beyond 10:1 provides only marginal savings.
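It is worth working through the arithmetic behind both positions (my worked numbers, illustrating the same 10:1 versus 20:1 comparison): a ratio of R:1 means 1/R of the logical data is physically stored, so capacity efficiency is 1 − 1/R, but the physical footprint scales with 1/R directly.

```python
def capacity_saved(ratio: float) -> float:
    """Fraction of raw capacity avoided at a dedup ratio of ratio:1."""
    return 1 - 1 / ratio

# Patel's framing: efficiency gain looks marginal.
saved_10 = capacity_saved(10)   # 0.90 -> 90% of raw capacity avoided
saved_20 = capacity_saved(20)   # 0.95 -> only 5 points more

# The counterpoint: the physical footprint is halved. Disks, power,
# and DR replication bandwidth are paid for in *physical* terabytes.
logical_tb = 100
physical_10 = logical_tb / 10   # 10 TB stored at 10:1
physical_20 = logical_tb / 20   # 5 TB stored at 20:1
```

Both calculations are correct; they simply measure different things, which is why the framing matters so much in the debate.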

Categories
Deduplication

TSM Target Deduplication: You Get What You Pay For

I was recently pondering TSM’s implementation of target deduplication and decided to review ESG’s Lab Validation report on IBM TSM 6.1. There is quite a bit of good information in the paper, and some really interesting data about TSM’s target deduplication.

Before discussing the results, it is important to understand the testing methodology. Enterprise Strategy Group clearly states that the article was based on “hands-on testing [in IBM’s Tucson, AZ labs], audits of IBM test environments, and detailed discussions with IBM TSM experts.” (page 5) This means that IBM installed and configured the environment and allowed ESG to test the systems and review the results. Clearly, IBM engineers are experts in TSM and so you would assume that any systems provided would be optimally configured for performance and deduplication. The results experienced by ESG are likely the best case scenario since the average customer may not have the flexibility (or knowledge) to configure a similar system. This is not a problem, per se, but readers should keep this in mind.

Categories
Deduplication

TSM and Deduplication: 4 Reasons Why TSM Deduplication Ratios Suffer

TSM presents unique deduplication challenges due to its progressive incremental backup strategy and architectural design. This contrasts with the traditional full/incremental model used by competing backup software vendors. The result is that TSM users will see smaller deduplication ratios than their counterparts using NetBackup, NetWorker or Data Protector. This post explores four key reasons why TSM is difficult to deduplicate.
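A simplified back-of-envelope model (my assumptions, not TSM's internals) shows why the architecture depresses the measured ratio: dedup ratio is logical data ingested divided by unique data stored, and a weekly-full scheme re-ingests mostly unchanged data that dedups almost entirely, while progressive incremental ingests mostly changed, unique data.

```python
# Assumed model: 100 TB protected, 1% daily change rate, 7-day week.
primary_tb = 100
change_rate = 0.01
days = 7

# Either scheme stores roughly the same unique data: the week's changes.
unique_stored = primary_tb * change_rate * days            # 7 TB

# Traditional weekly full + six daily incrementals re-sends the full.
full_scheme_logical = primary_tb + 6 * primary_tb * change_rate  # 106 TB

# Progressive incremental (steady state) sends only changed files.
tsm_logical = primary_tb * change_rate * days              # 7 TB

full_ratio = full_scheme_logical / unique_stored  # ~15:1
tsm_ratio = tsm_logical / unique_stored           # ~1:1
```

Under these assumptions TSM moves far less data over the wire, but because almost none of it is redundant, the reported deduplication ratio is much lower, even against the same protected data.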

Categories
Deduplication Virtual Tape

The Demise of the NearStore VTL: A historical perspective

Rumors have been circulating for months about the demise of NetApp’s VTL offering. Today, Beth Pariseau from SearchDataBackup published the first public confirmation that development on the product has ceased. It is not a surprise, but makes for an interesting case study.

NetApp acquired VTL technology with their purchase of Alacritus for $11 million back in 2005. Alacritus provided a software-only VTL solution that ran on a Linux platform. Their product specifications appeared impressive, but they had limited success in the US; our partners in Asia saw them more frequently. For NetApp, the acquisition made sense because it represented a relatively cost-effective entry into the rapidly growing VTL market. However, as with most things, the difficulty was in the details.

NetApp’s core intellectual property is their ONTAP operating system and associated WAFL filesystem. These components provide the intelligence and value-added features of their arrays. The challenge for NetApp after acquiring Alacritus was the integration of the two technologies.

Categories
Deduplication

Four Must Ask Questions About Metadata and Deduplication

When backing up data to a deduplication system, two types of data are generated. The first comprises the objects being protected, such as Word documents, databases or Exchange message stores. These files will be deduplicated, and for simplicity I will call this “object storage”. The second type of data generated is metadata: information used by the deduplication software to recognize redundancies and to re-hydrate data in the case of restoration. Both types of data are critical; they are required when writing data to the system and potentially when reading it back. Here are four key questions that you should ask about protecting metadata.
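A hypothetical sketch (structures are mine, for illustration) makes the distinction tangible: the object storage is the pile of unique chunks, while the metadata is the ordered recipe that re-hydrates a backup from them. Every chunk can survive intact, yet losing the metadata alone makes the backup unrestorable.

```python
import hashlib

object_store = {}  # fingerprint -> unique chunk bytes ("object storage")
metadata = {}      # backup name -> ordered fingerprint list (metadata)

def ingest(name, chunks):
    """Store unique chunks and record the recipe needed to restore."""
    recipe = []
    for piece in chunks:
        fp = hashlib.sha256(piece).hexdigest()
        object_store.setdefault(fp, piece)
        recipe.append(fp)
    metadata[name] = recipe  # without this, no restore is possible

def restore(name):
    return b"".join(object_store[fp] for fp in metadata[name])

ingest("friday_full", [b"alpha", b"beta", b"alpha"])
assert restore("friday_full") == b"alphabetaalpha"

# Drop only the metadata: all chunks remain, but ordering and
# membership knowledge are gone, so the restore fails.
del metadata["friday_full"]
try:
    restore("friday_full")
except KeyError:
    pass
```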

Categories
Deduplication

Bye, bye EDL/DL3D 1500/3000, it was nice knowing you

The email below appeared in my inbox yesterday. The EDL/DL3D 1500/3000 has officially been discontinued. It was obvious from the moment EMC purchased Data Domain that the Quantum-based products were dead, but it took time for EMC to finally admit this. The strongest statement came in Frank Slootman’s TechTarget interview. Clearly the EMC/QTM relationship was a rocky one from the beginning, and so the outcome is not surprising.

Categories
Deduplication Marketing

Data Domain keynote at SNW – Slootman’s surprising response

I attended multiple keynote and breakout sessions at SNW last week, but my busy meeting schedule conflicted with many of the morning sessions. I was able to attend Data Domain’s talk, given by Frank Slootman, and wanted to provide some commentary.

The bulk of the session was boring and included what appeared to be a standard corporate slide deck which I am sure any salesperson could present in their sleep.  The presentation could be summarized with Data Domain’s usual message: inline deduplication is good and everything else is bad, and, of course, Data Domain’s deduplication is the best.  I was definitely hoping for something more interesting and was sorely disappointed; however, things changed when it came to the Q&A.

Just to provide a bit of background, my experience with SNW is described here. There were a large number of end users at both the expo and the keynote sessions, and I estimate that many of the show’s 900 end users attended this talk. At the end of the planned remarks, Slootman opened the floor to questions.