
Data Domain & GDA – Bolt-on to the rescue

One of the biggest challenges facing today’s datacenter managers is protecting the vast quantities of data being generated. As volumes have increased, customers have looked for larger and larger backup solutions, and multi-node global deduplication systems have become critical to meeting business requirements. EMC/Data Domain’s response to these challenges has been “add another box”, which is their answer to all capacity or performance scalability questions. It appears that Data Domain has acknowledged that this argument no longer resonates and has reverted to Plan B, the bolt-on Global Deduplication Array (GDA).

The use of the term “bolt-on” stems from a previous blog post by EMC/Data Domain’s VP of Product Management, Brian Biles. In the entry, he characterizes other deduplication vendors as bolt-on solutions, and the obvious implication is that Data Domain is better because it is not a bolt-on. Few would agree with this assertion, but it is an interesting opinion and I will return to this later.

On the surface, the GDA announcement includes all of the buzzwords, including “dual controller”, “global deduplication” and “transparent load-balancing”. The solution sounds impressive, right? Well, so did the DL3D 4000, and remember what happened to the DL3D family? The simple fact is that EMC/Data Domain (DD) has announced a product that may be impressive on paper, but the reality is quite different. DD has been rumored to be developing a true global deduplication solution for the last few years, and it is hard to believe that this is the result.

GDA is unlike any other deduplication appliance. Every vendor (including DD pre-GDA) has built appliances that are self-contained and designed to minimize the administration, configuration and modification of backup environments. In short, the solutions are designed for simplicity. The GDA is different; it is essentially two separate DD880s and a heavy OST client that hashes the data, compresses it in software and sends it to one of the two boxes based on the first digits of the hash. By forcing media servers to perform this CPU-intensive processing, the GDA moves the problem to the media server. The burden now rests on the end user’s shoulders to manage and size every server to ensure it meets the appropriate GDA specifications. Essentially, the GDA has assimilated the media servers into its “appliance” realm and pushed processing and management activities onto the customer. It is important to note that GDA requires Symantec OST technology, which is available on NetBackup and Backup Exec. If you run any other backup application, then GDA is not an option for you.
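To make the routing idea concrete, here is a minimal Python sketch of hash-based distribution across two boxes. It is my own illustration, not DD's code: the node names, the fixed 8 KB segment size, the use of SHA-1 and zlib, and the rule of routing on the first hex digit of the hash are all assumptions made for the example. The point is simply that the hashing and compression run wherever this code runs, which in the GDA's case is the media server.

```python
import hashlib
import zlib

# Hypothetical node names; the real product's identifiers are not known here.
NODES = ("dd880-a", "dd880-b")


def route_segment(segment: bytes) -> tuple[str, str, bytes]:
    """Hash a segment, pick a node from the first hex digit, compress the data."""
    digest = hashlib.sha1(segment).hexdigest()   # assumed hash function
    node = NODES[int(digest[0], 16) % len(NODES)]  # assumed routing rule
    return node, digest, zlib.compress(segment)  # assumed compressor


def backup(stream: bytes, segment_size: int = 8192) -> dict[str, dict[str, bytes]]:
    """Split a stream into fixed-size segments and store each unique one once."""
    stores: dict[str, dict[str, bytes]] = {name: {} for name in NODES}
    for offset in range(0, len(stream), segment_size):
        segment = stream[offset:offset + segment_size]
        node, digest, payload = route_segment(segment)
        # A segment whose hash is already on its node is a duplicate: skip it.
        stores[node].setdefault(digest, payload)
    return stores
```

Because the same segment always produces the same hash, it always lands on the same box, so each box only has to deduplicate against its own data. The price is that every segment is hashed and compressed on the sending side before it ever reaches the "appliance".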

Here are some points to consider related to the implementation:

  • How do you size the required media server upgrades to meet your performance requirements?
  • Will you need to purchase new media servers to gain the processing power required?
  • Who do you call if your performance does not meet expectations?
  • Are you comfortable running such highly CPU-intensive agents on your media servers, which may already be resource constrained?
  • What happens if you have some applications that back up directly to tape using a LAN-free approach, such as NDMP or RMAN?
  • What if you are running backup applications like IBM TSM, EMC’s own NetWorker or CommVault that do not support OST?

These are real end-user concerns. We have plenty of customers with 20+ media servers who would need to completely overhaul their environment to implement GDA. These customers would scoff at adding DD’s agents to their already overburdened servers. In contrast, a true multi-node global deduplication solution like SEPATON’s can be connected and will just work as a backup target. Agents are not required when backing up to a VTL.

In summary, it appears that EMC/Data Domain is tacitly acknowledging the limitations of their hash-based architecture. They spent years trying to develop an integrated global deduplication solution, and the best they could come up with is a cobbled-together system of dual DD880s and an unwieldy OST client. It appears that the GDA has more in common with the DL3D 4000 than I originally thought! Isn’t it ironic that the biggest opponent of bolt-on technology has so fully embraced the concept in the GDA?

3 replies on “Data Domain & GDA – Bolt-on to the rescue”

Your assertion that Data Domain’s OST requirement doesn’t support NDMP and RMAN is incorrect. It doesn’t support LAN-free backup of RMAN or NDMP. To do that you need Fibre Channel and virtual tape drives. They have both, but they’re not supported in the GDA. This is not to say they don’t support NDMP or RMAN. Both can back up over the LAN to a NetBackup media server.

Thank you for your comment. I did not make a global assertion but rather said “applications that backup directly to tape such as NDMP or RMAN.” However, it appears that there is room for confusion and so I further clarified the language.
