A little bit off topic – deduplication and primary storage

I am digressing slightly from my usual data protection focus, but I found a recent announcement from Riverbed very interesting. They are developing a deduplication solution for primary storage. Since I work for a vendor of deduplication solutions, I wanted to offer some commentary.

First, some background: Riverbed makes a family of WAN acceleration appliances that reduce the amount of traffic sent over a WAN using their proprietary compression and deduplication algorithms. SEPATON is a Riverbed partner, and our Site2 software has been certified with their Steelhead platform. (A bit of disclosure here: I have worked with many people from Riverbed in the past, including the VP of Marketing.)

Riverbed’s announcement is summarized in posts on ByteandSwitch and The Register. In short, they are developing a deduplication solution for primary storage. It will incorporate their existing Steelhead WAN accelerators and another appliance, code-named “Atlas”, which will hold the deduplication metadata. (The Steelhead platform has only a small amount of storage for deduplication metadata, since little is needed when accelerating WAN traffic. The Atlas provides the metadata storage space required for deduplicating larger amounts of data, along with additional functionality.) A customer would place the Steelhead/Atlas appliance combination in front of primary storage, and these devices would deduplicate data as it is written to the storage platform and reconstitute it as it is read back. This is an interesting approach and brings up a number of questions: