Deduplication 2.0

The folks over at the Online Storage Optimization blog recently wrote a post entitled “Get Ready for Dedupe 2.0”, in which they outline their vision for the future of deduplication.  I read the post and was amazed at the similarity between their views and SEPATON’s core VTL architecture. I thought it would be useful to address each of their points and indicate how it applies to SEPATON’s DeltaScale architecture.

Global dedupe
All of the nodes in a SEPATON system share a common deduplication repository.  No matter which node and/or port data is backed up to, SEPATON will deduplicate it against the related data set.
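
To make the idea concrete, here is a minimal sketch of a deduplication repository shared by several nodes. It is an illustration only: the `SharedRepository` and `Node` classes, the fixed 4-byte chunks, and the SHA-256 fingerprints are all assumptions made for the example and are not a description of how SEPATON actually matches data.

```python
import hashlib


class SharedRepository:
    """Hypothetical chunk store shared by every node in the grid."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes

    def store(self, chunk: bytes) -> str:
        fingerprint = hashlib.sha256(chunk).hexdigest()
        # Keep the chunk only if no node has stored it before.
        self.chunks.setdefault(fingerprint, chunk)
        return fingerprint


class Node:
    """Hypothetical ingest node; all nodes point at the same repository."""

    def __init__(self, name: str, repo: SharedRepository):
        self.name = name
        self.repo = repo

    def backup(self, data: bytes, chunk_size: int = 4) -> list:
        # Fixed-size chunking, chosen only to keep the example short.
        return [
            self.repo.store(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)
        ]


repo = SharedRepository()
node_a = Node("node-a", repo)
node_b = Node("node-b", repo)

# Two backups arrive on different nodes but share most of their content.
recipe_a = node_a.backup(b"AAAABBBBCCCC")
recipe_b = node_b.backup(b"AAAABBBBDDDD")
unique_chunks = len(repo.chunks)
```

Because both nodes write into the same repository, the chunks the two backups have in common are stored once, regardless of which node received them.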

SEPATON uses a concurrent process approach to deduplication.  Our VTLs write to disk first and then deduplicate data.  We process multiple backup jobs concurrently in a way that does not interfere with backup performance.  The approach combines the post-process benefit of wirespeed ingest and the inline benefit of rapid deduplication completion.
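
The general shape of a write-first, deduplicate-concurrently pipeline can be sketched as follows. This is a simplified illustration under stated assumptions: the `ingest` and `dedupe_worker` functions, the in-memory dictionaries standing in for disk and repository, and the hash-based matching are all hypothetical and are not taken from the product.

```python
import hashlib
import queue
import threading

landed = {}            # job name -> raw data, standing in for disk
unique_chunks = {}     # fingerprint -> chunk, standing in for the repository
pending = queue.Queue()


def ingest(job: str, data: bytes) -> None:
    """Write the backup as-is, then hand it to the deduplication worker."""
    landed[job] = data          # no dedupe work on the ingest path
    pending.put(job)


def dedupe_worker(chunk_size: int = 4) -> None:
    """Deduplicate landed jobs in the background until told to stop."""
    while True:
        job = pending.get()
        if job is None:         # sentinel: no more jobs
            break
        data = landed[job]
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            unique_chunks.setdefault(hashlib.sha256(chunk).hexdigest(), chunk)


worker = threading.Thread(target=dedupe_worker)
worker.start()

ingest("job-1", b"AAAABBBB")
ingest("job-2", b"BBBBCCCC")   # ingested while job-1 may still be processing

pending.put(None)
worker.join()
unique_count = len(unique_chunks)
```

The ingest path only writes data and queues the job, so backup speed does not depend on deduplication, while the worker starts on each job as soon as it lands instead of waiting for the whole backup window to close.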

Scale-out processing
SEPATON’s DeltaScale architecture is designed around the concept of a multi-node grid.  All of the nodes in the grid share a common deduplication repository, and customers can dynamically increase I/O and computing power by adding nodes to the grid.  The process is non-disruptive and protects the customer’s investment, since the system can grow over time.

Scale-out capacity
DeltaScale also provides scale-out capacity. Customers can dynamically add capacity to the system as their storage and retention needs change. To simplify the process, the appliance automatically provisions the new storage and optimizes it for performance.  The customer simply has to plug in the new disk shelf and point and click in the GUI; the system handles the process of adding the storage and using the additional spindles for increased performance.  The S2100-ES2 currently scales to 1.6 PB usable, excluding deduplication and hardware compression benefits.

The author suggests that these four points are requirements for dedupe 2.0 solutions, and I would suggest that, by this definition, SEPATON is providing dedupe 2.0 solutions today.

The author also makes the point that Data Domain was the early winner in the deduplication space, and I agree as far as the SMB space is concerned.  Data Domain initially targeted the low end of the market where $/GB was the number one metric and performance and scalability were of secondary importance.  Clearly, they were successful in that space and thanks to NetApp/EMC were able to exit with a huge valuation.

As deduplication matures, we are seeing larger environments implementing the technology.  Data Domain will have a more difficult time competing for these opportunities.  Their single-node architecture limits their scalability and performance.  This was never a problem in smaller environments, but it can be a huge issue in enterprises where performance and scalability are vital.  These customers recognize that their environments will change over time and demand systems that can dynamically adapt to their requirements in a seamless and unified manner, and SEPATON’s solutions are designed to solve this problem.  Call it dedupe 2.0 or enterprise-class data protection; whatever the name, SEPATON is providing the fastest and most scalable deduplication solutions in the industry today.

2 replies on “Deduplication 2.0”

All valid points, with one correction. SEPATON systems with DeltaStor enabled do not support non-disruptive shelf/storage or SRE additions. It is a disruptive process. Non-disruptive expansion is definitely a characteristic that would be highly desirable to have in the enterprise.

Hi and thank you for your comment,

Hot add is a feature that we have supported for many years with VTL. As you mentioned, that functionality is not currently supported with DeltaStor; it will be addressed in an upcoming release.
