CA recently announced the addition of deduplication to ARCserve. Every time an ISV releases deduplication technology, I get inundated with questions about hardware (i.e. appliance-based) vs software (i.e. software-only, where separate hardware is required) deduplication. In this post, I will discuss the difference between these two models when using target-based deduplication (i.e. deduplication happens at the media server or virtual tape appliance). Client-based deduplication (i.e. deduplication happens at the client) is another option offered by some vendors and will be covered in another post.
Most backup software ISVs offer target-based deduplication in one form or another. In some cases, it is an additional application like PureDisk from Symantec, and in other cases it is a plugin, as with CommVault, ITSM or the new ARCserve release. In all cases, it is packaged as a software option and does not include server or storage infrastructure. Contrast this with appliance-based solutions like those from SEPATON that include hardware and storage.
When considering software-only solutions, the key question is the cost of ownership and management. Using existing storage and/or servers may well provide the lowest acquisition cost, but what does it mean for manageability? How do you ensure that you are getting the performance, scalability and reliability that you need? What happens when your environment grows and you need to add more capacity and/or performance? Typically, these questions are left to the end user and can be complex and costly to resolve.
Software-based solutions have an important place in the market. They work best in smaller locations with slower data growth and less stringent performance requirements. These end users typically have simpler environments and upgrade their backup hardware less frequently. They are less concerned about scalability and more focused on price. Software-only solutions are ideal for them because they can be implemented cost-effectively and can provide immediate disk savings through deduplication.
In contrast, larger environments have a more difficult time with the pure software approach. These data centers have more complex, heterogeneous hardware and software. They also typically have backup environments that experience rapid growth. Implementing a software-only solution can be difficult since it will often involve implementing and managing multiple separate hardware and software products. The problem will get worse over time as the environment grows, forcing the customer to implement and manage even more hardware and software. Appliances like SEPATON’s S2100-ES2 benefit these environments since they automate provisioning, management, and performance optimization.
In summary, pure software-based deduplication solutions are ideal for smaller environments. These environments can benefit from the savings in $/GB while minimizing the cost and complexity of management and growth. Large enterprises will have a more difficult time with software-only solutions since they introduce a substantial additional management burden, which will reduce cost savings. These larger companies are better served by an appliance approach, which provides the simplicity they need to manage hundreds of terabytes.