I was recently pondering TSM’s implementation of target deduplication and decided to review ESG’s Lab Validation report on IBM TSM 6.1. There is quite a bit of good information in the paper, and some really interesting data about TSM’s target deduplication.
Before discussing the results, it is important to understand the testing methodology. Enterprise Strategy Group clearly states that the report was based on “hands-on testing [in IBM's Tucson, AZ labs], audits of IBM test environments, and detailed discussions with IBM TSM experts.” (page 5) This means that IBM installed and configured the environment and allowed ESG to test the systems and review the results. IBM engineers are experts in TSM, so it is reasonable to assume that the systems they provided were optimally configured for performance and deduplication. The results ESG observed are therefore likely a best-case scenario, since the average customer may not have the flexibility (or knowledge) to configure a similar system. This is not a problem, per se, but readers should keep it in mind.
The whitepaper highlights the data reduction realized using TSM and mentions capacity savings of 19:1. However, if you look carefully, you see that the space savings calculations are based on capacity reduction from TSM’s proprietary progressive incremental technology and deduplication. Progressive incremental technology reduces the amount of storage required by bypassing full backups. The really interesting question is “what additional benefits are gained by using TSM’s target deduplication?” Fortunately, ESG provides an answer.
The idea behind deduplication is that it provides capacity savings by removing redundancies within backup data. Theoretically, IBM should have an advantage deduplicating TSM backups since it is intimately familiar with the application and its data formats. However, ESG’s results do not support this assertion. The paper states, “Data deduplication enhanced data reduction in TSM nearly 50% over progressively incremental backup schemes alone.” (page 8) Read as a roughly 50% cut in stored data, this means deduplication leaves about half of the data on disk, which works out to only about a 2:1 space savings! Wow, talk about a minimal benefit; you could get close to the same results with hardware compression.
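For readers who want to check the arithmetic, here is a minimal sketch of how a percentage reduction converts to an N:1 ratio and back. The function names are my own, and the last calculation assumes that the 19:1 headline figure is simply the progressive incremental savings multiplied by the deduplication savings; ESG's paper does not spell out that breakdown, so treat the implied number as a rough estimate only.

```python
def ratio_from_reduction(reduction: float) -> float:
    """Convert a fractional reduction (0.50 = 50% less data stored) to an N:1 ratio."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in the range [0, 1)")
    return 1.0 / (1.0 - reduction)


def reduction_from_ratio(ratio: float) -> float:
    """Convert an N:1 ratio to the fraction of capacity saved."""
    if ratio < 1.0:
        raise ValueError("ratio must be at least 1")
    return 1.0 - 1.0 / ratio


# "Nearly 50%" from the ESG quote, treated here as exactly 50%.
dedup_ratio = ratio_from_reduction(0.50)

# The 19:1 headline figure expressed as a percentage saved.
combined_saved = reduction_from_ratio(19.0)

# ASSUMPTION: the combined ratio is the product of the two techniques,
# so dividing out deduplication leaves progressive incremental alone.
implied_incremental_ratio = 19.0 / dedup_ratio

print(f"Deduplication alone: {dedup_ratio:.1f}:1")
print(f"19:1 combined saves {combined_saved:.1%} of capacity")
print(f"Implied progressive incremental alone: {implied_incremental_ratio:.1f}:1")
```

If that assumption holds, most of the headline savings comes from progressive incremental backups rather than from deduplication.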
In summary, TSM deduplication appears to provide minimal capacity savings while creating management challenges. As an end user concerned about backup and recovery, you should carefully evaluate your options. Dedicated target deduplication appliances like SEPATON’s S2100 product family offer improved manageability, performance and data reduction, which makes them a better option for all but the smallest environments. TSM includes target deduplication for free, but remember, in this case, you get what you pay for!