Data Domain recently announced that their new OS release dramatically improved appliance performance. On the surface, the announcement seems compelling, but on closer review it raises a number of questions.
Deduplication software such as Data Domain’s is complex and can contain hundreds of thousands of interrelated lines of code. As products mature, companies fine-tune their code for greater efficiency and performance, and you would expect such changes to yield improvements of about 20-30%. Clearly, if an application is coded very inefficiently to begin with, tuning will produce greater gains. However, larger improvements like those quoted in the release are usually achieved only with major architectural updates, which typically coincide with a major new software release.
In this case, I am not suggesting that Data Domain’s software is bad, but rather that the stated performance improvement is suspect. They positioned this as a dot release, so it is not a major product re-architecture. Additionally, if it were a major architecture update, they would have highlighted that in the release.
To summarize, the stated performance gains are too large to attribute to a simple code tweak, and I believe they are attainable only in very specific circumstances. Data Domain appears to have optimized their appliances for Symantec’s OST and is trumpeting the resulting gains. However, OST users represent only a small fraction of Data Domain’s customer base, and it is unclear what improvement, if any, customers using non-Symantec backup applications will see. Read on to learn more.
Symantec Open Storage Technology
The announcement focuses exclusively on Symantec OST performance. For those who are not familiar with OST, it is a disk-only backup interface developed by Symantec for NetBackup and Backup Exec. The technology has yet to be supported by other backup applications and is essentially proprietary to Symantec.
Data Domain chose to highlight their performance with OST for a reason. If the gain applied to all interfaces, why not just characterize it as such? Remember, 70%+ of their customers use the NFS/CIFS interface, so why not benchmark that? This text further highlights my point:
While performance can increase on all systems across all protocols, large data centers using Data Domain’s flagship DD690 system with Veritas NetBackup OpenStorage (OST) by Symantec….
(Bold added by this author)
In this sentence they qualify the stated performance gains and suggest that performance “can” increase with other systems and protocols. What happened to the 90% improvement they stated with OST? It appears that they are willing to commit to that figure with OST, but not with other interfaces. This raises the following questions:
- Is there something about OST specifically that enables this faster performance?
- What does the software upgrade mean to customers using interfaces other than OST (e.g., traditional NFS/CIFS and VTL)?
- What is the impact on other backup applications that do not support OST?
In another sentence, they allude to the performance gains being related to OST:
This new benchmark was established with the minimum DD690 configuration of 2 disk shelves and benefits from extra tuning and parallelism possible with this combination of fabric and software.
(Bold added by this author)
It is important to note that the paragraph starts by discussing OST, so I assume they are referring to OST when they say “software”. They suggest that the tuning and parallelism of OST, combined with a 10 GigE switch infrastructure, are key drivers of the performance gains.