HIFN recently announced a card that accelerates hash-based deduplication. For those unfamiliar with HIFN, the company provides infrastructure components that accelerate CPU-intensive processes such as compression, encryption and now deduplication. Its products are primarily embedded inside appliances, and you may be using one of them today.
The interesting thing about the HIFN card is that they are positioning it as an all-in-one hash deduplication solution. Here are the key processes that the device performs:
- Hash creation
- Hash database creation and management
- Hash lookups
- Write to disk
It does all of this by providing a Virtual Block Device (VBD) that an application can read from and write to. The application does not need to be aware of the deduplication process, since the HIFN card performs the task in the background.
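To make the four steps concrete, here is a minimal Python sketch of a deduplicating block device. It is an illustration only, not HIFN's implementation: the class name `DedupBlockDevice`, the fixed 4 KB block size, the use of SHA-256 and the in-memory "disk" are all assumptions, since the announcement does not describe the card's internals.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; the announcement does not state one


class DedupBlockDevice:
    """Toy virtual block device that deduplicates fixed-size blocks."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.index = {}      # hash database: digest -> position in self.store
        self.store = []      # stands in for the disk; holds unique blocks only
        self.block_map = {}  # logical block address -> digest

    def write(self, lba, data):
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block")
        digest = hashlib.sha256(data).digest()    # hash creation
        if digest not in self.index:              # hash lookup
            self.index[digest] = len(self.store)  # hash database management
            self.store.append(data)               # write to disk (new data only)
        self.block_map[lba] = digest

    def read(self, lba):
        # The caller sees an ordinary block device; dedup is invisible.
        return self.store[self.index[self.block_map[lba]]]


dev = DedupBlockDevice()
dev.write(0, b"a" * BLOCK_SIZE)
dev.write(1, b"b" * BLOCK_SIZE)
dev.write(2, b"a" * BLOCK_SIZE)  # duplicate of block 0, not stored again
```

After the three writes the device holds three logical blocks but only two physical ones, and the application reading block 2 never needs to know that it shares storage with block 0.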
When we first heard about this card, we expected it to accelerate only hash creation (the first step above), which would have benefited the hash-based deduplication vendors. Now that the card has been announced, you would think that these vendors would be excited. Think again: all of these manufacturers focus on software as their differentiator. That is, today’s hash-based dedupe vendors all say that they have a better way to perform the four steps above. There is some truth to this, as each vendor performs these steps slightly differently. (Ironically, Data Domain was forced to cross-license Quantum’s deduplication patents, so what does that say about how unique its approach is?)
With this announcement, HIFN is attempting to commoditize all of these algorithms. The company is suggesting that the multitude of hashing flavors is unnecessary and that the best approach is to use a HIFN card to handle the deduplication process. It suggests that you can put the card in a whitebox server and voilà, you have your own dedupe appliance. If this is really true, it could well commoditize hash-based deduplication, because the barriers to creating a new appliance would essentially be gone.
On the performance side, HIFN says that it supports up to four cards in a server, with each card providing up to 250 MB/sec. Most of today’s fastest hashing appliances run at around 400 MB/sec, so a fully populated HIFN solution, at up to 1,000 MB/sec in aggregate, would be roughly 2.5x faster than today’s fastest systems, assuming throughput scales linearly across cards. However, it is unclear how these separate cards would work together. Are they clustered to provide one deduplication domain? Are they independent, thus providing separate deduplication domains? What happens if one fails? These questions have yet to be answered, and they are critical to understanding how the technology will work in the real world.
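The arithmetic, and the reason the deduplication-domain question matters, can be sketched in a few lines of Python. The throughput figures come from the numbers quoted above; everything else is an assumption for illustration, in particular the hypothetical `unique_blocks_stored` helper and the round-robin distribution of blocks across cards, since HIFN has not said how multiple cards share work or hash indexes.

```python
import hashlib

CARDS = 4
PER_CARD_MB_S = 250           # per-card figure quoted by HIFN
FASTEST_APPLIANCE_MB_S = 400  # approximate figure for today's hashing appliances

aggregate_mb_s = CARDS * PER_CARD_MB_S             # assumes linear scaling
speedup = aggregate_mb_s / FASTEST_APPLIANCE_MB_S


def unique_blocks_stored(blocks, domains):
    """Count blocks kept when data is spread round-robin over
    `domains` independent hash indexes (an assumed distribution)."""
    indexes = [set() for _ in range(domains)]
    for i, block in enumerate(blocks):
        indexes[i % domains].add(hashlib.sha256(block).digest())
    return sum(len(index) for index in indexes)


# Eight copies of the same block: the best case for deduplication.
blocks = [b"x" * 512] * 8
one_domain = unique_blocks_stored(blocks, domains=1)
four_domains = unique_blocks_stored(blocks, domains=4)
```

In this toy case a single shared domain stores the block once, while four independent domains each store their own copy, so the same data takes four times the space. Whether the real cards behave this way is exactly the open question.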
In summary, HIFN has taken a very aggressive approach with their new card. They are trying to take on the existing hash-based dedupe vendors like Data Domain and commoditize the technology. Only time will tell if they are successful; however, the hash-based vendors have reason to be concerned. (Fortunately, SEPATON’s ContentAware approach is well differentiated from hash-based solutions so the HIFN announcement has little impact on us.)