Categories
Deduplication Virtual Tape

NetApp Dedupe: The Worst of Inline and Post-process Deduplication

NetApp has finally entered the world of deduplication in data protection. While they have supported a flavor of the technology in their filers since May 2007, they had never launched it for their VTL. Why? Because their VTL does not use any of the core filer IP. It relies on an entirely separate software architecture that they acquired from Alacritus. Thus none of the features of ONTAP apply to their VTL. However, I digress from the topic at hand.

I posted recently about three different approaches to deduplication timing: inline, post process and concurrent process. I talked about the benefits of each and highlighted the fact that post process and concurrent process offer the fastest backup performance, since deduplication occurs outside of the primary data path, while inline requires the least disk space, since undeduplicated data is never written to disk. Now comes NetApp with a whole new take. Their model combines the worst of post process and inline by requiring a disk holding area and reducing backup performance. After all this time developing the product, this is what they come up with? Hmmm, maybe they should stick to filers.

Categories
Backup Deduplication Restore

Deduplication: It’s About Performance

I have recently been thinking about the real benefits of deduplication. Although the technology is all about capacity, when you analyze the cost and benefits in the real world, the thing that jumps out at you is performance.

Performance is the key driver in sizing and assessing the number of units required. That means it also drives cost. Deduplication enables longer retention but usually reduces backup and restore performance. For example, a 40 TB system can hold 800 TB of data assuming a ratio of 20:1. This is a large number, but it soon becomes clear that the amount of data the system can actually protect is limited by backup speed. The graph below shows the relationship between data protected and backup window assuming performance of 400 MB/sec.


[Graph: data protected vs. backup window at 400 MB/sec]
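
The arithmetic behind this argument can be sketched in a few lines of Python. This is only a rough illustration: the 40 TB, 20:1 and 400 MB/sec figures come from the post, while the decimal unit conversion (1 TB = 1,000,000 MB), the function names, the sample data sizes and the 8-hour window are assumptions chosen for the example.

```python
# Rough sizing sketch. Assumes decimal units and a sustained ingest rate;
# real systems rarely hold a constant throughput for a whole backup window.
MB_PER_TB = 1_000_000  # assumption: 1 TB = 1,000,000 MB


def logical_capacity_tb(physical_tb: float, dedupe_ratio: float) -> float:
    """Data the system can retain at a given deduplication ratio."""
    return physical_tb * dedupe_ratio


def backup_window_hours(data_tb: float, throughput_mb_per_sec: float) -> float:
    """Hours needed to ingest data_tb at the given throughput."""
    seconds = data_tb * MB_PER_TB / throughput_mb_per_sec
    return seconds / 3600


def max_data_tb(window_hours: float, throughput_mb_per_sec: float) -> float:
    """Largest backup that fits in the window at the given throughput."""
    return throughput_mb_per_sec * window_hours * 3600 / MB_PER_TB


if __name__ == "__main__":
    print(f"Logical capacity: {logical_capacity_tb(40, 20):.0f} TB")
    for data_tb in (5, 10, 20, 40):  # hypothetical nightly backup sizes
        hours = backup_window_hours(data_tb, 400)
        print(f"{data_tb:>3} TB at 400 MB/sec takes {hours:.1f} hours")
    # Hypothetical 8-hour window
    print(f"An 8-hour window fits {max_data_tb(8, 400):.2f} TB")
```

Under these assumptions, a 10 TB backup needs roughly seven hours at 400 MB/sec, and an 8-hour window fits about 11.5 TB, far less than the 800 TB the system can retain.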

Categories
Marketing

Tradeshow giveaway gone bad: the video

Tradeshow marketers spend hours trying to scheme up new and unique programs to drive booth traffic and these often include free giveaways. Ironically, the simplest things such as t-shirts or bags can be good traffic generators, and it is amazing that people can get so excited about tchotchkes that cost $2 or less.

One common approach is a two-tiered program where you hand out an inexpensive item (like a t-shirt) and tell booth visitors that they must be wearing it to be eligible for a future drawing for a more expensive item. In order for this to work, the vendor must have an ample supply of the initial giveaway and the final item must be of high enough value to encourage participation. As you can imagine, marketers spend a ton of time and money putting together these programs.

Now fast forward to the recent VMworld show, where FalconStor used a two-tiered program: they offered free t-shirts at their booth and then had a drawing for a Segway scooter. The program stipulated that attendees must be wearing the FalconStor t-shirt at the time of the drawing to be eligible.

Well, in a classic case of sales people ignoring the marketing people, the sales folks at the booth picked a winner who was not wearing a t-shirt and decided to give him the Segway anyway. This contradicted the terms of the program and the audience did not react favorably. This is a marketer’s worst nightmare; their carefully orchestrated program has been ruined and it is clear that many booth visitors left feeling angry. Click more to see the YouTube video which shows what happened; it is quite humorous and makes you wonder “what were they thinking?”

Categories
Deduplication

HIFN – Commoditizing hash-based deduplication?

HIFN recently announced a card that accelerates hash-based deduplication. For those unfamiliar with HIFN, they provide infrastructure components that accelerate CPU intensive processes such as compression, encryption and now deduplication. The products are primarily embedded inside appliances, and you may be using one of their products today.

The interesting thing about the HIFN card is that they are positioning it as an all-in-one hash deduplication solution. Here are the key processes that the device performs:

  1. Hash creation
  2. Hash database creation and management
  3. Hash lookups
  4. Write to disk
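
The four steps above can be sketched in software as follows. This is a minimal illustration of hash-based deduplication in general, not a description of how the HIFN card works: the fixed 4 KB chunk size, the choice of SHA-256, the in-memory dictionary standing in for the hash database, the list standing in for disk, and the `DedupeStore` name are all assumptions made for the example.

```python
import hashlib

CHUNK_SIZE = 4096  # assumption: fixed-size chunks; real products often differ


class DedupeStore:
    """Toy hash-based deduplicating store (hypothetical, in-memory only)."""

    def __init__(self, chunk_size: int = CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.index = {}   # hash database: digest -> position in self.blocks
        self.blocks = []  # stands in for disk; holds unique chunks only

    def write(self, data: bytes) -> list:
        """Store data and return the list of digests needed to rebuild it."""
        recipe = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()  # 1. hash creation
            if digest not in self.index:                # 3. hash lookup
                self.index[digest] = len(self.blocks)   # 2. database update
                self.blocks.append(chunk)               # 4. write to disk
            recipe.append(digest)
        return recipe

    def read(self, recipe: list) -> bytes:
        """Rebuild the original data from its list of digests."""
        return b"".join(self.blocks[self.index[d]] for d in recipe)


if __name__ == "__main__":
    store = DedupeStore()
    data = b"A" * 8192 + b"B" * 4096 + b"A" * 4096
    recipe = store.write(data)
    print(len(recipe), "chunks referenced,", len(store.blocks), "chunks stored")
    assert store.read(recipe) == data
```

In this sample, four chunks are written but only two unique chunks are stored, since the three all-"A" chunks hash to the same value.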
Categories
General Marketing

How a lack of innovation put Overland under water

I wanted to post a quick commentary on Overland Data.

I recently ran across this post over at The Register that discusses the fact that Overland Data is at risk of being delisted from the NASDAQ due to a stock price below $1. (Ticker: OVRL, currently $0.45)

In a past life, I sold Overland products and was very familiar with their tape and disk systems. They were one of the first companies to provide a cost-effective D2D solution targeted at data protection. In 2003, they unveiled the REO 2000 product, and 7 months later they released the REO 4000, which provided greater capacity and scalability. Overland was on a roll with the new REO appliances, generating industry buzz and excitement while their tape library business remained strong.

Fast forward five years, and Overland’s situation looks bleak. Their D2D products have stagnated and their tape business has collapsed. Along the way, they have made a number of false starts including the purchase of Zetta Systems and the launch of the Ultamus array, which they later silently pulled from the market.

Situations like these make you realize the importance of innovation. Initially, Overland was very successful with their disk products, but they were unable to maintain their position. As the market innovated, they did not, and their financial and business performance suffered. Their current situation is a reminder that you must innovate or risk suffering a similar fate. Steve Jobs said this eloquently:

Innovation distinguishes between a leader and a follower.

I feel fortunate to be working for a company that has a long history of innovation in data protection and there are more exciting things to come…

Categories
Marketing

Blog commenting

W. Curtis Preston, the author of the Mr. Backup Blog, recently voiced his frustration with certain bloggers censoring visitor comments. He was annoyed that some folks from EMC configured their blogs for comment moderation (all comments must be approved before they appear on the site) and used the power to delete certain responses. He contrasted this with NetApp, whose blogs are not moderated. (As a point of clarification, AboutRestore.com’s comments are not moderated; reader comments are posted immediately.) Whether you believe in comment moderation or not, at least these blogs all provide a mechanism for the visitor to respond.

Categories
Backup Deduplication

Inline Deduplication: What Your Mother Never Told You

I recently attended a show and enjoyed speaking with a variety of end users with different levels of interest and knowledge. One of the things that I found was that attendees were obsessed with the question of inline vs post process vs concurrent process deduplication. Literally, people would come up and say “Do you do inline or post process dedupe?” This is crazy. Certainly there are differences between the approaches, but the real issue should be data protection, not arcane techno-speak.

Before I go into details, let me start with the basics. Inline deduplication means that deduplication occurs in the primary data path. No data is written to disk until the deduplication process is complete. The other two approaches, post process and concurrent process, first store data on disk and then deduplicate. As the name suggests, post process approaches do not begin the deduplication process until all backups are complete. The concurrent process approach can start deduplication before the backups are completed, so it can back up and deduplicate concurrently. Let’s look at each of these in more detail.

Categories
Deduplication Marketing

Tradeshow perspectives

I spent last week at a tradeshow in New York. These events are interesting because of the various end user perspectives. Those of us in the industry often get embroiled in the minutiae of products and features, and so it is very useful to understand the views of the end users on the show floor. Storage Decisions is a show that prides itself on highly qualified attendees.

One of the most curious things about the show was attendees’ obsession with inline vs post process deduplication. Numerous end users stopped by asking only about when DeltaStor deduplicates data. In the rush of the show, there was little time to discuss the question in much detail. It struck me as odd that these attendees focused on this question, which in my opinion is the wrong question to ask. I can only surmise that they had gotten an earful from competing vendors who swore that inline is the best approach.

Categories
General

AboutRestore.com recognition

W. Curtis Preston, the author of the Mr. Backup Blog, recently posted an article about the blogs that he frequents. I was honored that he recognized AboutRestore.com along with blogs from other major vendors.

Curtis mentioned his frustration with the comment filtering policies on some blogs and I wanted to clarify AboutRestore.com’s policy. (A synopsis of the policy is contained in the disclaimer in the sidebar.) Comments are not moderated; whatever you post appears on the site instantly. I have little interest in censorship; however, I reserve the right to delete comments containing abuse or personal attacks. I hope I never have to use my power of deletion, but as Uncle Ben said to Peter Parker/Spider-Man:

With great power comes great responsibility.

Now back to regularly scheduled programming…