Categories: Deduplication, Virtual Tape

The Demise of the NearStore VTL: A historical perspective

Rumors have been circulating for months about the demise of NetApp’s VTL offering. Today, Beth Pariseau from SearchDataBackup published the first public confirmation that development on the product has ceased. It is not a surprise, but it makes for an interesting case study.

NetApp acquired VTL technology with their purchase of Alacritus for $11 million back in 2005. Alacritus provided a software-only VTL solution that ran on a Linux platform. Their product specifications appeared impressive, but they had limited success in the US; our partners in Asia saw them more frequently. For NetApp, the acquisition made sense because it represented a relatively cost-effective entry into the rapidly growing VTL market. However, as with most things, the difficulties were in the details.

NetApp’s core intellectual property is their ONTAP operating system and associated WAFL filesystem. These components provide the intelligence and value-added features of their arrays. The challenge for NetApp after acquiring Alacritus was the integration of the two technologies.

Categories: Backup, D2D, Deduplication, Restore, Virtual Tape

Streaming LTO-5

Chris Mellor (Twitter: @Chris_Mellor) recently posted an article over at The Register about LTO-5 entitled Is LTO-5 the last hurrah for tape?. He makes an interesting point about the future of LTO and whether LTO-5 will be the last generation of the technology. Most of the comments on the article disagree with Chris’s opinion.

I believe that there is another major issue with LTO-5 that must be addressed. The challenge with LTO (and most other tape technologies) is its limited ability to throttle performance. Users must carefully manage their environments to ensure that they stream their drives, or else backup performance will decline dramatically. As drives become faster, the challenge of optimizing your environment for the technology becomes more difficult. You can read more about this in my blog post entitled The Fallacy of Faster Tape.
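The streaming problem above can be sketched with a little arithmetic. The numbers below are illustrative assumptions, not vendor specifications: I assume a native LTO-5 transfer rate of roughly 140 MB/s and a speed-matching floor somewhere around 46 MB/s. If the backup source cannot feed the drive at least at the floor rate, the drive must stop, rewind, and reposition ("shoe-shining"), and effective throughput collapses.

```python
# Rough sketch: will a backup source keep an LTO-5 drive streaming?
# Both figures are assumptions for illustration, not vendor specs.
LTO5_MAX_MBPS = 140   # assumed native maximum transfer rate
LTO5_MIN_MBPS = 46    # assumed speed-matching floor

def streaming_status(source_mbps: float) -> str:
    """Classify whether a given source feed rate keeps the drive streaming."""
    if source_mbps >= LTO5_MAX_MBPS:
        return "streaming at full speed"
    if source_mbps >= LTO5_MIN_MBPS:
        return "streaming (drive speed-matches down)"
    # Below the floor the drive stops and repositions repeatedly.
    return "shoe-shining: drive stops/repositions, throughput collapses"

for rate in (150, 80, 20):
    print(f"{rate} MB/s source -> {streaming_status(rate)}")
```

The point of the sketch is that the "safe" window narrows with every generation: as the assumed maximum climbs, more real-world backup sources fall below the floor.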

Categories: Deduplication, Virtual Tape

When is a node not a node?

One of the things that irks me is when press/analysts/vendors compare a competitor’s solution to a one-node SEPATON solution. SEPATON’s VTL, as well as our DeltaStor deduplication and DeltaRemote replication products, relies on our DeltaScale™ architecture, which is designed around the concept of grid scalability. The grid allows us to scale dynamically and transparently across multiple independent nodes. This is very different from competing solutions that rely on a monolithic server approach.

Categories: Backup, Deduplication, Virtual Tape

War Stories: Diligent

As I have posted before, IBM/Diligent requires Fibre Channel drives due to the highly I/O intensive nature of their deduplication algorithm. I recently came across a situation that provides an interesting lesson and an important data point for anyone considering IBM/Diligent technology.

A customer was backing up about 25 TB nightly and was searching for a deduplication solution. Most vendors, including IBM/Diligent, initially specified systems in the 40 – 80 TB range using SATA disk drives.

Initial pricing from all vendors was around $500k. However, as discussions continued and final performance and capacity metrics were defined, the IBM/Diligent configuration changed dramatically. The system went from 64 TB to 400 TB, resulting in a price increase of over 2x and a capacity increase of over 6x. The added disk capacity was not due to increased storage requirements (none of the other vendors had changed their configs) but was due to performance requirements. In short, they could not deliver the required performance with 64 TB of SATA disk and were forced to include more.

The key takeaway is that if you are considering IBM/Diligent, you must be cognizant of disk configuration. The I/O-intensive nature of ProtecTIER means that it is highly sensitive to disk technology, and so Fibre Channel drives are the standard requirement for Diligent solutions. End users should always request Fibre Channel disk systems for the best performance, and SATA configurations must be scrutinized. Appliance-based solutions can help avoid this situation by providing known disk configurations and performance guarantees.

Categories: Backup, Deduplication, Restore, Uncategorized, Virtual Tape

SEPATON Performance — Again

Scott from EMC has challenged SEPATON’s advertised performance for backup, deduplication, and restore. As industry analyst W. Curtis Preston so succinctly put it, “do you really want to start a ‘we have better performance than you’ blog war with one of the products that has clustered dedupe?” However, I wanted to clarify the situation in this post.

Let me answer the questions specifically:

1. The performance data you refer to, linked three words into your post, is both four months old and not actually data at all.

SEPATON customers want to know how much data they can back up and deduplicate in a given day. That is what is important in real-life use of the product. The answer is 25 TB per day per node. If a customer has five nodes and a twenty-four hour day, that’s 125 TB of data backed up and deduplicated. This information has been true and accurate for four months and is still true today.
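The arithmetic above can be written down as a tiny model. The 25 TB/day/node figure is from the post; the linear scaling across nodes reflects the post's stated claim, and the backup-window parameter is my own addition for illustration.

```python
# Per-node daily throughput figure stated in the post.
TB_PER_NODE_PER_DAY = 25

def daily_capacity_tb(nodes: int, backup_window_hours: float = 24.0) -> float:
    """Aggregate backup + dedupe capacity, assuming linear scaling
    across nodes (the post's claim) and a configurable backup window."""
    return nodes * TB_PER_NODE_PER_DAY * (backup_window_hours / 24.0)

print(daily_capacity_tb(5))        # 5 nodes, 24-hour day -> 125.0 TB
print(daily_capacity_tb(5, 12.0))  # same grid, 12-hour window -> 62.5 TB
```

The window parameter matters in practice: few shops have a full twenty-four hours to run backups, so the usable daily figure is usually a fraction of the headline number.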

Categories: Deduplication, Virtual Tape

Customer perspectives on SEPATON, IBM and Data Domain

SEPATON issued a press release on Monday that is worth mentioning here on the blog. SearchStorage also published a related article here. The release highlights MultiCare, a SEPATON customer that uses DeltaStor deduplication software in a two-node VTL.

In the release, the customer characterizes their testing of solutions from Diligent/IBM (now the IBM TS7650G) and Data Domain. Specifically, they mention that the TS7650G was difficult to configure and get running, and that the gateway-head nature of the product also made it difficult for them to scale capacity. These difficulties illustrate the challenges of implementing the TS7650G’s head-only design. With this solution, the burden of integrating and managing the deduplication software and disk subsystem falls on the end user. Contrast this with a SEPATON appliance that manages the entire device in a fully integrated, completely automated fashion.

They had a typical Data Domain experience. That is, their initial purchase looked simple and cost effective but rapidly became complex and costly. In this case, MultiCare hit the Data Domain scalability wall, requiring them to purchase multiple separate units. The result is that MultiCare had to perform two costly upgrades and had to rip and replace their Data Domain solutions with newer, faster units. Scalability is the challenge with Data Domain solutions, and it is not uncommon for customers to purchase one unit to meet their initial needs and then be forced to add additional units or perform a forklift upgrade.

As MultiCare found, customers must thoroughly understand their requirements when considering deduplication solutions. They tested the head-only approach and found it too complex to operate and manage. They tried the small appliance approach and found that they outgrew their initial system and were forced to pursue costly upgrades. In the end, they recognized that the best solution for their environment was a highly scalable S2100-ES2, which provided performance and scalability that could not be achieved with either the TS7650G or Data Domain.

Categories: Deduplication, Virtual Tape

Falconstor, SIR and OEMs

This article highlights enhancements to FalconStor’s SIR deduplication platform, but I have to wonder whether anyone cares. FalconStor was a big player in providing VTL software to OEMs, but their deduplication software has been largely ignored.

FalconStor had their heyday in VTL. They aggressively pursued OEM deals with large vendors including EMC, IBM, and Sun. EMC was the most successful with their EDL family of products. As the market moved to deduplication, you would think that FalconStor would be the default OEM supplier of deduplication software as well. You would be wrong.

Ironically, FalconStor’s VTL success was their downfall in deduplication. Their OEMs realized that they were all selling the same VTL software and did not want to repeat the situation with deduplication. EMC and IBM have already announced that they are using alternative deduplication providers.

Categories: Deduplication, Virtual Tape

Choosing a Data Protection Solution in a Down Economy

I hate to turn on the TV these days because it is full of bad news. There always seems to be some pundit talking about troubles in the housing market, credit markets, automotive industry, consumer confidence, and so many other areas. It does not take a rocket scientist to recognize that the economy is in tough shape right now. As a reader of this blog, you are likely feeling some of the pain in your budget. This obviously brings up an important question: how do I justify IT purchases in this environment?

In situations like these, IT departments must go back to the basics. Purchases must be all about ROI. You must look beyond just acquisition cost and consider how a given solution can save your organization money both upon acquisition and into the future.

Categories: Deduplication, Virtual Tape

NetApp Dedupe: The Worst of Inline and Post-process Deduplication

NetApp finally entered the world of deduplication in data protection. While they have supported a flavor of the technology in their filers since May 2007, they have never launched the technology for their VTL. Why? Because their VTL does not use any of the core filer IP. It relies on an entirely separate software architecture that they acquired from Alacritus. Thus, none of the features of ONTAP apply to their VTL. However, I digress from the topic at hand.

I posted recently about three different approaches to deduplication timing: inline, post-process, and concurrent process. I talked about the benefits of each and highlighted the fact that post-process and concurrent process benefit from the fastest backup performance, since deduplication occurs outside of the primary data path, while inline benefits from the smallest possible disk footprint, since undeduplicated data is never written to disk. Now comes NetApp with a whole new take. Their model combines the worst of post-process and inline by requiring a disk holding area and reducing backup performance. After all this time developing the product, this is what they come up with? Hmmm, maybe they should stick to filers.
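The trade-off described above can be made concrete with a toy model. Everything here is hypothetical for illustration: the 25 TB backup size, the 400 MB/s native rate, and the 50% in-path penalty are assumptions, not measurements of any vendor's product. The model simply encodes the logic of the post: post-process/concurrent designs pay in landing disk, inline designs pay in ingest speed, and the hybrid criticized here pays both costs.

```python
# Toy model of deduplication timing trade-offs. All numbers are
# hypothetical assumptions chosen to illustrate the argument.

def staging_disk_tb(backup_tb: float, model: str) -> float:
    """Landing space needed before deduplicated data is stored."""
    if model == "inline":
        return 0.0            # dedupe in the data path: nothing lands raw
    if model in ("post-process", "concurrent", "hybrid"):
        return backup_tb      # a full raw copy lands first, then dedupes
    raise ValueError(f"unknown model: {model}")

def ingest_rate(native_rate_mbps: float, model: str,
                inline_penalty: float = 0.5) -> float:
    """Effective backup rate; in-path dedupe work slows ingest."""
    if model in ("post-process", "concurrent"):
        return native_rate_mbps               # dedupe outside the data path
    return native_rate_mbps * inline_penalty  # inline and hybrid pay here

for m in ("inline", "post-process", "hybrid"):
    print(f"{m}: staging={staging_disk_tb(25, m)} TB, "
          f"ingest={ingest_rate(400, m)} MB/s")
```

Under these assumptions, inline needs no staging but ingests at half speed, post-process ingests at full speed but needs a full landing area, and the hybrid gets the landing-area cost and the speed penalty at once, which is exactly the objection raised in the post.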

Categories: Restore, Virtual Tape

Data protection and natural disasters – Part 2

In part 1, I touched on four of the most common challenges with data restoration in a disaster scenario. In this post, I will review some other key considerations. These examples focus on the infrastructure required after a disaster has occurred.