Categories
Backup Deduplication

IBM Storage Announcement

As previously posted, I was confused about the muted launch of IBM’s XIV disk platform. Well, the formal launch finally occurred at the IBM Storage Symposium in Montpellier, France. Congratulations to IBM, although I am still scratching my head as to why they informally announced the product a month earlier!

Another part of the announcement was the TS7650G, which is Diligent’s software running on an IBM server. Surprisingly, there is not much new; it appears that they are banking on the IBM brand and sales force to jump-start Diligent’s sales. Judging by the lack of success in selling the TS75xx series, it will be interesting to see whether IBM will have any more success with this platform.

From a VTL perspective, IBM has backed themselves into a corner. Like EMC, they have a historic relationship with FalconStor but have chosen a different supplier for deduplication. This creates an interesting dilemma. Let’s look at the specs of their existing FalconStor-based VTL and the newly announced technology.

Categories
Backup Deduplication Restore Virtual Tape

Keeping it Factual

I periodically peruse the blogosphere looking for interesting articles on storage, data protection and deduplication. As you can imagine, blog content varies from highly product centric (usually from vendors) to product agnostic (usually from analysts). I recently ran across a post over at the Data Domain blog, Dedupe Matters. This is a corporate blog where it appears that the content is carefully crafted by the PR team and is updated infrequently. Personally, I find canned blogs like this boring. That said, I wanted to respond to a post entitled “Keeping it Real” by Brian Biles, VP of Product Management. As usual, I will be quoting the original article.

A year or more later, Data Domain is scaling as promised, but the bolt-ons are struggling to meet expectations in robustness and economic impact.

Categories
D2D Deduplication Virtual Tape

Analyst Commentary on VTL

I am often perusing industry-related sites to find what people are saying about disaster recovery and data protection. Most of these sites rely on independent contributors to provide the content. Given the myriad viewpoints and experience levels, it is not uncommon to see a wide range of commentaries, some consistent with industry trends and others not. I keep this in mind when reading these articles and generally ignore inconsistencies; however, once in a while an article is so egregiously wrong that I feel a response is necessary.

In this case, I am referring to an article appearing in eWeek where the author makes gross generalizations about VTL that are misleading at best. Let’s walk through his key points:

VTLs are complex

I completely disagree. The reason most people purchase VTLs is that they simplify data protection and can be implemented with almost no change in tape policies or procedures. This means that companies do not have to relearn their procedures after implementing a VTL, and thus the implementation is relatively simple, not complex as he suggests.

He also mentions that most VTLs use separate VTL software and storage. This is true for solutions from some of the big storage vendors, but is not the case with the SEPATON S2100-ES2. We manage the entire appliance including storage provisioning and performance management.

Finally, he complains about the complexity of configuring Fibre Channel (FC). While it is true that FC can be more complex than Ethernet, it really depends on how you configure the system. One option is to direct-connect the VTL, which avoids the FC complexities he harps on. He also glosses over the fact that FC is much faster than the alternatives, which is an important benefit. (My guess is that he is comparing FC to Ethernet, but he never clearly states this.)

Categories
D2D Deduplication Virtual Tape

Tape is not dead!

I am amazed when I hear some vendors aggressively promote the idea that tape is dead. It seems that hyping the demise of tape is in vogue these days, but the reality is quite different. Even so, there is no stopping them from sharing their message with anyone who will listen. If you ask large enterprises, many of them are looking at alternatives to tape, but telling them that tape is completely dead and that they should rip out all tape hardware is ludicrous. Ironically, this is the approach of some deduplication vendors. Jon Toigo states this succinctly in his blog.

The problem with tape is that it has become the whipping boy in many IT shops.
Courtesy: Drunken Data

The simple reality is that tape has been an important component of data protection for years and is likely to maintain a role far into the future. The reader should remember that in today’s highly regulated environments, companies often face strict requirements about data retention. For example, medical institutions can face some of the most stringent requirements:

HIPAA’s Privacy Rule, in effect since 2003 or 2004 depending on the size of the organization, requires confidentiality of patient records on paper and sets retention periods for some kinds of medical information, regardless of media. These retention requirements can stretch from birth to 21 years of age for pediatric records, or beyond the lifetime of the patient for other medical records.
Courtesy: Directory M

With this in mind, let’s look at the evolution of tape:

Categories
Deduplication Restore

Deduplication and restore performance redux

A week ago, I wrote an article highlighting how deduplication can impact restore performance and the difference between forward and reverse referencing. Many people are not familiar with these two deduplication technologies or their importance. SEPATON is the only vendor to implement forward referencing in a large-scale enterprise appliance, and it is important to understand why we did that.
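
To make the distinction concrete, here is a toy model of the two referencing schemes. The chunk data, function names, and generation layout are purely illustrative assumptions for this sketch, not any vendor’s actual on-disk format; the point is simply where the literal data lives and where the pointers point.

```python
# Toy model contrasting reverse- and forward-referencing deduplication.
# In both schemes a duplicate chunk is stored only once; the difference
# is which backup generation holds the literal data.

def reverse_reference_fetches(backups):
    """Reverse referencing: the newest backup points back at chunks
    first stored in older generations.  Returns how many chunks of the
    *latest* backup must be fetched from older generations -- a rough
    proxy for restore fragmentation, which grows over time."""
    first_seen = {}                      # chunk -> generation that stores it
    for gen, backup in enumerate(backups):
        for chunk in backup:
            first_seen.setdefault(chunk, gen)
    latest = len(backups) - 1
    return sum(1 for chunk in backups[latest] if first_seen[chunk] != latest)

def forward_reference_fetches(backups):
    """Forward referencing: the latest backup is always kept whole and
    *older* generations are rewritten as pointers into newer data, so
    restoring the latest backup touches no other generation."""
    return 0  # zero cross-generation fetches, by construction

# Three nightly fulls that are 80% identical:
gens = [
    ["a", "b", "c", "d", "e"],
    ["a", "b", "c", "d", "f"],
    ["a", "b", "c", "d", "g"],
]
print(reverse_reference_fetches(gens))  # 4 chunks of the newest full live in gen 0
print(forward_reference_fetches(gens))  # 0 -- the newest full restores contiguously
```

The usual restore target is the most recent backup, which is why the forward-referencing layout pays off: the common case never chases pointers across generations.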

Lauren Whitehouse from the Enterprise Strategy Group posted an article on a similar topic on Searchstorage.com on 8/11/08. It is gratifying to know that I am not the only one focused on the importance of deduplication and restore performance!

Categories
Deduplication Restore

Deduplication and restore performance

One of the hidden landmines of deduplication is its impact on restore performance. Most vendors gloss over this issue in their quest to sell bigger and faster systems. Credit goes to Scott from EMC who acknowledged that restore performance declines on deduplicated data in the DL3D. We have seen other similar solutions suffer restore performance degradation of greater than 60% over time. Remember, the whole point of backing up is to restore when/if necessary. If you are evaluating deduplication solutions, you must consider several questions.

  1. What are the implications of decreasing restore performance for your business?
  2. What is it about deduplication technology that hurts restore performance?
  3. Can you reduce the impact on restore performance?
  4. Is there a solution that does not have this limitation?

Categories
Backup Deduplication Restore

DL3D Discussion

There is an interesting discussion on The Backup Blog related to deduplication and EMC’s DL3D. The conversation relates to performance, and the two participants are W. Curtis Preston, author of the Mr. Backup Blog, and Scott from EMC, author of The Backup Blog. Here are some excerpts that I find particularly interesting, with my commentary included. (Note that I am directly quoting Scott below.)

VTL performance is 2,200 MB/sec native. We can actually do a fair bit better than that…. 1,600 MB/sec with hardware compression enabled (and most people do enable it for capacity benefits.)

The 2,200 MB/sec is not new; it is what EMC specifies on their datasheet. It is interesting, however, that performance declines with hardware compression; the hardware compression card must be a performance bottleneck. Is a 27% reduction in performance meaningful? It depends on the environment, but it is certainly worth noting, especially for datacenters where backup and restore performance are the primary concern.

Categories
Backup Deduplication

6 Reasons not to Deduplicate Data

Deduplication is a hot buzzword these days. I previously posted about how important it is to understand your business problems before evaluating data protection solutions. Here are six reasons why you might not want to deduplicate data.

1. Your data is highly regulated and/or frequently subpoenaed
The challenge with these types of data is the question of whether deduplicated data meets compliance requirements. Jon Toigo over at Drunken Data has numerous posts on this topic, including feedback from a corporate compliance user group. In short, the answer is that companies need to carefully review deduplication in the context of their regulatory requirements. The issue is not one of actual data loss, but the risk of someone challenging the validity of subpoenaed data that was stored on deduplicated disk. The defendant would then face the added burden of proving the validity of the deduplication algorithm. (Many large financial institutions have decided that they will never deduplicate certain data for this reason.)

2. You are deduplicating at the client level
Products like PureDisk from Symantec, Televaulting from Asigra or Avamar from EMC deduplicate data at the client level. With these solutions, the client bears the burden of deduplication and transfers only deduplicated (i.e., net-new) data across the LAN. The master server maintains a disk repository containing only deduplicated data. Trying to deduplicate the already deduplicated repository will not result in storage savings.
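
A minimal sketch of the idea follows. The class name, chunk size, and “send to server” mechanics are assumptions for illustration only, not any vendor’s actual protocol; the hash-index logic is what matters. Because the server-side index already contains the hash of every stored chunk, running the repository through deduplication a second time finds nothing new.

```python
# Sketch of client-side (source) deduplication: the client hashes each
# chunk locally and ships only chunks the master server does not hold.
import hashlib

class DedupeClient:
    def __init__(self, server_index):
        self.server_index = server_index  # chunk hashes the master server holds

    def backup(self, data, chunk_size=4096):
        """Return the number of bytes actually sent over the LAN."""
        sent = 0
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.server_index:
                self.server_index.add(digest)  # "transfer" the net-new chunk
                sent += len(chunk)
        return sent

index = set()
client = DedupeClient(index)
night1 = client.backup(b"A" * 8192 + b"B" * 4096)  # first full backup
night2 = client.backup(b"A" * 8192 + b"C" * 4096)  # mostly unchanged data
print(night1, night2)  # 8192 4096 -- night two ships only the changed chunk
```

Note that even the first backup dedupes against itself (the two identical “A” chunks are sent once), which is why only 8,192 of the 12,288 bytes cross the LAN on night one.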

Categories
Backup Deduplication Restore Virtual Tape

DeltaStor Deduplication, cont….

Scott from EMC and author of the backup blog responded to my previous post on DeltaStor. First, thank you for welcoming me to the world of blogdom. This blog is brand new and it is always interesting to engage in educated debate.

I do not want this to go down a “mine is better than yours” route. That just becomes annoying and can lead to a fight that benefits no one. I am particularly concerned since Scott, judging by his picture on EMC’s site, looks much tougher than me! 🙂

The discussion really came down to a few points. For the sake of simplicity I will quote him directly.

So, putting hyperbole aside, the support situation (and just as importantly, the mandate to test every one of those configurations) is a pretty heavy burden.

DeltaStor takes a different approach to deduplication than hash-based solutions like EMC/Quantum and Data Domain. It requires SEPATON to do some additional testing for different applications and modules. The real question under debate is how much additional work. In his first post, Scott characterized this as being entirely unmanageable. (My words, not his.) I continue to disagree with this assessment. Like most things, the Pareto Principle (otherwise known as the 80/20 rule) applies here. Will we support every possible combination of every application? Maybe not. Will we support the applications and environments that our customers and prospects use? Absolutely.

Categories
Backup Deduplication Restore

Deduplication, do I really need it?

I’m always puzzled when a customer tells me that they “need deduplication to run my backups better.” This drives me nuts. Deduplication in and of itself doesn’t make your backups run better. In fact, some technologies make backups and restores take a lot longer. These customers aren’t really thinking about the root causes of their problems. They are like patients who see an ad for a prescription medication on TV and decide they need some. When the doctor asks why, the response is “because the ad sounded like something that will make me feel better.” Sounds ludicrous, doesn’t it? Well, that is no different from the dedupe statement above.

The simple reality is that, like the prescription drugs on TV, dedupe is not a silver bullet. It solves specific problems, such as extending retention and reducing $/GB. If those are your problems, then by all means look at dedupe. But please, understand the problem you are trying to solve. Trust me, you don’t want to be the guy taking the drugs just because they sound good on TV.