Categories
Deduplication General Marketing

Exchange deduplication ratio guarantee

Scott over at EMC recently posted his thoughts about deduplication ratios and how they vary widely. I agree with his assessment that compression ratios, change rates and retention are key ingredients in deduplication ratios. However, he makes a global statement, “If you don’t know those three things, you simply cannot state a deduplication ratio with any level of honesty….It is impossible”, and uses this point to suggest that SEPATON’s Exchange guarantee program is “ridiculous”. Obviously the blogger, being an EMC employee, brings his own perspectives as do I, a SEPATON employee. Let’s dig into this a bit more.

As the original author mentioned, the key metrics for deduplication include compression, change rate and retention. Clearly these can vary by data types; however, certain data types provide more consistent deduplication results. As you can imagine, these are applications that are backed up fully every night, have fixed data structures and relatively low data change rates. Some examples include Exchange, Oracle, VMware and others.
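To make the interplay of these three variables concrete, here is a minimal sketch in Python. It is a back-of-the-envelope model of my own for illustration, not any vendor's sizing formula. It assumes a full backup is taken every night, that a fixed fraction of the data (`daily_change_rate`) is new in each subsequent full, and that compression applies uniformly to whatever unique data remains. The function and parameter names are hypothetical.

```python
def estimated_dedup_ratio(compression_ratio, daily_change_rate, retained_fulls):
    """Estimate logical-to-physical ratio for nightly full backups.

    Assumptions (illustrative only):
      - every retained backup is a full of the same size
      - each full after the first contributes `daily_change_rate` new data
      - compression reduces the unique data by `compression_ratio` (2.0 = 2:1)
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    if not 0 <= daily_change_rate <= 1:
        raise ValueError("daily_change_rate must be between 0 and 1")
    if retained_fulls < 1:
        raise ValueError("retained_fulls must be at least 1")

    # Work in units of "one full backup" so the backup size cancels out.
    logical = retained_fulls
    unique = 1 + (retained_fulls - 1) * daily_change_rate
    physical = unique / compression_ratio
    return logical / physical


if __name__ == "__main__":
    # Compare a low and a higher change rate at the same compression and retention.
    for change_rate in (0.01, 0.05):
        ratio = estimated_dedup_ratio(2.0, change_rate, 30)
        print(f"change rate {change_rate:.0%}: about {ratio:.1f}:1")
```

Under this model, the ratio climbs with retention and compression and falls as the change rate rises, which is why applications with nightly fulls and low change rates tend to produce more predictable results.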

Categories
Deduplication Restore

The hidden cost of deduplicated replication

On the surface, the idea of deduplicated replication is compelling. By replicating only deltas, the technology dramatically reduces the bandwidth required to send data across a WAN. Many customers are looking to this technology to allow them to move to a tapeless environment in the future. However, there is a major challenge that most vendors gloss over.

The most common approach to deduplication in use today is hash-based technology which uses reverse referencing. I covered the implications of this approach in another post. To summarize, the issue is that restore performance is impacted as data is retained in a reverse referenced environment. Now let’s look at how this impacts deduplicated replication.

Categories
Restore Virtual Tape

Data protection and natural disasters – Part 2

In part 1, I touched on four of the most common challenges with data restoration in a disaster scenario. In this post, I will review some other key considerations. These examples focus on the infrastructure required after a disaster has occurred.

Categories
Backup Restore

Data protection and natural disasters – Part 1

Hurricane Ike has been in the news lately and my sympathy goes out to all those affected. It is events like these that test IT resiliency. The damage can range from slight to severe and we invest in reliable and robust data protection processes to protect from disasters like this. The unfortunate reality is that, no matter how much you plan for it, the recovery process often takes longer and is more difficult than expected.

In many respects, data protection is an insurance policy. You hate to pay your homeowner’s premium every month, but you do it because you know that it is your only protection if major damage ever happens to your house. In the case of data protection, you invest hours managing your backup environment to enable recovery from incidents like this. The unfortunate reality is that even with the best planning and policies, things still may not turn out as expected. Four of the most common pitfalls I hear from customers include:

Categories
Deduplication

A little bit off topic – deduplication and primary storage

I am digressing slightly from my usual data protection focus, but I found a recent announcement from Riverbed very interesting. They are developing a deduplication solution for primary storage. As an employee of a vendor of deduplication solutions, I wanted to provide commentary.

First, some background: Riverbed makes a family of WAN acceleration appliances that reduce the amount of traffic sent over a WAN using their proprietary compression and deduplication algorithms. SEPATON is a Riverbed partner and our Site2 software has been certified with their Steelhead platform. (A bit of disclosure here: I have worked with many people from Riverbed in the past, including the VP of Marketing.)

Riverbed’s announcement is summarized in posts on ByteandSwitch and The Register. In short, they are developing a deduplication solution for primary storage. It will incorporate their existing Steelhead WAN accelerators and another appliance, code-named “Atlas”, which will contain the deduplication metadata. (The Steelhead platform has a small amount of storage for deduplication metadata since little is needed when accelerating WAN traffic. The Atlas provides the metadata storage space required for deduplicating larger amounts of data and additional functionality.) A customer would place the Steelhead/Atlas appliance combination in front of primary storage, and these devices would deduplicate data as it is written to the storage platform and reassemble it as it is read back. This is an interesting approach and brings up a number of questions:

Categories
Backup D2D Restore Virtual Tape

InformationWeek on NEC HYDRAstor

Howard Marks recently posted an interesting article about NEC’s HYDRAstor over on his blog at InformationWeek. He discusses the product and how the device is targeted at backup and archiving applications. He makes some interesting points and mentions SEPATON. I wanted to respond to some of the points he raised.

…[the system starts with] a 1-accelerator node – 2-storage node system at $180,000…

Categories
Backup Deduplication

IBM Storage Announcement

As previously posted, I was confused about the muted launch of IBM’s XIV disk platform. Well, the formal launch finally occurred at the IBM Storage Symposium in Montpellier, France. Congratulations to IBM, although I am still left scratching my head over why they informally announced the product a month ago!

Another part of the announcement was the TS7650G which is Diligent’s software running on an IBM server. Surprisingly, there is not much new; it appears that they are banking on the IBM brand and salesforce to jumpstart Diligent’s sales. Judging by the lack of success in selling the TS75xx series, it will be interesting to see whether they will have any more success with this platform.

From a VTL perspective, IBM has backed themselves into a corner. Like EMC, they have a historic relationship with FalconStor and have chosen a different supplier for deduplication. This creates an interesting dichotomy. Let’s look at the specs of their existing FalconStor-based VTL and newly announced technology.

Categories
Backup Deduplication Restore Virtual Tape

Keeping it Factual

I periodically peruse the blogosphere looking for interesting articles on storage, data protection and deduplication. As you can imagine, blog content varies from highly product centric (usually from vendors) to product agnostic (usually from analysts). I recently ran across a post over at the Data Domain blog, Dedupe Matters. This is a corporate blog where it appears that the content is carefully crafted by the PR team and is updated infrequently. Personally, I find canned blogs like this boring. That said, I wanted to respond to a post entitled “Keeping it Real” by Brian Biles, VP of Product Management. As usual, I will be quoting the original article.

A year or more later, Data Domain is scaling as promised, but the bolt-ons are struggling to meet expectations in robustness and economic impact.

Categories
Marketing

Industry analysts and conflicts

Just last week I posted commentary on an analyst’s article on eWeek. Ironically, there is currently a hot discussion going on over at ByteandSwitch on another article from the same analyst. (I am purposely not linking to the article; if you want to read it, visit B&S and look for Data De-Dupe Guide 2.) In this case, the discussion revolves around the analyst’s objectivity. The reality is that analysts are paid to write vendor-centric papers all the time, which is not problematic as long as the articles are identified as such. The issue here is that the article on ByteandSwitch is vendor-centric, yet the author is positioning the content as vendor-agnostic. The author further compounds the problem by incorrectly summarizing the available capabilities of shipping deduplication solutions. In his Mr. Backup blog, W. Curtis Preston writes about some of the errors.