Backup Restore

Protecting personal data

This blog primarily focuses on protecting corporate data, but I recently received a call from my father that reminded me how critical it is to protect personal data. My father called expressing frustration that his laptop hard drive had failed and corrupted his data. Fortunately, he had backup copies of his most critical files on a USB stick; however, his email history and address book were not stored on the external device and were lost. I mention this story to remind you of the importance of personal data protection. What are you doing to back up your data?

There are many different approaches to protecting personal data. The two key concerns to consider are:

  1. What happens if I lose the hard drive where my data is stored or experience a software problem such as a virus?
  2. What happens if I suffer a more extreme data loss such as my house burning down?

Each question is critical, and the answer will vary depending on the data. For example, digital pictures of your family might have a different priority than your MP3 library. The former are irreplaceable; the latter is not. These priorities will affect the choice of data protection medium and methodology.

Backup Physical Tape Restore

Tale of the Tape: Musings on IBM’s 35TB Tape Announcement

A recent tweet by Chris Mellor from The Register caught my eye. He highlighted IBM’s recent development of a 35TB tape. Here are four articles on the topic:


FUJIFILM Announcement

The Register Article

A blog post by Robin Harris at ZDnet

My thoughts

It is interesting to see IBM/Fuji driving tape development. With this announcement they have increased native tape capacity over 21x from LTO-5, the newest LTO offering. The dramatic density improvement will drive a continued decrease in specification-based $/GB. However, it also raises some new questions:


Backup Restore

Lessons from the Sidekick debacle

The latest scary backup story comes from a firm called Danger that makes the Sidekick PDA/phone. The Sidekick stores the majority of its data in a central data center, and the data is loaded each time the phone is restarted. The idea is that the data center provides protection if you lose your phone. A good idea, right? Well yes, assuming that Danger adequately protects its customers’ data.

A number of outlets are reporting that Danger suffered a catastrophic data loss and all users’ data has been lost. I checked with a family friend who confirmed that her Sidekick was down for a week and is now finally working as a phone, but her data is inaccessible. This is unacceptable; Sidekick users paid a monthly fee for this service, and Danger should have maintained reasonable precautions to protect its customers’ data. Clearly this is a bad situation for everyone, and there are lessons to be learned by all.

Here are some key takeaways from this event.

Backup D2D Deduplication Restore Virtual Tape

Streaming LTO-5

Chris Mellor (twitter:@Chris_Mellor) recently posted an article over at The Register about LTO-5 entitled “Is LTO-5 the last hurrah for tape?” He makes an interesting point about the future of LTO and whether LTO-5 will be the last generation of the technology. Most of the comments on the article disagree with Chris’s opinion.

I believe that there is another major issue with LTO-5 that must be addressed. The challenge with LTO (and most other tape technologies) is its limited ability to throttle performance. Users must carefully manage their environment to ensure that they stream their drives, or else backup performance will decline dramatically. As drives become faster, the challenge of optimizing your environment for the technology becomes more difficult. You can read more about this in my blog post entitled “The Fallacy of Faster Tape.”

Deduplication Restore

CommVault and Forward Referencing

I was recently reading this document from CommVault that highlights their deduplication technology and was surprised by their use of the term “forward referencing”. Forward referencing is a common term in deduplication with a generally agreed-upon definition. CommVault appears to have redefined the term and promoted their version as a feature. This is confusing and possibly misleading because a reader might not realize that the definition of “forward referencing” in this document is completely different from the one used everywhere else in the industry.

Deduplication Restore

Defragmentation, rehydration and deduplication

W. Curtis Preston recently blogged about The Rehydration Myth. In his post he discusses how restore performance on deduplicated data declines because of the method used to reassemble the fragmented deduplicated data on disk. He also addresses the ways various technologies attempt to overcome these issues, including disk caching, forward referencing (used by SEPATON’s DeltaStor technology) and built-in defrag. In this post I want to discuss the last option because it is a widely used approach for inline deduplication that has some little-known pitfalls.

Backup Deduplication Restore

SEPATON Versus Data Domain

One of the questions I often get asked is “how do your products compare to Data Domain’s?” In my opinion, we really don’t compare because we play in different market segments. Data Domain’s strength is in the low end of the market (think SMB/SME), while SEPATON plays in the enterprise segment. These two segments have very different needs, which are reflected in the fundamentally different architectures of the SEPATON and Data Domain products. Here are some of the key differences to consider.

Backup Deduplication Restore

W. Curtis Preston on physical tape

W. Curtis Preston recently wrote an article on the state of physical tape for SearchDataBackup. He talks about the technologies that backup software vendors have created to stream tape drives more effectively. As I posted before, if you cannot stream your tape drives, their performance will decline dramatically.

In enterprise environments, performance is the key driver of data protection. You must ensure that you can back up and recover massive amounts of data in prescribed windows, and tape’s inconsistent performance and complex manageability make it difficult to use as a primary backup target. This fact can also make tape a challenging solution in small environments.

The problem with tape drive streaming is a common one and Preston agrees that it is the key reason for adopting disk-based backup technologies. Our customers typically see a dramatic improvement in performance with SEPATON’s VTL solutions since they are no longer limited by the streaming requirements of tape.

Even with new disk and deduplication technologies, most customers are still using tape today and will do so into the future. However, tape will likely be used more for archiving than for secondary storage.  Deduplication enables longer retention, but most customers are probably not going to retain more than a year online. Tape is a good medium for deep archive where you store data for years, but is complex and costly as a target for enterprise backup.

Deduplication Restore

Restore Performance

Scott from EMC posted about the EMC DL3D 4000 today. He was responding to some questions by W. Curtis Preston regarding the product and GA. I am not going to go into detail about the post, but wanted to clarify one point. He says:

Restores from this [undeduplicated data] pool can be accomplished at up to 1,600 MB/s. Far faster than pretty much any other solution available today, from anybody. At 6 TB an hour, that is certainly much faster than any deduplication solution.
(Text in brackets added by me for clarification)

As recently discussed in this post, SEPATON restores data at up to 3,000 MB/sec (11.0 TB/hr) both with deduplication enabled and disabled. Scott insinuates that only EMC is capable of the performance he mentions and I wanted to clarify for the record that SEPATON is almost twice as fast as the fastest EMC system.
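
For readers who want to check the unit conversion behind these numbers, here is a minimal Python sketch. It assumes decimal units (1 TB = 1,000,000 MB), and the function and variable names are purely illustrative; the throughput figures are the ones quoted above.

```python
def mb_per_sec_to_tb_per_hr(mb_per_sec: float) -> float:
    """Convert MB/s to TB/hr, assuming decimal units (1 TB = 1,000,000 MB)."""
    seconds_per_hour = 3600
    mb_per_tb = 1_000_000
    return mb_per_sec * seconds_per_hour / mb_per_tb


# Throughput figures quoted in the post.
emc_mb_per_sec = 1600
sepaton_mb_per_sec = 3000

emc_tb_per_hr = mb_per_sec_to_tb_per_hr(emc_mb_per_sec)
sepaton_tb_per_hr = mb_per_sec_to_tb_per_hr(sepaton_mb_per_sec)
speed_ratio = sepaton_mb_per_sec / emc_mb_per_sec

print(f"1,600 MB/s = {emc_tb_per_hr:.2f} TB/hr")
print(f"3,000 MB/s = {sepaton_tb_per_hr:.2f} TB/hr")
print(f"ratio = {speed_ratio:.3f}x")
```

Under that assumption the two rates come to about 5.76 and 10.8 TB/hr, in line with the rounded 6 and 11 TB/hr figures quoted above, and the ratio of 1.875 is consistent with "almost twice as fast."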

Backup Deduplication Restore Uncategorized Virtual Tape

SEPATON Performance — Again

Scott from EMC has challenged SEPATON’s advertised performance for backup, deduplication, and restore. As industry analyst W. Curtis Preston so succinctly put it, “do you really want to start a ‘we have better performance than you’ blog war with one of the products that has clustered dedupe?” However, I wanted to clarify the situation in this post.

Let me answer the questions specifically:

1. The performance data you refer to with the link in his post three words in is both four months old, and actually no data at all.

SEPATON customers want to know how much data they can back up and deduplicate in a given day. That is what matters in real-life usage of the product. The answer is 25 TB per day per node. If a customer has five nodes and a twenty-four-hour day, that’s 125 TB of data backed up and deduplicated. This information has been true and accurate for four months and is still true today.
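
The arithmetic is simple enough to sketch in a few lines of Python. The 25 TB per node per day figure is the one above; the function names and the assumption that capacity scales linearly with node count are illustrative only, and the sustained-rate conversion assumes decimal units (1 TB = 1,000,000 MB).

```python
TB_PER_NODE_PER_DAY = 25  # figure quoted in the post


def daily_capacity_tb(nodes: int) -> int:
    """Daily backup-and-deduplicate capacity, assuming linear scaling by node."""
    return TB_PER_NODE_PER_DAY * nodes


def sustained_mb_per_sec(tb_per_day: float) -> float:
    """Average rate needed to move tb_per_day in 24 hours (decimal units)."""
    seconds_per_day = 24 * 3600
    return tb_per_day * 1_000_000 / seconds_per_day


five_node_total = daily_capacity_tb(5)
per_node_rate = sustained_mb_per_sec(TB_PER_NODE_PER_DAY)

print(f"5 nodes: {five_node_total} TB/day")
print(f"per node: about {per_node_rate:.0f} MB/s sustained")
```

For five nodes this reproduces the 125 TB mentioned above, which works out to an average of roughly 289 MB/s per node sustained over a full day.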