Spectra Logic Backup and Recover Blog

2011 Industry Observations and Trends

We’ve all been privy to the countless articles hitting every technology journal around the world with predictions, forecasts, or trends for the upcoming year.  This is no different from any other year, and is virtually an industry ritual that sets the tone for the upcoming year. Amongst these trends, some were obvious, some were reasonable, and one in particular seemed to catch many people off guard: Tape is back, or more precisely, it never went anywhere.

From CNN’s shock at discovering that Google (GOOG) still uses tape backups as their final tier*, to Oracle announcing their latest 5TB tape technology, tape has prevailed as one of the busier talking points so far this year. Obviously for Spectra Logic, this is neither a problem nor a surprise, but many people are probably asking themselves, "Why? Why now?" or even "I thought tape was dead?" In order to understand this trend, let’s take a look at the other major trends forecasted for 2011: cloud, storage virtualization, acquisitions, and overall cost reduction.

As far as cloud is concerned, whether it be a private, hybrid, or public cloud, tape is a logical tier for any hosting infrastructure. It is the strong silent partner, if you will, for two primary reasons: tape continues to be the most cost-effective format to store data on, and tape provides an offline copy of the data for added security. My college computer security professor used to refuse to plug his computer into the internet on the principle that nothing online is ever 100% secure. Unfortunately, in the era of viruses, worms, malicious attacks, and even software glitches, bugs and data corruption, this sentiment is all too true. I bet the Parish of Orleans Civil District Court will take a much closer look at their cloud service provider’s storage method moving forward after losing large amounts of data due to simultaneous disk crashes in a tapeless environment. Thankfully for the Court, they still have paper records to retrieve from**.

Our second case is that of storage virtualization. In 2010, the Active Archive Alliance was formed by Spectra Logic, FileTek, QStar, SGI, and Compellent with the intention of educating and promoting the concept of active archiving, or extending a file system across multiple storage devices in a virtualized storage pool. With server virtualization dominating the market in 2010, it only makes sense that the virtualization trend would continue throughout the storage infrastructure. The Alliance, however, was not alone in its efforts to reintroduce the concept of seamlessly tiered storage. With data volumes growing into the exabytes and floor space, power and cooling costs increasing, tape is the ideal resting point for generally inactive data. Even in the era of deduplication, MAID, thin provisioning, and other power-saving technologies, tape continues to lead the charge for power efficiency and storage density. Why? Tape is designed to be stored offline, where it consumes no power. Additionally, IBM and FUJIFILM have proven that tape is far from reaching its physical limitations for storage density with their 35TB prototype tape***.

Ultimately, our final two trends explain tape’s role in the first two. Oracle now has an investment in tape technology through its acquisition of Sun and, with it, StorageTek’s tape technologies. IBM, HP, Dell, and Quantum have similarly invested in tape technology. Additionally, many of the loudest voices against the tape market have been acquired, some by companies with tape interests, leaving only one large player still beating the "Tape is Dead" drum: EMC. So ask yourself... why would a marketing powerhouse spend such energy on anti-tape promotions if tape weren’t a threat on its radar?

These acquisitions have opened the airwaves for the pro-tape messaging to once again make its way into everyday dialog. Why? Because, like our final trend, it's about overall cost reduction. With tape remaining the leader in both low-cost capital expense and low-cost operational expense storage, and the integration of other technologies with tape, it is once again being discussed as a viable, valuable tier within any datacenter design. 

*http://tech.fortune.cnn.com/2011/02/28/google-goes-to-the-tape-to-get-lost-emails-back/

**http://itknowledgeexchange.techtarget.com/storage-soup/i365-involved-in-new-orleans-backup-failure/

***http://www.engadget.com/2010/01/23/ibm-and-fujifilm-develop-35tb-magnetic-tape-cartridges-unveil-i/

Odd to Outrageous

First, the odd.  The enormous ball of tape (blessed by the Guinness Book of World Records, a marvelous device with which to level your desk) that EMC stuck in a London hotel lobby is the biggest ball of tape ever (2 meters across).

Why on earth did EMC create this? The tape ball is already denser than any of their disk.  Admitted to be 1.8 PB native capacity, it comes to 3.6 PB compressed. Alternately, using another statement associated with the same source, the tape for the ball came from 6,000 tape cartridges which, if one assumes them to be LTO5, is 9 PB native/18 PB compressed.

Then there was the motorcycle jump.  The bike and rider jumped over 40 Symmetrix racks totaling 8PB and a distance of 20 meters which is the equivalent of ten tape balls (18 - 180 PB).  Thanks, EMC. You’ve just proven that tape is 2.25 to 22.5 times as dense as disk.
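For readers who want to check the arithmetic behind the joke, here is a minimal Python sketch using only the figures quoted above. The constant and function names are made up for illustration, and the high-end figure rests on the post's own assumption that the 6,000 cartridges were LTO5.

```python
# Figures quoted in the post. Names are illustrative only.
BALL_LOW_PB = 1.8        # stated native capacity of one tape ball
BALL_HIGH_PB = 18.0      # 6,000 cartridges, assumed to be LTO5, compressed
BALL_DIAMETER_M = 2.0    # width of the tape ball
JUMP_DISTANCE_M = 20.0   # length of the motorcycle jump
DISK_PB = 8.0            # capacity of the 40 Symmetrix racks


def density_advantage():
    """Return (tape balls that fit in the jump, low ratio, high ratio)."""
    balls = JUMP_DISTANCE_M / BALL_DIAMETER_M
    low = balls * BALL_LOW_PB / DISK_PB     # tape PB per disk PB, low estimate
    high = balls * BALL_HIGH_PB / DISK_PB   # tape PB per disk PB, high estimate
    return balls, low, high


if __name__ == "__main__":
    balls, low, high = density_advantage()
    print(f"{balls:.0f} tape balls; tape holds {low:.2f}x to {high:.1f}x the disk capacity")
```

The low end pairs ten balls with the 1.8 PB native figure, and the high end pairs them with the 18 PB compressed figure, which is where the 18 - 180 PB range comes from.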

Time-out for a history lesson: a brief look back shows that EMC’s CEO Tucci stated, “Backup to and recovery from tape is dead” as recently as November 2009. So it must be big news to Tucci that all this time, an EMC VP knew about tape’s sneaky habits. Shane Jackson, VP of marketing for EMC's backup and recovery services division, said, "Tape's had a hideout where customers are using it for long-term data retention for data three to seven years old and beyond." Stop the presses. You mean tape isn’t dead?

And finally, what about the outrageous claim that EMC’s new product is “the industry’s first long term retention system for backup and archive"? As analyst Curtis Preston blogged in his response to the announcement, “Huh?  Isn't tape a unified long term retention system for backup and archive?” Thank you, Curtis. Well said.

50 TB per Tape -- Imagine Disk in the Future

Dear Ms. Meade:

Did you read about the recent breakthrough[1] in tape technology, up to 50 TB per LTO tape? I was also “excited about the recently announced 3TB Seagate hard drive, but if tape devices can go to 50TB imagine what kind of hard drives we'll have in a few years time.”[2]

Signed,

Seeing Only Disk


Dear Seeing Only Disk,

You might want to have your eyes checked. The breakthrough is in TAPE, not DISK. I find it so interesting that readers see the word “tape” and say the word “disk.” My, but the big advertising bucks spent by the Three-Letter-Disk vendor continue to pay off. Disk vendors have managed to obscure advancements in tape and instead bring to mind advancements in disk. Your disk dollars are hard at work with conflation in mind.

It is true that the 50 TB tape uses a perpendicular magnetic recording technology as does disk. In fact, tape and its automation are a lot like disk in other ways. For example, did you know that with Spectra tape libraries, you can invest in a global spare, much like you can have a stand-by in case of a RAID disk failure? And just as disk does several levels of data verification, did you know that tape drives perform a read-after-write and that Spectra libraries verify that data on tape can be retrieved through its PostScan™ feature?

So perhaps disk and tape parallels do hold true to some extent—except for the facts that

  1. Tape doesn’t consume energy just to maintain data, as does disk
  2. Tape media has an archival life of 30 years at a minimum, assuming decent conditions and the availability of a drive to read the data (not that big of a deal, if you store a few components along with the off-site tapes), while disk is typically used for five years or fewer; and
  3. Tape costs less than disk, any way you look at it.

So aside from tape’s higher data density, longer shelf life, portability for disaster recovery, lower purchase price, ease-of-use through a file-system instead of data written in specific backup formats, and greater return on investment, disk and tape ARE a lot alike.

Or not.

Sincerely, and sincerely astounded at your reading of the letters T A P E as disk,

Ms. Meade E. Ahmogle



[1] “50 TB Per Tape Cartridge,” PhysOrg.com, May 19, 2010.

[2] Wilson, Dean. “Hitachi Maxell announce 50TB tape drive,” TechEye.net, May 20, 2010. http://www.techeye.net/hardware/hitachi-maxell-announce-50tb-tape-drive.

Deduplication: A Personality Analysis

I just finished re-reading Keith Schultz’s recent article, which is a test lab review of three different deduplication appliances on the market today.  Spectra nTier Deduplication was one of the three. I always like to read reviews about my products, especially when they are positive.  If you have not read it, I don’t want to spoil the surprise, but will tell you that InfoWorld found that all the systems……. worked.  As much as I wish my deduplication appliance was the only one that worked, I must remain honest.  Most, if not all, mainstream deduplication solutions carry out their primary goal:  reducing your data.   

 

Keith did learn that the systems had different personalities, and in my opinion, almost all of the deduplication systems on the market have a unique personality. Spectra nTier Deduplication is VTL based, and well suited to environments with tape-based backup and archive systems. VTL not only fits well with tape-based environments, it also offers high performance. Additionally, nTier’s ability to send block data over Fibre Channel offers significant performance advantages over NAS.

 

Personality can open doors, but only character can keep them open. Our nTier definitely has character, and its personality is all about both high performance and tight tape integration for backup and archive.

 

Each deduplication solution has its own distinct personality, and no single product can do everything. If you choose a dedupe appliance with a NAS interface, you may sacrifice performance in exchange for the ability to host some production data. For some customers, this will be a good trade; for others, it will not be acceptable or provide efficient backup and archive. nTier Deduplication is designed for backup and archive data, and does a great job with that workflow, but was not designed to be used as a primary storage device.

 

What should an organization consider when looking at a deduplication solution? First, they need to know how they will use it and what is important. Are backup software integration and performance more critical than other factors? Once the organization has a good feel for the personality of the device they need, it should be easy to reduce the number of systems to consider to a “short list” of two or three solutions. After that, learn more about each one and, if need be, test them before purchasing.

 

Follow me at www.Twitter.com/3pedal

  

More Storage Space, Please

I am amazed at how easy it is for people to consume all of their available storage space—and how quickly we can do it.  Just looking at my own storage situation, it is remarkable how much “stuff” I have to store; a snow blower, coolers, lots of car parts, bike parts, camping gear and who knows what else.   My old storage shed was recently hit by a tree, so I had to replace it this month.  I purchased the biggest one allowed by city zoning regulations, and already…. I realize I should have gone bigger. 

 
Storing data can be pretty similar. No matter how much space you have, the data manages to grow to fill it. Sometimes, the data starts to outgrow the new system before it is even installed. The growth can seem exponential when you are managing data backup and archive processes, where 1 TB of primary data can grow to 20 TB of backup data; thus, the value of deduplication. Why do we have all of this data to begin with, and what should we do with it long term? This might be the topic for a future blog post, but today, let’s focus on knowing we need space to store it and how to address that need with disk. Data center space constraints and a focus on cost savings cause storage admins to seek efficient, high density backup and archive disk solutions. Spectra Logic’s new higher capacity nTier 700 disk appliance delivers on these requirements.
 
Most disk arrays have some ability to grow and expand, but at the hidden cost of more rack space. The Spectra nTier 700 can grow to 60 disks in a single 4U enclosure. With Spectra Logic’s recent announcement of 2 TB drives in the nTier 700, that totals 120 TB of capacity and 16GB of memory in a 4U chassis, for as low as $1.00 per GB.
 
If you are up against physical size and/or budgetary limitations in your data center (like I am in my back yard), Spectra Logic can help you scale to fit your needs.
 

Backup, Archive, HSM - What's the Difference Anyway?

Part One of Two

One of the interesting things I have discovered since I have been talking with so many HPC customers is that the term “backup” is seldom used. You might ask: if they aren’t doing traditional backups, why would we, a backup solutions provider, want to talk to them? Well, first you need to fully understand the difference between backup and archive. Archive is a word you will hear more often in the HPC and M&E environments, especially if there is data in excess of the petabyte range and large files that aren’t accessed frequently but need to be kept indefinitely.

In this blog, which is the first of a two-part series, I will provide some fundamental information that can help you differentiate backup from archive. In the subsequent blog, part two, we will peel the covers back on a process that is different from both backup and archive and similar to traditional HSM (Hierarchical Storage Management). This information will prove valuable for those HPC or other data-intensive customers who may claim that they don’t do backups. Stay tuned for more on this subject later.

The differences between backup and archive:

Backup: simply refers to creating a copy of data and storing it somewhere for restoration in the event that the original version of the data is compromised in some way. We evangelize the concept of backups because we know, and most customers realize, that data can be accidentally deleted, corruption can occur, or, even worse, a natural disaster could wipe out the entire data center.

Backup is simply safeguarding or protecting the data that is being used by duplicating that data. This is usually done in a rotating cycle or through schedules such as: daily incrementals kept for seven days, a weekly full kept for a month, a monthly full kept for a year and a yearly full kept for seven years. Although this process has proven effective and most of the backup applications on the market today are ideal for doing this, problems occur when you start having multiple copies of the same data consuming a lot more hardware than necessary, not to mention the associated costs of running and managing that hardware.
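To make the rotation concrete, here is a minimal Python sketch of that kind of schedule. It is an illustration, not how any particular backup application works. The choice of Sunday for weekly fulls, the first of the month for monthly fulls, January 1 for yearly fulls, and the 30-day and 365-day approximations of "a month" and "a year" are all assumptions, as are the function names.

```python
from datetime import date, timedelta

# Retention periods from the rotation described above, in days.
# 30 and 365 are approximations for "a month" and "a year".
RETENTION_DAYS = {
    "daily_incremental": 7,
    "weekly_full": 30,
    "monthly_full": 365,
    "yearly_full": 7 * 365,
}


def backup_type(day: date) -> str:
    """Pick the backup type for a given day (assumed calendar rules)."""
    if day.month == 1 and day.day == 1:
        return "yearly_full"
    if day.day == 1:
        return "monthly_full"
    if day.weekday() == 6:  # Sunday
        return "weekly_full"
    return "daily_incremental"


def expires_on(day: date) -> date:
    """Date on which the backup taken on `day` leaves the rotation."""
    return day + timedelta(days=RETENTION_DAYS[backup_type(day)])


if __name__ == "__main__":
    for d in (date(2010, 1, 1), date(2010, 3, 1), date(2010, 3, 7), date(2010, 3, 9)):
        print(d, backup_type(d), "kept until", expires_on(d))
```

Every full in this cycle carries another complete copy of the same data, which is the duplication problem the paragraph above points to.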

With backup – think business continuity

One of the key differences when comparing backup strategies to archiving is the difficulty of singling out select files for long term retention. Everything in the backup gets lumped into the large full backup at the end of the year or seven years and called an “archive”. It may in fact be called an archive, but a recovery would function more like a backup recovery, which could be very costly and time consuming. Backup strategies are more for business continuity purposes and not necessarily for long term archiving.

With archive – think long-term retention

Archive: The main difference between an archive and a backup is that an archive refers to a single collection of records or data that is designated for long-term retention. When the data is moved from the production environment to the archive environment, it is tagged or indexed by metadata that assists in quickly locating that particular file or chunk of data through a search mechanism. This process and the sophisticated software that performs it make locating a single file much more efficient than it would be in a traditional backup. An archive is generally found in a common file system structure, and the determination of where the file is located is a function of the file system. The file system may have several different storage devices that the archived data is stored on, based on a number of attributes such as size, type, last accessed, etc. This system could be a combination of expensive disk, such as Fibre Channel, less expensive disk, such as SATA or SAS, and tape. The key is how the data is “structured.” In most cases, the data may never be accessed again, but it is necessary to keep it for historical purposes, regulatory compliance or an unplanned event. The goal with creating an archive is to keep it separate from the backup rotation cycle. It is recommended that a separate copy of the archived data be made and kept in a separate location so there are at least two copies of the final archive.

Many environments will include both backup and archive.  Through the use of sophisticated software features that are available today, customers can establish policies that determine type, size, age, last accessed, remaining disk space and other characteristics of stored data that can automate the process of deciding whether to keep the data in the backup cycle or move it to the archive pool.
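As a rough illustration of such a policy, the Python sketch below decides between the backup cycle and the archive pool based on file type, size, and time since last access. All names and thresholds here are hypothetical and do not come from any specific product, and remaining disk space is left out to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class StoredFile:
    path: str
    size_bytes: int
    days_since_access: int


@dataclass
class ArchivePolicy:
    # Hypothetical thresholds; a real policy would be tuned per site.
    min_idle_days: int = 180
    min_size_bytes: int = 1_000_000
    file_types: tuple = ()  # empty tuple means "any file type"


def destination(f: StoredFile, policy: ArchivePolicy) -> str:
    """Return "archive" if the file meets every policy rule, else "backup"."""
    type_ok = not policy.file_types or f.path.lower().endswith(policy.file_types)
    idle_ok = f.days_since_access >= policy.min_idle_days
    size_ok = f.size_bytes >= policy.min_size_bytes
    return "archive" if (type_ok and idle_ok and size_ok) else "backup"


if __name__ == "__main__":
    policy = ArchivePolicy(min_idle_days=180, min_size_bytes=1000, file_types=(".dat",))
    for f in (
        StoredFile("/sim/run1.dat", 5000, 400),
        StoredFile("/sim/run2.dat", 5000, 3),
        StoredFile("/doc/notes.txt", 5000, 400),
    ):
        print(f.path, "->", destination(f, policy))
```

Requiring every rule to match keeps recently used data in the backup cycle by default; a site could just as reasonably archive on any single rule.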

These two functions can be performed within a single library in separate partitions. The software can then provide notification of which tapes need to be exported based on the function that was performed on those tapes, backup or archive. I have seen numbers as high as 80% indicating how much data is duplicated within a storage infrastructure because the differences between backup and archive aren’t fully understood. At the end of the day, knowing the difference and the benefits of backup and archive technologies, when to use them and how to balance the two functions in an environment can drastically reduce the amount of redundancy, complexity and storage operating costs.

In Part Two of this discussion, we will look at how archives that contain production data, no matter how old or infrequently accessed, can still be retrieved online using high density and high speed tape systems and secondary disk systems.  Stay tuned for my next post which will look at enduring access to data. 

Want to talk more? I’ll be in Dearborn, Michigan at the IDC HPC User Forum and DICE Alliance 2010 events next week. Contact me at jimm@spectralogic.com.

No Question About it: Sometimes Tape is the Answer

Dear Ms. Meade:
My data crunching company generates 2-3 TB of data per customer, and I need to store that somehow. However, I don’t have room for a tape library. The only thing I can think to do is put the data on some hard drives using Linux-based RAID software, then put the disk in a safety deposit box. Do you have any other suggestions?
Sincerely,
Short on Space

Dear Ms. Space:
You have money and room for terabytes of disk storage, which you will squirrel away in a pretty large safety deposit box, but not a couple of dollars and rack units for a small library? Hmmm.

In pondering a polite answer to this question, Ms. Meade called to mind something similar posted on Slashdot, and is heartened that several intelligent points were discussed in that context. Being one to always encourage others in the path of light, Ms. Meade will summarize these intelligent comments and add to them.

The Short Version, by the way, in case you are averse to reading: Buy an LTO-4 tape drive and LTO-4 tapes. Forget the disk.

The Long Version: As fond as Ms. Meade is of disk, especially Spectra nTier disk, Ms. Meade understands that disk’s greatest asset is the speed at which it retrieves data—NOT its use for secure offline data storage.

Tape is cheaper than disk, even the disk to which you are likely referring. The Tape Equation: you can buy an LTO-4 tape drive for around $1400 (and likely for less), and at $40 per tape, store 800 GB of data; with these and a little compression, you are two tapes away from serious, long-term storage. Assuming that you have more than a handful of customers annually, this pays for itself pretty rapidly, compared to purchasing cheap (and risky) JBOD.

At $100/TB for hard drives and twenty customers each with 2 TB of compressed data, your company must shell out $4,000 per year. If, instead, you purchase a tape drive and LTO media, your costs are under that in just the first year: about $2,000 for tape, and another $1,400 for the drive. You’ve paid for the drive in one year. After that, you save about $50/TB, since $40 per 800 GB tape works out to $50/TB against $100/TB for disk. (That translates to thousands of dollars annually.)
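The Tape Equation can be checked with a few lines of Python. This is only a sketch using the prices quoted in this letter ($100/TB disk, $40 per 800 GB LTO-4 tape, $1,400 drive). It assumes decimal units (1 TB = 1,000 GB) and data that is already compressed, so each tape holds its native 800 GB; the function and constant names are made up for illustration.

```python
# Prices quoted in the letter, in whole dollars.
DISK_DOLLARS_PER_TB = 100
TAPE_DOLLARS_EACH = 40
TAPE_CAPACITY_GB = 800      # LTO-4 native capacity
DRIVE_DOLLARS = 1400        # one-time purchase


def first_year_costs(customers: int, gb_per_customer: int) -> dict:
    """Compare one year of disk purchases with one year of tape plus a drive."""
    total_gb = customers * gb_per_customer
    # Round up: a partly filled tape still has to be bought.
    tapes = (total_gb + TAPE_CAPACITY_GB - 1) // TAPE_CAPACITY_GB
    disk = total_gb * DISK_DOLLARS_PER_TB // 1000
    media = tapes * TAPE_DOLLARS_EACH
    return {
        "tapes": tapes,
        "disk": disk,
        "tape_media": media,
        "tape_first_year": media + DRIVE_DOLLARS,
    }


def saving_per_tb() -> int:
    """Dollars saved per TB once the drive is paid for."""
    tape_dollars_per_tb = TAPE_DOLLARS_EACH * 1000 // TAPE_CAPACITY_GB
    return DISK_DOLLARS_PER_TB - tape_dollars_per_tb


if __name__ == "__main__":
    print(first_year_costs(customers=20, gb_per_customer=2000))
    print("saving per TB after year one:", saving_per_tb())
```

With twenty customers at 2 TB each, the sketch needs 50 tapes, and the first-year tape total (media plus drive) stays below the disk bill.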

You may want to consider tape libraries, which truly are not space- or budget-hogs. Libraries such as the 4U Spectra T50e may be worth the space and time simply in convenience. This depends a great deal on your business volumes and staffing, and Ms. Meade acknowledges constraints due to current recessionary times. However, to emphasize the point: a relatively lightweight investment such as the purchase of a small library can automate data protection, and most companies that deal in data understand that their business also mandates data-caretaking.

For those unenlightened few who say that LTO tape is not a wise choice because eventually new technologies replace older ones, please consider that LTO has been around this past decade and shows no sign of going away. No migration will be necessary for years to come, given that current generations of LTO tape technology read data on tape that is two generations back, and write one generation back. With new generations about every 3 years, and given the mobility of today’s clientele, the lifespan of at least 6-10 years is likely sufficient for your business requirements.

Frankly, the issue truly cries out for tape, and Ms. Meade is glad to add her voice to those doing the crying out.
 

Question: Can Disk Replace Tape? Answer: Unobtanium

Dear Ms. Meade,
I am charged with architecting a backup system without any single points of failure. Obviously, tape is SO failure-prone that I am not including it at all. How do you think I should configure such a system?
Sincerely,
Tape is Doomed

Dear Doomed,
You are doomed if you rely solely on disk for your data backup. A possible interpretation of your question may be “How much disk does it take to replace tape?” The answer is “unobtanium”: that is, you can’t replace tape using disk.

Further, the very concept of single point of failure is terribly funny in a terribly dark way. Failure is inevitable, unless you plan to address human imperfection? What about acts of natural and man-made disaster that may affect the national power grid? Switch problems? What about loose screws, including any screwed-up (or self-perceived screwed over) employee?

Instead, consider asking a question that does have an answer—“How can I reliably protect data?” The answer is “disk and tape.”

Ms. Meade is a major fan of disk with RAID 6, offered in Spectra’s nTier disk. With RAID 6, up to two disks can fail without affecting data integrity. Go disk and go RAID. However, disk (even with RAID 6) can’t be considered failure-proof because it has its own Achilles’ heel (aka single point of failure): the RAID controller. You can have all the data you want on all the spinning disk you want, but if the controller fails, the brains are gone, and the bits and bytes you’ve carefully protected are toast. Whither goest the RAID controller, so goeth the data. Dead controller = permanently decomposed data. So disk alone, even with the marvels of RAID, is not enough to provide true disaster recovery and continuity of operations.

Further, please note that your information about tape as failure-prone is completely wrong. Tape is, it turns out, incredibly reliable.  With tape’s reliability increase of 700% over the last decade, multiple layers of ECC protection, and smart Spectra libraries tracking media and drive health, tape meets and beats disk in terms of reliability. If you’re worried about a single point of failure,  make sure you get two tape drives. Consider the T950 and T-Finity libraries’ global spare feature—which is an installed drive that can be directed to take over in case of a drive failure.

Ms. Meade admits that she is curious about the pointy-haired boss who directed you to create the no single point of failure unobtanium backup environment….

 

On the ninth day

The big day is getting closer, and I am really busy. I even multitask lunch, doing a little gift shopping around noon. It is amazing how many other people had the same idea. Today I went out and got my parents an external hard drive. (If you know my parents, please don’t tell them.) Like everyone else, they are finding that digital pictures and other media are becoming important to them, and I want to make sure they have a backup they can take out of the house (in this case, to someone else’s home). While it might not be the most fun gift, it should be very practical, and if their 5-year-old computer crashes, they will be able to get the important stuff back. (You can guess what I will be setting up over the weekend.)

 

We all know how important it is to protect our data, both at home and at work. Of course, a USB hard drive won’t do it at work: there is too much data, we want multiple copies, and it isn't very automated. That is why so many of the conversations we have these days with customers about deduplication include replication. Quickly getting remote site backups back to a central location automatically is a great stress reliever. Your company may not need to protect pictures of my nephews, but your data may be almost as important.

 

nTier Deduplication supports multi-site replication, doing the hard work of getting the data off site for you. Now you can spend more time looking for the perfect gift. (I like car parts myself.)

 

On the ninth day of Christmas, my true love gave to me;

 

9 Site replication

8 Spectra Archive Files

7 /24 Support

6 T680’s

5 Tapes without Pain!

4 Global Spare Drives,

3 Encryption Keys,

2 Spectra Certified Tapes,

and a large frosty beer.

 

 

On the eighth day...

Like your grandma’s holiday fruit cake that’s been re-gifted every year since the dawn of time, some business files are meant to last through eternity… or close to it. For these files, you don’t need shellac or any of the other odd preservatives used on aforementioned fruit cake. You need an archive.
 
 
Uncle Sam and a cast of characters will tell you what kind of data to keep and how long you’ve got to keep it. 
However, YOU are the one who gets to put together the Christmas list of storage toys for Santa to squeeze down the chimney.
 
With that in mind, there are no better toys to put on your Christmas list than Spectra Logic storage products for filing away your archive data. Spectra products are fast, reliable, easy to use, and very economical.  Heck, we’ll even assemble ‘em for you!
 
On the eighth day of Christmas, my true love gave to me;
 
8 Spectra Archive Files
7 /24 Support
6 T680’s
5 Tapes without Pain!
4 Global Spare Drives,
3 Encryption Keys,
2 Spectra Certified Tapes,
and a large frosty beer.
