Spectra Logic Backup and Recover Blog

3 Things to Look for in a Public Cloud Storage Provider

1. Tape: Tape should be, and in many cases is, a prominent player in the end-to-end architecture of a cloud storage provider. As much as we love disk, consider that even in the cloud a copy of all data should be stored on tape. If it isn’t, it is at risk of being lost. Permanently. I did not make this up. Just review news stories about cloud outages with lost data. Replication, snapshots, CDP, RAID, and m-of-n protection are great innovations in disk-based data protection. However, they are not enough. Very large data sets push the error rates of modern storage systems from statistically negligible to a very plausible reality. The short version: not having an isolated, offline copy implies an inherent risk, and tape is still the medium best suited for offline storage.

This copy of the data should not be able to be accessed, changed, or deleted without some form of human intervention or negligence. With libraries such as Spectra libraries, it’s easy to encrypt the data and store the tape in the library. An encrypted tape stored in an environmentally stable, secure location is the best method for keeping an offline copy. And, as stated many times before, it is still prudent to maintain a copy of your data, regardless of its use model, within your own storage infrastructure.

About tape: Tape, not disk, is designed for offline and off-site storage. Yes, if you leave it on a heater, in the sun, in your trunk, or next to your electromagnetic generator it probably won’t restore, but if you did that to your disk, the data wouldn’t restore either. If you use proper data management techniques, tape is much more reliable than disk.
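To put rough numbers on the earlier point that very large data sets turn a negligible error rate into a plausible event, here is a minimal sketch. The bit error rate used is a hypothetical, illustrative value, not a figure from this post or from any vendor specification, and the function names are made up for the example. It also assumes errors are independent, which real systems only approximate.

```python
import math

# ASSUMPTION: an illustrative unrecoverable bit error rate of 1 error per
# 10^15 bits read. Hypothetical value, not from the post or any spec sheet.
ASSUMED_BIT_ERROR_RATE = 1e-15


def expected_bit_errors(dataset_bytes: float, bit_error_rate: float) -> float:
    """Expected number of unrecoverable bit errors when reading the data set once."""
    return dataset_bytes * 8 * bit_error_rate


def probability_of_any_error(dataset_bytes: float, bit_error_rate: float) -> float:
    """Chance of at least one error, treating errors as independent (Poisson approximation)."""
    expected = expected_bit_errors(dataset_bytes, bit_error_rate)
    return -math.expm1(-expected)  # equals 1 - exp(-expected), computed stably


# Compare a small, a large, and a very large data set under the same assumed rate.
for label, size_bytes in [("1 TB", 1e12), ("100 TB", 1e14), ("1 PB", 1e15)]:
    expected = expected_bit_errors(size_bytes, ASSUMED_BIT_ERROR_RATE)
    chance = probability_of_any_error(size_bytes, ASSUMED_BIT_ERROR_RATE)
    print(f"{label}: expected errors = {expected:.3f}, chance of at least one = {chance:.1%}")
```

Under this assumed rate, a single pass over a terabyte is very unlikely to hit an error, while a single pass over a petabyte almost certainly will. The exact numbers depend entirely on the rate you plug in.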

2. Strong Service Level Agreements (SLAs): Make sure your cloud agreement includes SLAs that align with your usage needs. With the cloud, you get what you pay for. That is both the advantage and the risk of using cloud-based storage. If you are using the cloud as an availability or distribution system, then standard SLAs are most likely fine. However, if it holds the sole copy or the only backup of your data, make sure you are investing in a storage service designed to protect that data in the event of an issue at the hosting site. You are only protected as far as your SLA commits. If it doesn’t commit to getting your data back in the same condition it was sent (many basic SLAs don’t), then it isn’t well suited as a backup target or, worse yet, a primary target for your company’s assets. Expect your data to be available and healthy, but defend yourself against unexpected outages or data loss by knowing what your SLA covers. Also, make sure you know your cloud service provider’s data protection strategy. The provider may not be willing to name every vendor it uses, but it can disclose its methodology without naming them, which in turn will give you a much more accurate picture of how well your data is protected.

3. An Exit Strategy: While the idea is to store data in the cloud, make sure that there is a realistic way to retrieve or migrate your data to another cloud provider or back to your internal systems. This protects your data in the event that either your company discontinues usage or the hosting company discontinues the service.

Further, keep an eye on the amount of data you are storing in the cloud. That amount is very likely to grow over time, and could outgrow the realistic cost and time associated with sending that data across a WAN. Again, tape is an excellent method of handling seeding and exit strategies. Particularly with open tape formats, such as LTFS or TAR, it’s straightforward to transfer data between two heterogeneous environments. In the event that you have hundreds of terabytes or even petabytes, shipping media is often faster and considerably less expensive than paying for the bandwidth required to download that much data. Additionally, in the event that a hosting company goes out of business, open-formatted tapes can be distributed even if the entire hosting system is no longer online. It’s just smart to be able to get your data no matter what happens to the host.
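The trade-off between a WAN download and shipped media can be estimated with simple arithmetic. The sketch below is illustrative only: the link speeds, the 80% link efficiency, the shipping turnaround, and the function names are all assumptions rather than figures from this post, and it ignores the time needed to write and read the tapes at each end.

```python
SECONDS_PER_DAY = 86_400


def wan_transfer_days(dataset_bytes: float,
                      link_bits_per_second: float,
                      efficiency: float = 0.8) -> float:
    """Days needed to move a data set over a WAN link.

    efficiency is an ASSUMED fraction of the nominal link rate that is
    actually usable after protocol overhead and contention.
    """
    usable_bits_per_second = link_bits_per_second * efficiency
    return dataset_bytes * 8 / usable_bits_per_second / SECONDS_PER_DAY


def shipping_is_faster(dataset_bytes: float,
                       link_bits_per_second: float,
                       shipping_days: float,
                       efficiency: float = 0.8) -> bool:
    """True when an ASSUMED shipping turnaround beats the WAN transfer time."""
    return shipping_days < wan_transfer_days(dataset_bytes, link_bits_per_second, efficiency)


# Hypothetical scenarios: (label, data set size in bytes, nominal link rate in bits/s).
scenarios = [
    ("1 TB over 1 Gbit/s", 1e12, 1e9),
    ("500 TB over 1 Gbit/s", 500e12, 1e9),
    ("1 PB over 10 Gbit/s", 1e15, 10e9),
]
for label, size_bytes, link_rate in scenarios:
    days = wan_transfer_days(size_bytes, link_rate)
    ship = shipping_is_faster(size_bytes, link_rate, shipping_days=7)
    print(f"{label}: about {days:.1f} days on the wire, shipping in 7 days faster: {ship}")
```

Running the same calculation with your own link speed and data volume shows roughly where the crossover sits for your environment.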

Tape: It's not just for Tier 3 Storage Anymore

There’s a growing trend in the storage community to arbitrarily label tape as tier 3 storage. While the use of tape is rising rapidly, it is not fair to identify it for tier 3 use only. The idea of storage tiers is derived from preconceived hierarchies about how storage must be deployed in a given infrastructure. Given advances in software technology, storage virtualization, and advanced metadata indexing, tape is being adopted more often today as a capacity storage target for primary data. In active archives, M&E environments, and many other large datacenters, tape is often relied upon heavily to reduce both footprint and energy consumption for primary storage. Active archive has boosted the use of tape in the datacenter, but a common misperception is that it must adhere to traditional HSM (Hierarchical Storage Management) classification constraints. This is absolutely not true. Tape can be presented as a file system for primary storage, either through an active archive or, in smaller environments, through LTFS. Under traditional definitions, primary storage is usually considered tier 1.
 
This evolution in tape’s use means it is no longer confined to tier 3 storage or the final tier in an environment. In a modern datacenter, a single tape library might contain the primary copy of one file, the secondary copy of another file, and a tertiary copy of a third file. In this case, tape is housing three different levels of data within the same storage system. This raises the question: what defines a “tier”? We often think of tape as tier 3 because it was usually the “backend” of an infrastructure. However, in an active archive, any storage platform can be a direct target for primary data that is still accessible to end users. In this case SSD, disk, and tape can all simultaneously serve as tier 1 targets, depending on the retrieval time requirements of the data being stored.
 
When tape is labeled as tier 3 in an active archive, it asserts that a tier is not determined by the data path, but rather by the performance of a given storage system. This gets incredibly complicated in the world of modern storage equipment that often houses multiple media types within the same system. Also, given the flexibility of storage performance, it is very difficult to compare one system to another in a true “like for like” model. A disk system’s performance is inherently tied to its capacity, because a single enterprise drive alone can only perform at about 120MB/s. Disk systems are obviously capable of sending and retrieving data at much faster speeds than this, but drive to drive, tape’s compressed rates of 280MB/s with LTO-5 and 500MB/s with the TS1140 drive far outmatch any disk drive on the market today. Beyond this, the total number of drives and other considerations determine a storage system’s ultimate performance. It’s safe to say the performance of tape vs. disk will vary in specific scenarios. Specifications depend on the variances of a given configuration and should not be used to determine the hierarchy, so it is unfair to assert that tape is always tier 3. At the end of the day, how you design your datacenter will determine the hierarchy of the data path, but to generalize any storage platform as any particular tier is an arbitrary assertion, not a fact. Permanently labeling tape as tier 3 is therefore illogical, and the label does not belong in discussions about storage systems.
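The drive-to-drive comparison, and the point that drive count shapes system performance, can be made concrete with a small calculation. This is a sketch using the per-drive figures quoted in this post (the tape figures are compressed rates). It assumes ideal sustained streaming with no mount, seek, or contention overhead, and the function and variable names are made up for illustration.

```python
# Per-drive sustained rates in MB/s, as quoted in the post.
PER_DRIVE_MB_PER_S = {
    "enterprise disk drive": 120,
    "LTO-5 tape drive (compressed)": 280,
    "TS1140 tape drive (compressed)": 500,
}


def hours_to_stream(dataset_bytes: float, drive_mb_per_s: float, drive_count: int = 1) -> float:
    """Hours to stream a data set across drive_count identical drives.

    ASSUMPTION: throughput scales linearly with drive count and every drive
    sustains its quoted rate for the whole transfer.
    """
    aggregate_bytes_per_second = drive_mb_per_s * 1e6 * drive_count
    return dataset_bytes / aggregate_bytes_per_second / 3600


TEN_TERABYTES = 10e12

for name, rate in PER_DRIVE_MB_PER_S.items():
    single = hours_to_stream(TEN_TERABYTES, rate)
    four = hours_to_stream(TEN_TERABYTES, rate, drive_count=4)
    print(f"{name}: {single:.1f} h on one drive, {four:.1f} h across four drives")
```

The same data set takes very different times depending on how many drives of each type a system contains, which is why a single specification says little about where a platform belongs in a hierarchy.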
 

2011 Industry Observations and Trends

We’ve all been privy to the countless articles hitting every technology journal around the world with predictions, forecasts, or trends for the upcoming year. This year is no different from any other, and the exercise is virtually an industry ritual that sets the tone for the year ahead. Amongst these trends, some were obvious, some were reasonable, and one in particular seemed to catch many people off guard: Tape is back, or more precisely, it never went anywhere.

From CNN’s shock at discovering that Google (GOOG) still uses tape backups as its final tier*, to Oracle announcing its latest 5TB tape technology, tape has prevailed as one of the busier talking points so far this year. Obviously for Spectra Logic, this is neither a problem nor a surprise, but many people are probably asking themselves, "Why? Why now?" or even "I thought tape was dead?" In order to understand this trend, let’s take a look at the other major trends forecasted for 2011: cloud, storage virtualization, acquisitions, and overall cost reduction.

As far as cloud is concerned, whether it be a private, hybrid, or public cloud, tape is a logical tier for any hosting infrastructure. It is the strong silent partner, if you will, for two primary reasons: tape continues to be the most cost-effective format to store data on, and tape provides an offline copy of the data for added security. My college computer security professor used to refuse to plug his computer into the internet on the principle that nothing online is ever 100% secure. Unfortunately, in the era of viruses, worms, malicious attacks, and even software glitches, bugs and data corruption, this sentiment is all too true. I bet the Parish of Orleans Civil District Court will take a much closer look at their cloud service provider’s storage method moving forward after losing large amounts of data due to simultaneous disk crashes in a tapeless environment. Thankfully for the Court, they still have paper records to retrieve from**.

Our second case is that of storage virtualization. In 2010, the Active Archive Alliance was formed by Spectra Logic, FileTek, QStar, SGI, and Compellent with the intention of educating and promoting the concept of active archiving, or extending a file system across multiple storage devices in a virtualized storage pool. With server virtualization dominating the market in 2010, it only makes sense that the virtualization trend would continue throughout the storage infrastructure. The Alliance, however, was not alone in its efforts to reintroduce the concept of seamlessly tiered storage. With data volumes growing into the exabytes, and floor space, power, and cooling costs increasing, tape is the ideal resting point for generally inactive data. Even in the era of deduplication, MAID, thin provisioning, and other power-saving technologies, tape continues to lead the charge for power efficiency and storage density. Why? Tape is designed to be stored offline, where it consumes no power. Additionally, IBM and FUJIFILM have proven that tape is far from reaching its physical limitations for storage density with their 35TB prototype tape***.

Ultimately, our final two trends explain tape’s role in the first two. Oracle now has an investment in tape technology through its acquisition of Sun, and thereby of StorageTek’s tape technologies. IBM, HP, Dell, and Quantum have similarly invested in tape technology. Additionally, many of the loudest voices against the tape market have been acquired, some by companies with tape interests, leaving only one large player still beating the "Tape is Dead" drum: EMC. So ask yourself... why would a marketing powerhouse spend such energy on anti-tape promotions, if tape weren’t a threat on its radar?

These acquisitions have opened the airwaves for the pro-tape messaging to once again make its way into everyday dialog. Why? Because, like our final trend, it's about overall cost reduction. With tape remaining the leader in both low-cost capital expense and low-cost operational expense storage, and the integration of other technologies with tape, it is once again being discussed as a viable, valuable tier within any datacenter design. 

*http://tech.fortune.cnn.com/2011/02/28/google-goes-to-the-tape-to-get-lost-emails-back/

**http://itknowledgeexchange.techtarget.com/storage-soup/i365-involved-in-new-orleans-backup-failure/

***http://www.engadget.com/2010/01/23/ibm-and-fujifilm-develop-35tb-magnetic-tape-cartridges-unveil-i/