By David Fox
Cloud storage is ubiquitous in 2023, storing around 50% of corporate data at the time of writing. It’s relatively cheap, safe, compliant and requires no up-front expenditure on hardware and management. It’s also mature with Amazon’s leading AWS S3 product being introduced way back in 2006, 17 years ago!
On this basis, it might seem like a no-brainer that a media archive, by which we mean the long-term storage of video and audio data, is best served by cloud storage. In this article, we compare on-premise LTO tape storage with cloud storage to determine their relative merits in the archive context. How do you make the right choice for your specific use case and budget?
With over 30 years of experience as a software vendor providing archive software to media, entertainment and broadcast companies, Archiware has a wealth of storage experience. LTO and Cloud are currently the only serious contenders for storing large quantities of data with indefinite retention. Spinning hard drives (HDDs) have too short a lifespan and SSD storage is too expensive to be considered for retaining Terabytes to Petabytes of data for decades.
Please also look at our blog article Comparison of LTO and Cloud Storage Costs for Media Archive, which has been updated with the latest pricing data as of June 2023 and is a great additional resource to this article.
Considering Cloud Storage
Cloud storage has many headline benefits: multiple competing vendors offering subtly different products, pricing that has been stable for some years, zero up-front capital expenditure, no on-premise storage infrastructure, and capacity that scales automatically and practically without limit. The biggest attraction with cloud is having a third party handle all the hardware setup and maintenance, replacing ageing hardware and guaranteeing availability to levels it would be almost impossible to achieve with one’s own on-premise installation.
On paper, this is the perfect data storage, providing the costs are acceptable and the WAN/internet connections permit a reasonable speed of access. But does such storage fit the needs of a media archive?
The big downsides to using cloud storage for a long-term archive are cost and latency. The storage profile for a media archive is simply that it grows over time and is relatively infrequently accessed. For example, in the first month you might place 100TB of data into the archive and pay €100 (assuming 1TB costs €1/month); in the second month you add a further 100TB of new data and pay €200 for the 200TB now stored. Each month you pay for all the data ever added to the archive plus the new data, so the monthly bill keeps rising. This is the fundamental cost problem when using cloud storage for a long-term archive. See the comparison blog article linked above for graphs showing these costs over time.
Another potential gotcha is the complex charging structures from some cloud vendors. As the headline cost per TB per month falls from $20 towards $1, minimum storage durations apply, minimum file/object charges apply, egress/download charges apply and the time to access objects goes up.
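The arithmetic behind this growth can be sketched in a few lines of Python. The figures are the illustrative ones from the example above (100TB added per month at €1 per TB per month), not real vendor pricing, and the function name and parameters are assumptions made for this sketch.

```python
def cloud_archive_costs(months, tb_added_per_month=100, price_per_tb_month=1.0):
    """Return (monthly_bills, total_paid) for an archive that grows every month.

    Illustrative only: assumes a flat price per TB per month and ignores
    egress, minimum-duration and per-object charges.
    """
    monthly_bills = []
    stored_tb = 0
    for _ in range(months):
        stored_tb += tb_added_per_month          # new data joins the archive
        monthly_bills.append(stored_tb * price_per_tb_month)  # pay for everything stored
    return monthly_bills, sum(monthly_bills)


bills, total = cloud_archive_costs(12)
print(bills[:3])  # [100.0, 200.0, 300.0]
print(total)      # 7800.0
```

Under these assumptions the monthly bill grows linearly, so the total paid grows much faster than the archive itself: €7,800 after 12 months, rather than 12 × €100 = €1,200.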
It’s useful to consider who is responsible if data stored in the cloud is lost. Some cloud storage
The latency with cloud storage refers to the potentially slow nature of both uploading many TBs of data to a cloud vendor and then downloading some of it again when needed in the future. This isn’t necessarily a great disadvantage since uploads can run for some hours if needed and recovery of archive data isn’t usually required in a great rush and can be planned for. The cheapest cloud offerings trade many hours of waiting for data to be restored for the lowest costs.
Built into all cloud storage services is the replacement of ageing and failing hardware by the cloud vendor. During the course of maintaining on-premise storage over 20-30 years of an archive’s life, this is a huge factor that shouldn’t be overlooked. Within the cost of every TB of cloud storage, the hardware is maintained for you. If you were doing this yourself, you would be migrating data from old to new hardware every few years.
Finally, consider who is liable in the event of data loss by the cloud vendor. Have they had any past issues with failures? Storing customers’ data in multiple locations makes this very unlikely but not impossible. Data loss can always occur and should be considered when designing the archive solution.
Considering LTO Tape
LTO Tape consists of cartridges of magnetic tape and ‘Tape Drive’ devices to read and write them. Tape drives can be standalone (tabletop) devices or housed in ‘library’, rack mounted units which house many tapes at once and automate their movement in and out of internal tape drives. A tape library with 80 slots can provide online access to 1.4 Petabytes of archive data.
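As a rough check on that capacity figure, the short Python sketch below multiplies slots by cartridge capacity. It assumes LTO-9 cartridges at 18TB native (uncompressed) capacity, which the article does not state explicitly; the function name is hypothetical.

```python
LTO9_NATIVE_TB = 18  # assumption: LTO-9 native (uncompressed) capacity per cartridge


def library_capacity_pb(slots, tb_per_cartridge=LTO9_NATIVE_TB):
    """Online capacity of a tape library in Petabytes (decimal: 1 PB = 1000 TB)."""
    return slots * tb_per_cartridge / 1000


print(library_capacity_pb(80))  # 1.44
```

With older or newer tape generations, or with compressible data, the figure changes accordingly.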
Cloud vendors are known to use LTO tape, behind the scenes, to retain data for their cheapest storage services.
LTO storage means on-premise installations, with power, cooling, maintenance contracts, repairs and a skilled human to take care of all of this! If running just a single ‘standalone’ drive, then the additional work is minimal, but this scales with the capacity of storage available and the number of LTO tapes. Many of these costs are up-front and have to be covered before the first byte of data can be stored. Even though costs can be significant, when compared with cloud storage costs over several years, LTO is generally cheaper when a significant quantity of data is stored in the archive.
In return for the additional effort spent provisioning a physical LTO archive system, the organisation retains full ownership of their data. This reason alone may rule out cloud storage. The LTO archive is not beholden to a cloud vendor’s monthly bills in order that their data continues to exist. In addition, no cloud egress (download) costs are due when restoring data – these costs can be significant.
The huge cost benefit of this ownership with LTO is that each TB of data written doesn’t have to be paid for every month at the same cost, over and over again. 100TB of data archived 2 years ago to LTO exists on tape cartridges that are already owned and paid for. That said, at some point in the lifecycle of LTO, this data should be migrated to newer tape cartridges, but it can be many years before the first migration is necessary.
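To see how an up-front LTO investment compares with a steadily rising cloud bill, a minimal break-even sketch in Python follows. Every price in the example call is a hypothetical placeholder chosen for illustration, not a quote from any vendor, and the model deliberately ignores tape migration, second copies, egress fees and staffing. Substitute your own figures.

```python
def break_even_month(tb_added_per_month, cloud_price_per_tb_month,
                     lto_upfront, lto_media_price_per_tb, lto_running_per_month,
                     max_months=240):
    """Return the first month in which cumulative cloud spend exceeds
    cumulative LTO spend, or None if that never happens within max_months."""
    cloud_total = 0.0
    lto_total = float(lto_upfront)   # hardware is paid for before any data is stored
    stored_tb = 0.0
    for month in range(1, max_months + 1):
        stored_tb += tb_added_per_month
        # Cloud: every TB ever stored is billed again each month.
        cloud_total += stored_tb * cloud_price_per_tb_month
        # LTO: only the newly written TBs need media, plus fixed running costs.
        lto_total += tb_added_per_month * lto_media_price_per_tb + lto_running_per_month
        if cloud_total > lto_total:
            return month
    return None


# Hypothetical placeholder figures, for illustration only.
print(break_even_month(
    tb_added_per_month=100,
    cloud_price_per_tb_month=1.0,
    lto_upfront=20_000,
    lto_media_price_per_tb=5.0,
    lto_running_per_month=500,
))
```

The design choice here is to compare cumulative spend rather than monthly spend, since the LTO costs are front-loaded while the cloud costs accumulate over time.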
The downside to all this on-premise effort is the single point of failure that comes from having all the archive data at a single location. Therefore, it is necessary to write additional redundant copies of the data to additional tapes and move these tapes to another location to have the same level of redundancy and separation that is automatically achieved by uploading data to the cloud.
We have two competing and vastly different types of archive storage with their own unique strengths and weaknesses – which is right for you?
Consider the costs of your intended archive size, going forward over ten years. Our blog article Comparison of LTO and Cloud Storage Costs for Media Archive will help here. The outcome of your cost calculations alone may be all you need to make your decision. Perhaps cloud is inexpensive and convenient for your smaller archive; perhaps LTO is the only cost-effective option for your 3 Petabyte archive.
Consider the skills you have at your disposal. Would you need to engage an external partner to build the desired on-premise LTO infrastructure, or can it be built with in-house skills? Do you have the skill set to understand your preferred cloud vendor and all the nuances of the product? AWS and Azure, in particular, have complex and sophisticated offerings with tiering options and many differently priced services.
Finally, consider a best-of-both-worlds solution that is a hybrid approach, using both cloud and LTO. Perhaps LTO for archived data up to 1 year old, and then migrate to cloud when the likelihood of the data needing to be restored drops, using a low-cost cloud storage option. This can offset the main disadvantages of each technology. Such a use case will be very specific to the way the archive will be filled and accessed over the coming years, but the best solutions are usually those that are a custom fit!
Whether looking at pure cloud/LTO or a hybrid, investigate Archiware’s P5 Data Mover tool, which allows migration and duplication of archived data between different types of storage. Data Mover can migrate from older tapes to newer, as part of an on-premise LTO archive, or can move data from LTO to cloud in a hybrid solution. All archives require software to drive them, and the Archiware P5 Archive offering gives flexibility in supporting both cloud and LTO while allowing movement of data between them.