Combining large capacity, ready access, and instant acceleration for digital content repositories
By Gary Orenstein
The need for an Accelerated Archive
Storing and serving rapidly growing reservoirs of digital content presents significant challenges, not only in storing the data, but also in making it easily accessible at a moment’s notice.
For example, a company offering downloadable videos or music often wants to provide the largest library possible. At the same time, users demand instant access, and if one video clip or music file becomes extremely popular, the resulting access pattern can degrade system response time.
Figure 1 outlines the opposing forces at work: Companies face growing content libraries at the same time that users demand instant access and shorter response times.

Figure 1: The need for an Accelerated Archive
Conventional Tiered Storage Approach
Historically, matching large capacity content stores with instant access meant an awkward strain between technologies. To store massive amounts of data, high-capacity media such as large inexpensive disks are the first choice. However, serving data quickly and making it readily accessible requires more expensive storage built on high-performance, lower-capacity disks.
Unfortunately, neither extreme provides a suitable solution that balances performance, capacity, and cost. One conventional method to compensate for this imbalance is tiering.
The strategy behind storage tiers is to avoid investing in 100 percent top-tier storage by keeping frequently accessed data on tier 1 while relegating less frequently accessed data to a more cost-effective tier 2.
Tiering in and of itself, however, means additional management time and oversight to implement, as shown in Figure 2.

Figure 2: Conventional tiering adds management complexity
Tier Management
Managing storage tiers involves the following activities:
- Policies and segmentation:
Administrators must estimate the capacity per tier, the triggers for migrating data from one tier down to the next, and the corresponding events to bring datasets back to higher tiers.
- Backup and recovery per tier:
Each tier requires its own unique backup and recovery plan that must be planned, implemented, and tested.
- Acquisition and maintenance per tier:
Each tier must be purchased, installed, and maintained. Different tiers may potentially come from different vendors.
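To illustrate the kind of policy decisions listed above, here is a minimal Python sketch of a two-tier migration rule. It is only a toy model: the thresholds, the `Dataset` fields, and the file names are hypothetical and are not taken from any particular product. The point is that every trigger must be chosen and maintained by an administrator.

```python
from dataclasses import dataclass

# Hypothetical thresholds; real values would be chosen per environment.
DEMOTE_AFTER_IDLE_DAYS = 30    # tier 1 -> tier 2 trigger
PROMOTE_AT_DAILY_READS = 100   # tier 2 -> tier 1 trigger


@dataclass
class Dataset:
    name: str
    tier: int         # 1 = high-performance, 2 = high-capacity
    idle_days: int    # days since the last access
    daily_reads: int  # recent reads per day


def next_tier(ds: Dataset) -> int:
    """Return the tier the dataset should live on under this toy policy."""
    if ds.tier == 1 and ds.idle_days >= DEMOTE_AFTER_IDLE_DAYS:
        return 2  # cold data moves down to the cheaper tier
    if ds.tier == 2 and ds.daily_reads >= PROMOTE_AT_DAILY_READS:
        return 1  # hot data is brought back up
    return ds.tier  # otherwise the dataset stays where it is


# Illustrative catalog entries (made-up names and numbers).
catalog = [
    Dataset("old_trailer.mp4", tier=1, idle_days=45, daily_reads=0),
    Dataset("viral_clip.mp4", tier=2, idle_days=0, daily_reads=5000),
    Dataset("steady_album.mp3", tier=1, idle_days=1, daily_reads=40),
]
moves = {ds.name: next_tier(ds) for ds in catalog}
print(moves)
```

Even in this simplified form, two thresholds must be tuned, and each resulting move implies a data migration between systems that also has to be scheduled, backed up, and tested.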
Caching Appliances Enable Accelerated Archives
New developments in centralized storage caching now deliver a far simpler approach than conventional storage tiering with significant performance improvements.
These solutions combine the best of both worlds. Centralized storage caching complements the use of massive, inexpensive, high-capacity disk storage to preserve the never-ending growth of digital content. The caching approach addresses the end user performance requirements by making any frequently accessed piece of content readily available from RAM-based cache memory to serve thousands of users simultaneously. This is crucial to maintaining user satisfaction levels and preventing subscriber “churn”.

Figure 3: Building an accelerated archive with scalable caching appliances
Characteristics of an Accelerated Archive
Accelerated archives deliver the following characteristics to computing environments:
- Cache automation:
Caching automatically determines frequently accessed content and places it in high-speed cache memory. New data sets are automatically cached according to use.
- Peak IOPS, low-latency, and high-throughput:
Scalable caching appliances deliver far greater IOPS than traditional disk systems because data is served from memory as opposed to slower, mechanical disk. Cache memory also delivers far lower latency and higher throughput than conventional disk systems.
- Streamlined, high-capacity architecture:
By complementing a single high-capacity tier with performance caching, the overall architecture is greatly simplified. This saves on acquisition, maintenance, and management costs.
- Efficient and effective cache/disk use:
By properly balancing cache and disk resources, administrators can avoid complicated tiering architectures, and still get the right balance of capacity and performance. As workloads change, intelligent cache services ensure that the cache makes the most efficient and effective use of its memory resources.
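The latency benefit described above depends on how often requests are served from cache. A rough way to reason about it is the standard weighted-average model of access time, sketched below in Python. The latency figures are illustrative assumptions only, not measurements of any appliance or disk system.

```python
def effective_latency_ms(hit_ratio: float, cache_ms: float, disk_ms: float) -> float:
    """Average access time when a fraction `hit_ratio` of reads hit the cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms


# Assumed, illustrative latencies (not vendor figures).
CACHE_MS = 0.1  # read served from RAM-based cache
DISK_MS = 8.0   # read served from mechanical disk

for hit_ratio in (0.0, 0.5, 0.9, 0.99):
    latency = effective_latency_ms(hit_ratio, CACHE_MS, DISK_MS)
    print(f"hit ratio {hit_ratio:.2f}: {latency:.2f} ms average")
```

Under these assumptions, a 90 percent hit ratio brings the average read to roughly 0.89 ms, which is why keeping the popular subset of a large archive in memory can have such a large effect on perceived response time.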
Mapping Centralized Storage Caching to Today’s Workloads
Video and music files often come to mind when talking about Web-based media distribution, and these content formats are ideally suited for accelerated archives.
The proliferation of content delivery options for consumers has expanded to include:
- Music downloads such as iTunes
- Video downloads from Netflix and others
- Photo downloads across social networking sites
- Mobile content for cell phones and PDAs
- Software downloads and updates for PCs
Each of these content types can be large, as in the case of video files, and instantly popular, driven by the viral spread of items on YouTube or other video sites. Similarly, virus definition files instantly reach peak demand during big outbreaks.
Serving the most current content typically involves maintaining hundreds, thousands, or millions of files and sharing them with a similarly large content consumer base. During this process, clients need instant access to the requested content objects for the response time to meet customer service levels. At the same time, companies often have terabytes or petabytes of content and are in a continuous tug-of-war to broaden content and appeal to the widest audience base.
An accelerated archive automatically takes care of these needs. Upon initial use, the popular content files are maintained in cache, but as tastes shift, new popular items eventually displace the original files. In the case that tastes shift back, the original content would migrate back to cache upon first request and be available in a high-speed access mode during the period of heavy access. This cycle automatically continues to accelerate data within a large content archive as needed.
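The displacement cycle described above can be sketched with a small read-through cache in Python. This sketch assumes a least-recently-used (LRU) eviction policy, which is a common choice but is not stated in this article as the policy any particular appliance uses. The class name, the tiny capacity, and the clip names are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Toy read-through cache in front of a large, slower archive."""

    def __init__(self, capacity: int, backing_store: dict):
        self.capacity = capacity
        self.store = backing_store   # stands in for high-capacity disk
        self.cache = OrderedDict()   # least recently used entry comes first
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        value = self.store[key]          # first request goes to the archive
        self.cache[key] = value          # content enters the cache
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # displace the least recently used item
        return value


archive = {f"clip{i}": f"data{i}" for i in range(5)}
cache = LRUCache(capacity=2, backing_store=archive)

# clip0 is popular, then tastes shift to clip1 and clip2, then shift back.
for name in ["clip0", "clip0", "clip1", "clip2", "clip0"]:
    cache.read(name)

print(list(cache.cache), cache.hits, cache.misses)
```

In this run, clip0 is displaced once clip1 and clip2 become popular, and the final request for clip0 is a miss that brings it back into cache, so later requests for it would again be served from memory.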
Conclusions
By combining performance caching technology with high-capacity storage, companies can easily deploy accelerated archives. These installations have the capacity and performance to put companies in a winning position to deliver instant access to large data repositories more cost-effectively and with fewer IT resources than legacy tier solutions.
Gary Orenstein is vice president of marketing at Gear6.
www.gear6.com