Cloud Storage
By Jay Kramer
Cloud computing is the latest buzz because it holds the promise of delivering computing on demand, with compute and storage resources available as a shared resource on the Internet. The value of the cloud is in the virtualization and abstraction of both your physical storage resources and the location of your information assets. In other words, at its core, it is the ability to allocate performance, capacity, deduplication, and replication resources on demand without regard to physical system constraints.
The best example of the cloud model has been delivered through applications like SalesForce.com where the data is available on demand through a cloud delivery system. Companies are considering cloud computing and cloud storage because they want to deliver a higher quality of service at a lower cost by consolidating and streamlining their infrastructures. They may choose to purchase cloud services from a managed service provider (MSP) in an outsourced IT model or to create their own “private” cloud infrastructure.
Storage pooling is a key requirement for delivering on the cloud computing model. Although it has been used successfully in primary storage environments for many years, it is a new capability in capacity-optimized (deduplicated) secondary storage. Storage pooling is a way to consolidate and manage all of a company’s physical storage resources in one place and to track and allocate those resources as needed.
With it, IT managers have the flexibility to separate data into independent pools and to allocate resources to those pools of data as needed. Storage pooling enables them to meet a variety of essential data protection requirements, from data tiering to policy management to classes of service. For example, they can provision each pool with its own disk type, deduplication configuration, replication priority, backup policies, and/or backup application. Data in the pools is kept completely separate, but managed from a single, unified management console.
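The per-pool provisioning described above can be pictured as a small data model. The following is only an illustrative sketch, not any vendor's interface: the names StoragePool and PoolManager, the field names, and the convention that a lower priority number replicates first are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class StoragePool:
    """Hypothetical description of one independent pool."""
    name: str
    disk_type: str             # e.g. "SATA" or "SAS"
    dedup_enabled: bool
    replication_priority: int  # assumed convention: lower number replicates first
    backup_policy: str
    capacity_tb: float
    used_tb: float = 0.0


class PoolManager:
    """Hypothetical single management point; each pool keeps its own settings and usage."""

    def __init__(self):
        self._pools = {}

    def add_pool(self, pool):
        if pool.name in self._pools:
            raise ValueError(f"pool {pool.name!r} already exists")
        self._pools[pool.name] = pool

    def allocate(self, name, tb):
        """Reserve capacity in one pool without touching any other pool."""
        pool = self._pools[name]
        if pool.used_tb + tb > pool.capacity_tb:
            raise ValueError(f"pool {name!r} has insufficient capacity")
        pool.used_tb += tb

    def replication_order(self):
        """Pool names sorted so the most critical pool replicates first."""
        ordered = sorted(self._pools.values(), key=lambda p: p.replication_priority)
        return [p.name for p in ordered]
```

In this sketch a manager would register one pool per business unit or customer, call allocate as capacity is requested, and use replication_order to decide which pool's data is sent first.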
So why is storage pooling important? The following six key attributes form the foundation of what storage pooling can deliver for a cloud data protection solution:
1. Multi-tenancy – To be cost-effective, IT managers and MSPs have to be able to use a single backup system to protect multiple business units or customers and to allocate resources to them dynamically on-demand. Therefore, every storage pool needs to be kept secure and fully independent from the others.
2. Chargeback systems – With data protection resources allocated by end-user needs, storage providers need to track this usage by a wide range of criteria, both for chargeback and billing and for infrastructure optimization.
3. Robust Reporting – IT managers need an accurate way to forecast their capacity and processing needs for budgeting purposes. They also need to analyze usage to optimize available system resources for better CapEx efficiencies. For example, they may be able to reduce or delay the need to buy new equipment services to protect growing data volumes. Detailed reporting and analytics not only help in managing the current environment but also enable trending and modeling for planning future investments.
4. Quality of Service delivery – Storage pooling enables IT managers to set replication priorities for each pool so that the most mission critical data is replicated before less important data. This QoS orientation can be set to specific backup policies with different retention periods for a particular storage pool.
5. Storage Tiering – Storage managers can allocate disk drives to a storage pool according to the capacity or performance requirements for a specific set of data under protection. For example, they may want to use lower cost disk drives for low-priority disk pools, or WORM-enabled disk for decision support applications.
6. Global Deduplication – Deduplication is a critical part of an effective data protection environment. It is not only necessary for cost-effective optimization of the overall storage capacity but also provides a cost effective WAN implementation for replication and movement of data to a remote location for disaster recovery.
When highly optimized deduplication is integrated with replication, it can deliver low-bandwidth WAN replication by transmitting only net new data. This capability provides a huge cost saving for customers implementing a disaster recovery strategy. The ability to isolate deduplication on a storage pool basis is achieved through a secure multi-tenancy infrastructure and allows the customer to have the best of both worlds – a highly optimized storage and replication implementation and a highly secure set of information assets that are fully protected from any commingling.
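To make "transmission of only net new data" concrete, here is a minimal sketch of hash-based deduplicated replication. It rests on assumptions chosen for readability: fixed-size chunks, SHA-256 digests as chunk identifiers, and a hypothetical RemotePool class with one chunk index per pool, so that pools never share chunks. Real products may chunk and index data quite differently.

```python
import hashlib

CHUNK_SIZE = 4  # deliberately tiny so the example is easy to follow


def split_chunks(data, size=CHUNK_SIZE):
    """Split a byte string into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class RemotePool:
    """Hypothetical replication target holding the chunk index for ONE pool."""

    def __init__(self):
        self.store = {}  # digest -> chunk bytes

    def missing(self, digests):
        """Return the digests this pool has not stored yet."""
        return [d for d in digests if d not in self.store]

    def receive(self, new_chunks):
        self.store.update(new_chunks)


def replicate(data, remote):
    """Send only chunks the remote pool lacks; return the number of bytes sent."""
    by_digest = {hashlib.sha256(c).hexdigest(): c for c in split_chunks(data)}
    needed = remote.missing(list(by_digest))
    payload = {d: by_digest[d] for d in needed}
    remote.receive(payload)
    return sum(len(c) for c in payload.values())
```

Because each RemotePool keeps its own index, a chunk already stored for one tenant is never matched against another tenant's data. That is the trade-off described above: deduplication savings inside a pool, with no commingling across pools.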
Storage pooling is a key technology that enables data center managers to move beyond backup and recovery and implement Data Protection Lifecycle Management (DPLM) – a more holistic approach. With it, administrators can use their backup appliance to begin to address issues that typically arise in later phases of the data lifecycle, such as archival and expiration/destruction. For example, some regulations require the long-term retention of protected data. In some cases, this retention period may be longer than the life of the hardware. Storage pools enable administrators to simply create a new storage pool using the latest array hardware and migrate data to it without a forklift upgrade or disruption of service.
Similarly, data expiration and deletion can be set at the pool level, further eliminating manual processes and ensuring timely capacity recycling for added cost savings.
Storage pools can also improve operational efficiency by enabling each backup administrator to manage a larger volume of protected data. With petabytes of data on a single, easy-to-manage system, equipped with detailed reporting and powerful management, storage pools can significantly reduce administration costs as well as data center footprint, both of which are vital attributes for a cloud storage management deployment model.
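Pool-level expiration can be sketched as a single retention rule applied to every backup in a pool. The function name, the (name, creation date) tuple layout, and the day-based retention period below are assumptions for illustration only.

```python
from datetime import date, timedelta


def expired_backups(backups, retention_days, today):
    """Return names of backups older than the pool's retention period.

    backups: list of (name, creation_date) tuples belonging to one pool.
    """
    cutoff = today - timedelta(days=retention_days)
    return [name for name, created in backups if created < cutoff]


# Hypothetical pool contents and retention setting.
pool_backups = [
    ("full-2024-01-01", date(2024, 1, 1)),
    ("full-2024-03-01", date(2024, 3, 1)),
]
to_delete = expired_backups(pool_backups, retention_days=30, today=date(2024, 3, 15))
```

With a 30-day retention period and the dates shown, only the January backup falls before the cutoff, so it alone is selected for deletion and its capacity can be recycled.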
Storage pooling is an essential part of delivering on the promise of cloud storage. It is needed to provide the flexibility and cost savings necessary to make cloud storage successful. The attraction of the cloud is compelling when it allows IT organizations to better manage the growing amounts of data that need to be available, secure and protected. With IT budgets constrained, data protection has emerged as a “Killer App” for the cloud, and the solutions in the marketplace that can leverage storage pooling of deduplicated data are best positioned to protect customer investments today and into the future.
Author: Jay Kramer is vice president of worldwide marketing at SEPATON, Inc.

