According to an August 2007 EPA report on server and data center energy efficiency, the energy consumed by U.S. data centers will grow from 1.5 percent to 2.5 percent of the nation’s total energy consumption over the next five years.* Additionally, a recent Gartner report states that 50 percent of data centers will run out of power or space sometime in 2008.**
The energy requirements for data storage are enormous—as much as 40 percent of the total data center power budget. With data storage needs increasing by the hour, companies are not only running out of storage space on their servers, they are also running out of actual real estate to house these data centers. Companies around the world are now facing a true storage efficiency crisis, and without an effective solution in place, the security and space for this critical information can no longer be guaranteed.
The environmental implications of data storage are also staggering. A single rack of storage enclosures using 6 kW generates as much carbon dioxide as six 1999 Chevy Tahoe SUVs in one year (about 40 tons!). Most storage systems, which are composed of dozens if not hundreds of disks that are always on regardless of the need to retrieve data, represent the equivalent of perpetually idling an SUV in the garage on the off chance the owner might want to take a drive.*** With the
There is no question that as the demand for storage continues to rise, along with the energy, environmental and economic costs associated with it, companies across myriad industries must explore different approaches and technologies that will enable their data centers to operate more efficiently. Today, a new generation of data storage solutions is emerging that is poised to help these companies achieve this important goal.
It’s not just about electricity
A contributing factor to the high rate of obsolescence of IT facilities is the exhaustion of space. The form factor of storage controllers and enclosures affects space, and the underlying efficiency of the file system also has a tremendous impact on the rate of data center space consumption.
It is common practice for technology purchasers to procure systems that have a pre-existing footprint in their data centers. The average Fortune 1000 company has 680 terabytes of data stored on disk: 60 percent on fibre channel arrays and 40 percent on SATA disk. In many cases, companies are mis-provisioning high-performing, low-capacity disk technology for their long-term, seldom-accessed data. In other words, high-capacity disk-based alternatives are better suited than fibre channel disk arrays to the long-term retention of data.
Also, while file compression and related data reduction technologies have been around for years, they have been largely confined to desktops and have not been applied to enterprise-class storage systems. By utilizing the most widely accepted non-destructive (lossless) compression algorithms, many if not most of the file types that are stored in business file sharing repositories can be reduced by 50 percent, effectively doubling storage capacity without consuming a single additional rack unit.
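As a rough illustration of what non-destructive compression means in practice, the sketch below uses Python's standard zlib module (DEFLATE) to compress a buffer, confirm that the original bytes are fully recoverable, and report the space saved. The sample text and the function name are hypothetical, and the highly repetitive sample will compress far better than typical business files. Actual savings depend entirely on the data.

```python
import zlib


def compression_report(data: bytes, level: int = 6) -> dict:
    """Compress data with zlib and report sizes after checking the round trip."""
    if not data:
        raise ValueError("data must not be empty")
    packed = zlib.compress(data, level)
    # Lossless: decompressing must give back exactly the original bytes.
    if zlib.decompress(packed) != data:
        raise RuntimeError("round trip failed")
    return {
        "original_bytes": len(data),
        "compressed_bytes": len(packed),
        "space_saved": 1 - len(packed) / len(data),
    }


# Hypothetical sample standing in for an office document.
sample = ("Quarterly report: revenue, cost, margin, forecast.\n" * 200).encode("utf-8")
report = compression_report(sample)
print(report)
```

A 50 percent reduction corresponds to a space_saved value of 0.5, which is the same as a 2 to 1 compression ratio.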
Today’s collaborative style of computing creates a great deal of redundant information in the form of shared files or parts of files. This powerful and effective use of workgroup-centric storage results in a significant reduction in available storage. The efficiency gained from single-instance storage, sometimes referred to as data de-duplication, varies widely with the composition of the data. Most would agree, however, that even a 30 percent increase in capacity efficiency is significant enough to consider. The rise of computational virtualization has also led to wholesale duplication of core data such as operating system files and applications.
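The idea behind single-instance storage can be sketched in a few lines. The example below is a minimal illustration, not any vendor's implementation: it assumes fixed 4 KB blocks and SHA-256 fingerprints, keeps one copy of each unique block, and records each file as a list of fingerprints. The file names and contents are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, for illustration only


def dedup_store(files, block_size=BLOCK_SIZE):
    """Keep each unique block once; describe each file as a list of block hashes."""
    blocks = {}   # fingerprint -> block bytes (stored once)
    recipes = {}  # file name -> ordered list of fingerprints
    for name, data in files.items():
        recipe = []
        for offset in range(0, len(data), block_size):
            chunk = data[offset:offset + block_size]
            digest = hashlib.sha256(chunk).hexdigest()
            blocks.setdefault(digest, chunk)  # store only the first copy
            recipe.append(digest)
        recipes[name] = recipe
    return blocks, recipes


def restore(name, blocks, recipes):
    """Rebuild a file by following its list of block references."""
    return b"".join(blocks[digest] for digest in recipes[name])


# Two hypothetical drafts that share most of their content.
shared = b"A" * 8192
files = {
    "draft_v1.doc": shared + b"B" * 4096,
    "draft_v2.doc": shared + b"C" * 4096,
}
blocks, recipes = dedup_store(files)
logical = sum(len(data) for data in files.values())
physical = sum(len(block) for block in blocks.values())
print(f"logical {logical} bytes, physical {physical} bytes")
```

In this contrived case the two drafts occupy half their logical size on disk. Real results vary widely with the data, as noted above.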
As a final point on the issue of space, it should be recognized that applications designed to utilize storage were not necessarily designed to do so on a shared storage device. Frequently, this gives rise to ‘islands’ of storage, with multiple storage systems and their capacity ‘locked’ to specific applications, whether the application needs it or not. By some estimates, as much as 50 percent of purchased storage goes unused because of this application dependency.
What are the most viable solutions?
Today, there are technologies that can mitigate the explosion of disk-based storage and its associated physical and environmental footprint. These include storage virtualization applications, which permit the efficient aggregation of disparate storage systems into a global pool of storage that can be utilized by many applications. Data de-duplication can reduce data significantly by storing a single copy of repeated data and replacing the other copies with references to it, and wire-speed compression applications can shrink what remains. While virtualization and de-duplication remain hot industry buzzwords, these solutions, when implemented independently, cannot stand on their own as a comprehensive and cost-effective answer to the problem in the long term.
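A small worked example may help show why pooling matters. The figures below are hypothetical: four applications each own a dedicated 10 TB array, and the 25 percent headroom for the shared pool is an assumed planning margin, not a recommendation.

```python
# Hypothetical "islands": application -> (used TB, provisioned TB).
islands = {
    "mail": (4, 10),
    "erp": (6, 10),
    "files": (9, 10),
    "archive": (1, 10),
}

used_tb = sum(used for used, _ in islands.values())
provisioned_tb = sum(capacity for _, capacity in islands.values())
stranded_tb = provisioned_tb - used_tb  # capacity locked to applications that do not need it


def pooled_capacity_needed(total_used_tb, headroom=0.25):
    """Capacity a shared pool would need, given an assumed headroom fraction."""
    return total_used_tb * (1 + headroom)


pooled_tb = pooled_capacity_needed(used_tb)
print(f"islands: {used_tb} of {provisioned_tb} TB used, {stranded_tb} TB stranded")
print(f"shared pool with 25% headroom: {pooled_tb} TB")
```

With these made-up numbers, half of the purchased capacity sits idle in the island model, yet the nearly full "files" array cannot borrow the free space on the "archive" array. A shared pool serves the same data with far less purchased capacity.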
Storage architectures that reduce the power consumption of disk subsystems can significantly reduce the electrical and corresponding cooling requirements of storage. Sometimes referred to as MAID (Massive Array of Idle Disks), power-state-aware storage is particularly useful in applications where data is accessed infrequently after its creation, which is just about all disk-based storage with the exception of transactional systems (e.g. databases). There are pitfalls with MAID, however, especially when integrated with the most common file system platforms. File systems tend to spread data and associated meta-data across as many disks as possible in a given storage environment—eliminating power savings even with fairly small I/O loads. Therefore, any power-managed storage architecture must be fully aware of the underlying on-disk data requirements of the controlling file system operating environment.
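The pitfall described above can be illustrated with a toy model. The sketch below is a simplification under stated assumptions: 48 disks, 4,800 files, a "striped" layout in which every file touches every disk, and a hypothetical "power-aware" layout that keeps each file whole on one disk. It ignores metadata, RAID, and caching. It then counts how many disks must be spinning to serve a handful of reads.

```python
import random


def striped_placement(num_files, num_disks):
    """Every file is spread across all disks, as a wide-striping file system might do."""
    all_disks = frozenset(range(num_disks))
    return {f: all_disks for f in range(num_files)}


def power_aware_placement(num_files, num_disks, files_per_disk):
    """Each file lives entirely on one disk; disks are filled in order."""
    return {f: {(f // files_per_disk) % num_disks} for f in range(num_files)}


def disks_needed(placement, accessed_files):
    """Return the set of disks that must be powered up to serve the given files."""
    active = set()
    for f in accessed_files:
        active |= placement[f]
    return active


NUM_DISKS = 48
NUM_FILES = 4800
accessed = random.Random(0).sample(range(NUM_FILES), 5)  # a small I/O load

striped_active = disks_needed(striped_placement(NUM_FILES, NUM_DISKS), accessed)
aware_active = disks_needed(power_aware_placement(NUM_FILES, NUM_DISKS, 100), accessed)
print(f"striped layout: {len(striped_active)} of {NUM_DISKS} disks active")
print(f"power-aware layout: {len(aware_active)} of {NUM_DISKS} disks active")
```

Under the striped layout, five reads wake the entire array. Under the power-aware layout, at most five disks need to spin. This is why the placement policy and the power management have to be designed together.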
The whole is greater than the sum of its parts
New technology has been developed that is more efficient, cost-effective and scalable than legacy storage systems. The ‘secret recipe’ lies in the successful integration and modification of a number of synergistic software technologies into a single holistic package. These technologies include:
- Real-time, in-line block-level de-duplication, which saves energy by reducing capacity requirements in the first place.
- Power-state-aware disk sets, which permit most disks in a large array to be in an energy-saving state with predictable performance and power efficiency.
- Block-level compression, which achieves an average compression ratio of 2 to 1 on many common types of files, thus reducing capacity and related energy consumption by up to 50 percent.
- Remote replication, which automates the differential replication of individual users’ file systems to network-attached filers.
With the adoption of a cutting-edge solution that combines a number of the aforementioned technologies, companies will be able to reap numerous benefits such as consolidating and virtualizing their storage, reducing power consumption, eliminating waste, increasing storage density, and reducing cost per terabyte.
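To give a sense of how these savings might compound, the back-of-the-envelope sketch below applies two figures quoted in this article, a 30 percent gain from de-duplication and 2 to 1 compression, to the 680-terabyte average cited earlier. Two assumptions are made here that the article does not state: the 30 percent figure is read as 30 percent fewer stored bytes, and the two effects are treated as independent so that they multiply. Real data sets will behave differently.

```python
def physical_capacity_tb(logical_tb, dedup_saving=0.30, compression_ratio=2.0):
    """Estimate physical capacity after de-duplication, then compression.

    Assumes the two reductions are independent and multiply (an illustration only).
    """
    after_dedup = logical_tb * (1 - dedup_saving)
    return after_dedup / compression_ratio


logical_tb = 680  # average Fortune 1000 disk footprint cited in this article
needed_tb = physical_capacity_tb(logical_tb)
print(f"{logical_tb} TB logical -> about {needed_tb:.0f} TB physical "
      f"({needed_tb / logical_tb:.0%} of the original)")
```

Under these assumptions, roughly a third of the original capacity would hold the same data, before any power savings from idle disks are counted.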
What are the barriers to adopting more efficient storage technology?
The EPA reports that three major factors are slowing the transition to power- and space-efficient data storage strategies. The first is a lack of efficiency definitions: many IT managers have no benchmarks of system power performance on which to base purchase decisions. The 2007 EPA report recommends the establishment of a federal and private consortium to establish such guidelines, but in the meantime, purchasers need to carefully consider the claims of products promising power savings.
The second obstacle is the split incentives of the IT purchaser and the facilities manager. Many organizations have different management structures, so the departments responsible for technology purchases are not in communication with those who pay the power bill.
And finally, the EPA study also shows that IT managers are hesitant to make changes that might negatively influence system performance and uptime, so they choose to avoid the risk altogether. Technology companies must therefore be prepared with solid architectures that integrate RAID-6, asynchronous data replication, and snapshot features into the framework of the solution.
Each of these obstacles must be overcome in short order if companies are to continue on as they have.
Why make a change to more efficient systems?
In addition to the operational savings on energy, there are several other key benefits to investing in new, more efficient data storage technologies. These include reduced labor costs for administration and maintenance; reduced hardware and software support costs through consolidation of tier-2 storage on a single application platform; and the reduction or elimination of the need to upgrade facilities. In addition, companies that do adopt a green data center strategy can enhance their public images by publishing information about their efforts to reduce energy consumption and related carbon emissions.
What does the future hold?
The efficiency crisis presents a wealth of challenges and opportunities, each with a number of economic and ecological repercussions. Companies simply cannot afford to lose the mission-critical ability to store and preserve their data, but any effective, long-term solution cannot come with a price tag that puts it out of reach. To ensure a future where data storage hurts neither the environment nor a company’s bottom line, businesses need a solution they can adopt easily and economically, one that enables them to start reducing the carbon footprint of information that grows wider and deeper every day.
Sources
* EPA Report to Congress on Server and Data Center Efficiency, presented on August 2, 2007.
** November 29, 2006 press release: Gartner Says 50 Percent of Data Centers Will Have Insufficient Power and Cooling Capacity by 2008: http://www.gartner.com/it/page.jsp?id=499090
*** Storage Power and Cooling Issues Heat Up, May 21, 2007, Greg Schulz,
Robert Petrocelli founded greenBytes in 2007 and serves as chief technology officer. Prior to establishing greenBytes, Petrocelli founded Heartlab Inc., a medical information technology company, which was sold to AGFA in 2005.
Petrocelli was awarded a patent on archival technology utilized for the long-term storage of patient data, including cardiology images, demographics and reports. The technology was later adopted as the accepted industry standard for the storage and exchange of this type of medical information.
Petrocelli attended
