CTR Exclusives

Challenge Data Archiving Status Quo with the Facts

by Yasuhiro Tai

Change is difficult…unless it’s for the better.

With shrinking budgets and reduced IT teams struggling to manage burgeoning volumes of data, organizations can no longer justify status quo storage strategies.

According to IDC, the world’s information is more than doubling every two years and will reach 8 zettabytes (8 trillion gigabytes) by 2015.

As the data grows, so does the percentage of inactive data. This generates the need for true, long-term archiving that is economically sustainable — an open solution that meets government compliance requirements while providing a lower Total Cost of Ownership (TCO), long-term data integrity and long-term data availability.

Justifying the change requires an understanding of the most common pitfalls that can hinder the long-term effectiveness of data archives:

  1. Storing Inactive Files Online that Should Be Moved to Nearline Storage. Analysts estimate that less than 20 percent of an organization’s data is frequently accessed. The remaining 80 percent is seldom accessed and access becomes less frequent with the passage of time.

    Storing and managing ever-increasing volumes of data on an organization’s primary system has become too costly. In addition to consuming precious storage space, it increases maintenance and operating costs. It also affects system performance and productivity. When hardware is added to meet growing storage requirements, the power and air conditioning costs also increase.

    Moving inactive data from the primary system to a secondary, more cost-effective data archiving solution will dramatically reduce an organization’s data management costs and solve performance issues.

  2. Failing to Create (and Adhere to) a Formal Archiving Strategy. A data archiving strategy is a formal set of processes for capturing, indexing and maintaining electronic data for a long period of time. A well-structured and properly implemented data archival strategy can satisfy internal policies, business partner requirements, external audits or e-discovery needs. It will define what data will be archived, where and for how long.

    Because data archiving affects more than the IT department, the archive strategy should involve communication and cooperation between the legal department, the internal auditing department and human resources as well as executive management.

    Once a formal archiving strategy is in place, it is vital to follow through with regular deletion and deduplication practices. The goal is to keep historical data as long as required, but no longer.

    The archiving strategy should also be reviewed at least annually to adjust for business and workflow changes.

  3. Using the Backup System for Data Archiving. Unlike data backups, which are daily recordings of current files used for recovery when loss or corruption occurs, data archiving is the process of moving data that is not used on a regular basis to accessible, long-term storage. This archived data is still vital and must be retained to ensure an organization’s regulatory compliance or to keep it accessible for future reference (e.g., legal records, data for future trend analysis, email correspondence and other paper trails). With a simple keyword search, archives on random access media can show the history of a file or series of files, identifying where the files existed, when they existed, and even who changed them and when.

    One of the many benefits that data archiving provides is the reduction of backup time and recovery windows because infrequently accessed files have been moved from the primary system to safe, long-term storage.

    Originally developed for short-term backup, HDDs and tape are rewritable and do not meet data compliance requirements for unalterable storage; nor do they provide the longevity required for true, long-term archiving. HDDs have a three- to five-year lifespan and tape has a lifespan of seven to 10 years, which means there is the additional cost of purchasing and installing new drives and the time needed to migrate the data from the old drives to the new ones. By contrast, optical media has been forensics-tested and proven to provide a data life of more than 50 years.

    Unlike tape, which provides sequential backups, optical disks are random access media that provide indexing and search capabilities within the files for faster searches and faster access. Although it is possible to restore multiple years’ worth of files from a tape backup and then mine through that data for a keyword, sifting through this media is expensive and time-consuming. Few administrators are willing to invest the time and resources to do that twice before looking more closely at their data archiving options.

  4. Using a Volatile Rather than Non-volatile Data Archiving Solution. Because of the relatively low device cost and high capacity of hard disk drives (HDDs) and tape solutions, they have been promoted as viable data archive solutions. However, both require data migration to maintain data integrity. Consequently, when the project (and its funding) ends, the media effectively becomes a volatile archiving solution (one in which data is lost when the power is removed) because it is no longer powered up to carry out the migrations needed to preserve the data. Non-volatile storage (NVS) solutions such as optical media do not lose their stored data when a project and its funding end. They continue to maintain the integrity of their data for decades without requiring migration.

  5. Investing in a Storage Medium that Cannot Ensure the Long-term Viability of the Stored Data. A key consideration when choosing a long-term data archiving solution is how often data will have to be migrated. HDDs are prone to operational and latent failures, and tend to fail more rapidly when sitting unpowered on a shelf, making them unreliable. Tape drives are also an ineffective data archiving option because of the technology’s turnover and high maintenance requirements. Both HDD and tape solutions are subject to degradation due to heat, humidity, dust, mishandling, electromagnetic forces, and ordinary wear.

    HDD systems and most tape solutions do not comply with government mandates for the retention and security of digital data and do not provide the long data life necessary for a true archiving solution.

  6. Underestimating the Total Cost of Ownership (TCO). The TCO does not end with the hardware cost per gigabyte. The true TCO includes the overall operational cost — the longevity of the hardware and media, the electricity used to power the drives, the need for 24x7 air conditioning and the human resources required for maintenance.

    With HDD and tape, the TCO also includes the manpower necessary for the time-consuming task of drive replacement and data migration as well as the significant environmental cost of the industrial waste generated.

    Across the board, storage libraries are fairly comparable in price, regardless of the storage technology used. HDDs and tape have a lower cost per gigabyte than optical; however, in the 50-year lifespan of the optical media, each HDD or tape will have to be replaced at least 10 times.

    HDD systems have constantly moving components. These moving components consume electricity all day, every day. According to analyst estimates, energy and power requirements for U.S. data centers more than doubled between 2006 and 2011. As electricity is consumed, heat is generated, requiring the additional expense of year-round air conditioning, which increases CO2 emissions. On the other hand, because the latest power-down Blu-ray optical-based archiving solutions only run when data is written or read, organizations can reduce their power consumption and CO2 emissions by as much as 40 percent.



  7. Failing to Plan for Future Data Usability. Beyond the expected lifespan of the storage media, administrators must consider the durability of the media, the method used to write and read the data, and the possible effects of the local environment on the media and the stored data. The security features inherent in the chosen technology must also be considered to prevent overwriting or deletion.

    The write process for HDD and magnetic tape requires the media to come in contact with the write head of the drive, increasing error and failure rates. Optical storage solutions rely on a laser to read and write, so there is no direct contact with the media. Optical media such as BD is also produced with a protective hard coating that creates a barrier to scratches and fingerprints, giving administrators the ability to store older archives safely and indefinitely on a shelf, without compromising a single bit of data. In addition, BD media is less likely to be affected by humidity, temperature and light; and unlike magnetic tape and HDD, BD resists degenerative changes over time.

  8. Failing to Ensure that the Data Stored Will Be the Data Retrieved in the Future. The length of time an organization must retain certain records varies. For example, while healthcare organizations must retain some data for the life of their patients plus two years, some manufacturers must retain data for at least 30 years.

    One of the greatest challenges with digital data is safeguarding its data integrity. The data retrieved 20 years from now must be identical to the data stored today, unaltered by deliberate intent, computer error, human error or viruses. The media must last as long as the retention policy.

    When you remove the possibility of altering a file, you can remove any uncertainty about the integrity of the stored information or the risk of non-compliance with government retention mandates. Investing in a true write-once media format ensures compliance with federal regulations while providing support for internal business requirements.

    Write-Once-Read-Many (WORM) media such as BD-R media physically prevents overwriting and, due to its long-term storage capability, it also eliminates security risks commonly associated with data migration or loss due to data/media degeneration.

    Hard disks and tape, even good-quality WORM tape with a 10+ year durability rating, tend to demagnetize over time, leading to data loss. In contrast, optical storage media will last for decades.

    Regardless of the storage technology chosen, to avoid a future situation in which the data you need is stored in a file format that is no longer supported, it’s a good idea to archive not only the data, but also copies of the applications (and necessary license keys) and the drive used to create the data.

  9. Not Weighing the Drawbacks to Cloud Storage for Data Archiving. While it is true that cloud services can offer many advantages, including flexibility in both capacity and performance without the need for large up-front capital expenses, the requirements for storing each type of data (including primary, backup, disaster recovery, and archived data) are each unique. Consequently, before choosing a cloud service provider, it’s imperative that you are certain that the service you select is truly protecting your data the way it needs to be protected.

    Before choosing a cloud service provider, know how they will ensure the integrity of your data. Even with their best practices, however, data can become corrupted by simply migrating it to the cloud in the first place. Beyond that, cloud storage systems are still data centers with hardware and software, and are still vulnerable to data corruption. Be aware that ultimately, the responsibility (and liability) for that data falls on the company that owns the data, not the hosting provider.

    You should also have a clear understanding of exactly where the archived data is being held, what type of media they are storing it on and, if it's migrating storage, their migration schedule. You will also want to ensure there is an exit route in place just in case you want to switch providers, manage your archive internally, or they go out of business.

  10. Selecting a Data Archiving Solution that Doesn’t Ensure Built-in Scalability and a Growth Path to the Future. LTO, the latest tape technology, has been praised for offering an 8-generation growth path to the future; however, while the path shows capacity and performance increases as well as a WORM feature, the lifespan has remained at about 10 years. Many of the LTO solutions use an automatic verify-after-write technology that doubles the number of end-to-end passes for each scheduled backup and reduces the tape life by half.

    Since its introduction in 2000, new generation LTOs have been released an average of every two years. Each new version writes data to a cartridge in its own generation and to a cartridge from the immediate prior generation. As a result, organizations using this format must upgrade to new drives and migrate the data from the old format to the new one about every two years.

    Like any magnetic tape solution, LTO is also susceptible to degradation due to heat, humidity, dust, mishandling, electromagnetic forces and ordinary wear. In fact, one manufacturer recommends removing the media when not in use to reduce exposure to dust.

    With each LTO drive costing about $10,000, LTO’s initial cost is substantially more expensive than an $80 1TB HDD and the media cost is about the same as a Blu-ray disk.

    In the past, both HDD and LTO tape offered higher capacities than Blu-ray; however, a single Blu-ray library with power-down technology can offer capacities that range from 1.2TB (per magazine) to hundreds of terabytes of archival storage. Blu-ray manufacturers also offer a roadmap to the future with advancements in this decade that will deliver 10 times the capacity and five times the data throughput.
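
The replacement arithmetic behind pitfalls 6 and 10 can be sketched in a few lines of Python. This is a rough illustration, not a costing model: it only counts how many generations of media an archive would pass through over a retention horizon, using the lifespans quoted above (three to five years for HDD, seven to 10 for tape, 50 for optical). The function name, the labels and the 50-year horizon are assumptions made for the example.

```python
import math


def media_generations(horizon_years: float, lifespan_years: float) -> int:
    """Number of media generations needed to cover the horizon.

    Assumes media is replaced exactly at the end of its quoted lifespan.
    """
    if horizon_years <= 0 or lifespan_years <= 0:
        raise ValueError("horizon and lifespan must be positive")
    return math.ceil(horizon_years / lifespan_years)


# Lifespans in years as quoted in the article (illustrative inputs only).
LIFESPANS = {
    "HDD (3-year)": 3,
    "HDD (5-year)": 5,
    "Tape (7-year)": 7,
    "Tape (10-year)": 10,
    "Optical (50-year)": 50,
}

if __name__ == "__main__":
    horizon = 50  # assumed retention horizon, matching the optical figure
    for name, lifespan in LIFESPANS.items():
        generations = media_generations(horizon, lifespan)
        # Every generation after the first implies one data migration.
        migrations = generations - 1
        print(f"{name}: {generations} generations, {migrations} migrations")
```

Each migration adds purchase, labor and handling costs on top of the initial price, which is why cost per gigabyte alone understates the TCO.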

Maintaining Status Quo is an Option, but is it the Best Approach for Your Organization?
In making the decision whether or not to maintain status quo for your data archiving approach, compare your current system with the five characteristics of a true, long-term data archiving solution:

  • Low Power Consumption;
  • Low TCO;
  • Long-term Data Integrity;
  • Long-term Data Availability with Easy Access; and
  • Full Compliance with Regulatory Mandates.

Then, look at the benefits that go directly to your bottom line:

  • Lower Utility Bills; 
  • Less Time Expended on Maintenance;
  • Increased Productivity/Improved Customer Relations; and
  • Enhanced Ability for Building Future Trend Analysis.

The need for a sustainable true data archiving solution should be loud and clear.

Yasuhiro Tai is General Manager, Special Projects at AVC Networks Company, Panasonic Corporation.

 

Four Essentials for Developing a Proactive Data Storage Strategy

by Michael Oielgisser

Earlier this year, an electronic transaction processing service provider reported a security breach affecting anywhere from 50,000 to 10 million credit cardholders. Upon further inspection, a second data breach was uncovered.

That is not an isolated incident. As of July 1st, there had been 189 data breaches in 2012 in the U.S. alone, exposing about 13.73 million records. It’s not surprising, then, that many companies find themselves forced into a reactive mode.

The problem with a reactive approach is that it doesn’t address the deeper problem: the need for proactive risk management. If companies fail to plan appropriately, issues which could have been avoided through better planning can arise later. And if organizations don’t take into account core business objectives when designing a storage strategy, then they can find themselves scrambling to realign their design after it has been implemented.

Companies that handle this most effectively map out and understand pain points before they start spending money on technology solutions. When companies don’t, they often have to go back to the drawing board to invest even more capital and organizational resources to fix it.

Although it can be a harder sell up front, investing the time, money and resources to plan and implement a long-term, strategic data management approach can enable companies to stop being so reactive. The benefits of reduced reactivity can be significant. Here’s how:

Aligning with Business Objectives. The top challenge many organizations face with regard to their data storage is aligning their storage strategies with their business objectives. First, IT organizations should understand the business objectives in order to determine which technology solutions are right for the job. Otherwise, they can end up treating the wrong symptoms and not the true cause. Problems will keep re-emerging, which can perpetuate a reactive environment instead of a proactive solution.

The overall storage strategy should be designed within the context of both capital and operational expenditures. Companies should be looking at it more from the impact it has on the business. Instead, many organizations make the mistake of looking at their storage strategy just from an infrastructure perspective.

Organizations have to examine the storage strategy from where those two meet. On the business side, consider the following:

  • What are the business challenges?
  • What is the business trying to accomplish?
  • Where are the areas of risk and exposure?
  • Does the technology spend align with budget targets?

Then, and only then, should organizations look at the strategy from the infrastructure perspective of what tools are best suited to meet those challenges.

Data Classification for the Long Term. To be proactive with data classification, it’s important to develop a strategy that allows for validation and enhancement.

How much can it help? Proper use of the correct data classifications can significantly assist in the existing information lifecycle management process, save data center storage resources, increase performance and utilization, and reduce expenses and administration overhead.

To put data classification on the right track, start by including all parties. Is there an open dialogue about data classification between IT staff, architects, application owners and management? This should build from the initial efforts towards aligning with and understanding the core business objectives. This can be a fairly involved process, but the time spent upfront can lead to a reduction in problems down the road.

In contrast, a more basic data classification approach may speed up the process at the start, but it could cost heavily later on, when small errors are magnified as data volumes grow exponentially.
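
As a rough illustration of what a classification rule set might look like once the parties above have agreed on it, the Python sketch below assigns a storage class from a few attributes. Everything here is hypothetical: the attribute names, the class labels and the 90-day and 365-day thresholds are assumptions for the example, not recommendations from any standard or product.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    days_since_last_access: int
    regulated: bool  # assumed flag: subject to a retention mandate


def classify(ds: DataSet) -> str:
    """Return a hypothetical storage class for a data set."""
    if ds.days_since_last_access <= 90:
        return "primary"  # still in active use
    if ds.regulated or ds.days_since_last_access > 365:
        return "archive"  # must be retained, or long inactive
    return "nearline"  # inactive, but not yet archive material


if __name__ == "__main__":
    samples = [
        DataSet("current-orders", 12, False),
        DataSet("last-quarter-reports", 120, False),
        DataSet("signed-contracts", 120, True),
        DataSet("old-project-files", 400, False),
    ]
    for ds in samples:
        print(f"{ds.name}: {classify(ds)}")
```

Keeping the rules in one small, reviewable place makes it easier for IT staff, application owners and management to validate and adjust them as business needs change.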

Avoiding the Quick Fix Quandary. It can be very frustrating if you get sidetracked from long-term data storage goals by having to continuously fix problems every day as the result of a reactive culture.

The persistent data storage problems IT deals with on a daily basis can be mitigated with a proactive approach. Addressing possible issues by mapping out a data management strategy helps reduce the likelihood of individual problems while, at the same time, establishing a proactive method for dealing with them. Failure to plan can result in the continuous use of time and resources to combat simple, yet repetitive problems.

There is a secondary issue that can stem from a reactive approach. It can put an organization in the position of having to act quickly and allow technologies that haven’t been sufficiently tested into its environment.

The risk with applying new and cutting edge quick fixes to problems is that they often take on a life of their own over the long term. While they may solve the immediate problem, they can also create unforeseen complications down the line.

The answer is to take the time to address the issues fully before they manifest themselves. It’s well advised to stop and do a comprehensive evaluation of the possible solutions before introducing new technologies into the picture.

It’s All about Managing Risk. In uncertain economic times, managing risk is more important than ever. Today, it can be much harder to bounce back from a significant business interruption or unexpected losses. In addition, the cost of avoiding such threats is typically dramatically less than the cost of recovering from them.

A critical part of such planning lies in recognizing that your organization’s risk-preparedness must live up to recovery objectives – or risk a significant gap in business continuity that could lead to lost revenue, lost resources and lost reputation.

It’s important to create an information security and business continuity/disaster recovery agenda with an eye towards compliance, auditability, recovery objectives and partner assurance. Apply industry standards and frameworks to help assess how your company’s risk profile looks today, where it should progress to tomorrow, and what steps can be taken to get there.

Effective risk management means being proactive. It begins at the policy and strategy level to better assess, design, implement and manage your business response to unplanned events.

The Takeaway
Your organization's operations are fueled by information. The demands of information storage and data management are skyrocketing. The challenges include optimizing your storage investment while getting the best performance and business value out of your storage today – all while preparing for growth tomorrow.

Intelligent data management is increasingly focused on information policies and the value of data as established by business, compliance and security requirements. As such, a world-class storage infrastructure must anticipate a vast array of needs – well-managed growth, tiered storage, reduction of redundant data, networked storage, consolidation and virtualization, data classification, data privacy, rapid accessibility and thorough reporting capabilities, just to name a few.

In order to face these challenges head on, it’s important to be proactive. By doing so, organizational storage infrastructure will align with the business value of your data, business capacity requirements, and capital and operating expenditure considerations. And that will help keep your enterprise boat sailing through the toughest seas.

Michael Oielgisser is a storage architect at Forsythe.

So What's the Big Deal about Dedupe?

by Wayne Salpietro

It seems that we are a society of information hoarders! After all, we are saving more and more information every year. According to IDC, we will average 35 ZB of data by 2020 – a mere eight years from now. Having that much information can be informative, a competitive advantage, costly, dangerous (legally) and a pain in the you-know-what all at the same time!

In today’s business environment, massive amounts of information make us consume power and cooling, increasingly expensive floor space, resources to manage the data and plan for (and budget for) a never-ending proliferation of storage devices in anticipation of the next year’s data growth. More and more data sources are being added to the data deluge that is confronting businesses every year. For example, the addition of video from surveillance is a huge incremental data load; add to it virtualized data storage and BYOD trends, and more data is being stored, managed and protected.

Get the Big Picture with Big Data

by Vish Vishwanath

Big Data. The name alone is somewhat intimidating. Nevertheless, for organizations seeking the ability to analyze data that exceeds enterprise storage capacity, an introduction to Big Data technologies may be long overdue.

It’s time to meet Big Data
In an age where much of the information we consume is digital, the realities of storage capacity limitations are not surprising. By 2015, industry analysts are projecting a four-fold increase of data — 7.9 zettabytes, or 18 million times the digital assets currently stored by the Library of Congress.1

Nimbus S and E Class System Evolution

by Curtis Chan

This article is the second in a two-part series.

Expanding on how their product line evolved, Isakovich noted that Nimbus early on leveraged two trends in the emerging flash SSD market: that flash manufacturing capacity would increase, lowering the cost of silicon, and that flash management ASICs would emerge, allowing the deployment of cost-efficient, enterprise-class SSD storage arrays without having to buy proprietary SSD devices from the larger incumbents. Coupled with the development of their HALO storage operating system, Nimbus then introduced their enterprise S-Class in early 2010 with a second-generation offering in late 2011.

The current S-Class Memory System incorporates Nimbus’s latest enterprise flash modules, which take advantage of 6 Gb/s SAS connectivity and a new higher-performance flash management processor – making it 3X faster than the first generation. Advanced wear-leveling algorithms maximize flash durability in non-stop mission-critical installations. The S-Class also features a new non-blocking internal mid-plane capable of providing full line-rate 6 Gb/s performance to every flash module simultaneously. This architecture is vastly superior to expander-based systems, which funnel IO into fewer lanes, limiting performance. The S-Class also ups the number of processor cores, from 8 to 12, to enable even faster de-duplication and other data management services. The result is performance up to 800,000 4K IOps and a throughput of up to 8 GB/s.

In response to customer demand, Nimbus broadened the networking capability of the S-Class by adding native Fibre Channel and InfiniBand connectivity options to its existing GbE and 10GbE interfaces. The Fibre Channel IO module provides dual 8 Gbps ports with SFP+ optics, and up to 4 IO modules can be added to the S-Class for a total of eight Fibre Channel ports. The InfiniBand module provides dual QDR 40 Gbps ports accommodating QSFP style cabling, and up to 4 IO modules are supported for a total of eight InfiniBand ports. Both the Fibre Channel and InfiniBand connectivity are backward compatible with prior speed generations, including 2 and 4 Gbps FC and 20 Gbps DDR InfiniBand.

Furthermore, the Nimbus HALO operating system supports the iSCSI, FC, SRP, NFS, and CIFS protocols simultaneously, enabling complete unified SAN and NAS provisioning from one easy-to-use system. Multi-pathing and clustering are supported, which suits virtualization deployments and scale-out databases.

In early 2012, Nimbus again made headlines, announcing their new E-Class all-NAND flash storage system targeting enterprise-class data centers and cloud providers. They further added that it matched 15K rpm disk arrays on price but outperformed them in every other aspect.

The E-Class is a fully redundant, no-single-point-of-failure, multi-protocol solid-state storage system that scales from 10TB to 500TB in capacity in a single rack and, according to Nimbus, offers higher power, cooling and rack space efficiency compared to hard disk-based offerings.

The system supports up to 500TB of capacity in one logical pool and the capacity can be provisioned using Fibre Channel, iSCSI, NFS, CIFS or SRP (SCSI RDMA Protocol), providing unified block (SAN) and file (NAS) storage in one easy-to-use system. All storage is thin-provisioned, maximizing utilization and simplifying storage capacity planning.

The E-Class excels in power efficiency, using as little as 5 watts per terabyte of capacity, which amounts to 80 percent savings on cooling costs when compared to standard 15K rpm disk drive arrays. In addition, a single industry-standard rack delivers the same capacity that would otherwise take more than three racks of 15K rpm disks. In IOps performance, the E-Class delivers in one rack the equivalent of 14 racks of 15K rpm disks – up to 800K IOps and up to 8 GB/s. These gains in rack density perfectly complement existing server, desktop and IO virtualization initiatives to maximize datacenter utilization and significantly reduce operational expenditures.

The E-Class platform comprises a pair of redundant controllers housed in a separate chassis and up to 24 solid-state storage enclosures. Each controller can support up to four active-active I/O modules, including the connectivity choices of 1GbE, 10GbE, 40GbE, Fibre Channel or 40 and 56Gb InfiniBand.

The HALO OS automatically detects controller and path failures to provide non-disruptive failover, making it ideal for applications such as enterprise-wide server virtualization, web infrastructure, database clusters, virtual desktop infrastructure (VDI) and high-performance computing.

Similar to high-end hard disk drive arrays, the E-Class utilizes RAID protection and supports hot-swappable flash drives, power and cooling modules as well as non-disruptive software updates.

In contrast to Nimbus's current S-Class Memory System, the new E-Class has a higher capacity point and uses dual 2U controllers, ensuring that there is no single-point-of-failure.  The E-Class storage capacity is also doubled at 10TB per 1 RU, while the S-Class stores 5TB per 1 RU.

Keeping in line with its $10/GB price target, the new E-Class starts at $149,995 for a 10TB configuration, plus $25,000 for each controller. The S-Class System has a starting price of $25,000 for a 2.5TB configuration. The following table outlines the differences between the two storage platforms.


Nimbus E and S Class Sustainable Storage Platforms Comparison
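
The list prices quoted above can be compared with the $10/GB target using simple arithmetic. The Python sketch below assumes decimal units (1TB = 1,000GB) and that the controller price is added twice for the redundant pair described earlier; the function and variable names are made up for the example.

```python
GB_PER_TB = 1_000  # decimal units assumed


def price_per_gb(price_usd: float, capacity_tb: float) -> float:
    """List price divided by capacity in gigabytes."""
    return price_usd / (capacity_tb * GB_PER_TB)


# Entry configurations as quoted in the article.
s_class = price_per_gb(25_000, 2.5)
e_class_base = price_per_gb(149_995, 10)
# Assumption: a redundant pair means two controllers at $25,000 each.
e_class_with_controllers = price_per_gb(149_995 + 2 * 25_000, 10)

if __name__ == "__main__":
    print(f"S-Class entry:               ${s_class:.2f}/GB")
    print(f"E-Class entry (storage):     ${e_class_base:.2f}/GB")
    print(f"E-Class entry + controllers: ${e_class_with_controllers:.2f}/GB")
```

On these figures the entry S-Class configuration lands on the target, while the entry E-Class configuration comes out higher; larger configurations may well price differently, since the fixed controller cost is spread over more capacity.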

Nimbus Moving Forward

Since its inception, Nimbus Data Systems has always set the performance, scalability and efficiency bar for others to follow while creating significant barriers to entry for its competition by diverging from mainstream thinking. Launched two years ago, its second-generation S-Class unified all-flash storage platform piqued industry interest as a direct replacement for traditional disk arrays, while also eliminating the complexity of caching and tiering in the process. At the time, other flash vendors were more focused on performance augmentation of existing disk systems as a cache accelerator or adding to tier 0 storage.

Following suit, the recent Q1 launch of the third-generation E-Class again sets a new standard for solid state storage scalability and operating cost economics. Enterprises and cloud providers alike, looking to lower their TCO while improving their ROI, can now contemplate infrastructure consolidation with all-flash storage systems that are on par with comparable disk-based systems. With its progressive technology innovations and the continual expansion of its channel partner programs domestically and abroad, Nimbus is well-positioned to capitalize not only on the need for more cost-effective, higher-performing storage solutions but also on the larger trend towards primary storage based exclusively on solid state technology.

Curtis Chan is a Senior Editor at Computer Technology Review and the Founder and Managing Partner at Cognitive Impact. He can be reached at curtis_chan@wwpi.com or curtis@cognitiveimpact.com.
