by Yasuhiro Tai
Change is difficult…unless it’s for the better.
With shrinking budgets and reduced IT teams struggling to manage burgeoning volumes of data, organizations can no longer justify status quo storage strategies.
According to IDC, the world’s information is more than doubling every two years and will reach 8 zettabytes (8 trillion gigabytes) by 2015.
As the data grows, so does the percentage of inactive data. This generates the need for true, long-term archiving that is economically sustainable — an open solution that meets government compliance requirements while providing a lower Total Cost of Ownership (TCO), long-term data integrity and long-term data availability.
Before making the change, organizations need an understanding of the most common pitfalls that can hinder the long-term effectiveness of data archives:
- Storing Inactive Files Online that Should Be Moved to Nearline Storage. Analysts estimate that less than 20 percent of an organization’s data is frequently accessed. The remaining 80 percent is seldom accessed, and access becomes less frequent with the passage of time.
Storing and managing ever-increasing volumes of data on an organization’s primary system has become too costly. In addition to consuming precious storage space, it increases maintenance and operating costs. It also affects system performance and productivity. When hardware is added to meet growing storage requirements, the power and air conditioning costs also increase.
Moving inactive data from the primary system to a secondary, more cost-effective data archiving solution will dramatically reduce an organization’s data management costs and solve performance issues.
- Failing to Create (and Adhere to) a Formal Archiving Strategy. A data archiving strategy is a formal set of processes for capturing, indexing and maintaining electronic data for a long period of time. A well-structured and properly implemented data archiving strategy can satisfy internal policies, business partner requirements, external audits or e-discovery needs. It will define what data will be archived, where and for how long.
Because data archiving affects more than the IT department, the archive strategy should involve communication and cooperation between the legal department, the internal auditing department and human resources as well as executive management.
Once a formal archiving strategy is in place, it is vital to follow through with regular deletion and deduplication practices. The goal is to keep historical data as long as required, but no longer.
The archiving strategy should also be reviewed at least annually to adjust for business and workflow changes.
- Using the Backup System for Data Archiving. Unlike data backups, which are daily recordings of current files used for recovery when loss or corruption occurs, data archiving is the process of moving data that is not used on a regular basis to accessible, long-term storage. This archived data is still vital and must be retained to ensure an organization’s regulatory compliance or to keep it accessible for future reference (e.g., legal records, data for future trend analysis, email correspondence and other paper trails). With a simple keyword search, archives on random-access media can show the history of a file or series of files, identifying where the files existed, when they existed, and even who changed them and when.
One of the many benefits that data archiving provides is the reduction of backup time and recovery windows because infrequently accessed files have been moved from the primary system to safe, long-term storage.
Originally developed for short-term backup, HDDs and tape are rewritable and do not meet data compliance requirements for unalterable storage; nor do they provide the longevity required for true, long-term archiving. HDDs have a three- to five-year lifespan and tape has a lifespan of seven to 10 years, which means there is the additional cost of purchasing and installing new drives, plus the time required to migrate the data from the old drives to the new ones. By contrast, optical media has been forensically tested and proven to provide a data life of more than 50 years.
Unlike tape, which provides sequential backups, optical disks are random-access media that provide indexing and search capabilities within the files for faster searches and faster access. Although it is possible to restore multiple years’ worth of files from a tape backup and then mine that data for a keyword, sifting through the media this way is so expensive and time-consuming that few administrators are willing to do it twice before looking more closely at their data archiving options.
- Using a Volatile Rather than Non-volatile Data Archiving Solution. Because of the relatively low device cost and high capacity of hard disk drives (HDDs) and tape solutions, they have been promoted as viable data archive solutions. However, both require data migration to maintain data integrity. Consequently, when the project (and its funding) ends, the media becomes a volatile archiving solution (data is lost when the power is removed) because the drives are no longer powered up to carry out the migrations needed to preserve the data. Non-volatile storage (NVS) solutions such as optical media do not lose their stored data when a project and its funding end; they continue to maintain the integrity of their data for decades without requiring migration.
- Investing in a Storage Medium that Cannot Ensure the Long-term Viability of the Stored Data. A key consideration when choosing a long-term data archiving solution is how often data will have to be migrated. HDDs are prone to operational and latent failures, and tend to fail more rapidly when sitting unpowered on a shelf, making them unreliable. Tape drives are also an ineffective data archiving option because of the technology’s turnover and high maintenance requirements. Both HDD and tape solutions are subject to degradation due to heat, humidity, dust, mishandling, electromagnetic forces, and ordinary wear.
HDD systems and most tape solutions do not comply with government mandates for the retention and security of digital data and do not provide the long data life necessary for a true archiving solution.
- Underestimating the Total Cost of Ownership (TCO). The TCO does not end with the hardware cost per gigabyte. The true TCO includes the overall operational cost — the longevity of the hardware and media, the electricity used to power the drives, the need for 24x7 air conditioning and the human resources required for maintenance.
With HDD and tape, the TCO also includes the manpower necessary for the time-consuming task of drive replacement and data migration as well as the significant environmental cost of the industrial waste generated.
Across the board, storage libraries are fairly comparable in price, regardless of the storage technology used. HDDs and tape have a lower cost per gigabyte than optical; however, over the 50-year lifespan of the optical media, each HDD will have to be replaced at least 10 times and each tape at least five times.
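The replacement arithmetic behind that comparison can be sketched in a few lines. The lifespans are the figures cited above (HDD at its upper bound of five years, tape at its upper bound of 10); the sketch simply counts, by ceiling division, how many times each medium must be purchased over a 50-year archive:

```python
# Back-of-the-envelope media purchase count over a 50-year archive,
# using the lifespans cited in the article (upper bounds for HDD and tape).
ARCHIVE_YEARS = 50

LIFESPANS = {"HDD": 5, "tape": 10, "optical": 50}  # years per unit of media

def replacements_needed(lifespan_years, horizon_years=ARCHIVE_YEARS):
    """Number of units purchased over the horizon, including the initial
    copy, via ceiling division."""
    return -(-horizon_years // lifespan_years)

for medium, life in LIFESPANS.items():
    print(f"{medium}: {replacements_needed(life)} units over {ARCHIVE_YEARS} years")
```

A lower-bound tape lifespan of seven years would raise the tape count to eight units, so the gap versus a single optical copy widens further under pessimistic assumptions.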
HDD systems have constantly moving components, and these moving components consume electricity all day, every day. According to analyst estimates, energy and power requirements for U.S. data centers more than doubled between 2006 and 2011. As electricity is consumed, heat is generated, requiring the additional expense of year-round air conditioning, which increases CO2 emissions. On the other hand, because the latest power-down Blu-ray optical-based archiving solutions run only when data is written or read, organizations can reduce their power consumption and CO2 emissions by as much as 40 percent.
- Failing to Plan for Future Data Usability. Beyond the expected lifespan of the storage media, administrators must consider the durability of the media, the method of writing and reading that was used to store the data, and the possible effects of the local environment on the storage media and the stored data. The security features inherent in the chosen technology must also be considered to prevent overwriting or deletion.
The write process for magnetic tape requires the media to come into contact with the drive’s head, and HDD heads fly only nanometers above the spinning platter, so both technologies carry elevated error and failure rates. Optical storage solutions rely on a laser to read and write, so there is no direct contact with the media. Optical media such as BD is also produced with a protective hard coating that creates a barrier to scratches and fingerprints, giving administrators the ability to store older archives safely and indefinitely on a shelf without compromising a single bit of data. In addition, BD media is less likely to be affected by humidity, temperature and light; and unlike magnetic tape and HDD, BD resists degenerative changes over time.
- Failing to Ensure that the Data Stored Will Be the Data Retrieved in the Future. The length of time an organization must retain certain records varies. For example, while healthcare organizations must retain some data for the life of their patients plus two years, some manufacturers must retain data for at least 30 years.
One of the greatest challenges with digital data is safeguarding its data integrity. The data retrieved 20 years from now must be identical to the data stored today, unaltered by deliberate intent, computer error, human error or viruses. The media must last as long as the retention policy.
When you remove the possibility of altering a file, you can remove any uncertainty about the integrity of the stored information or the risk of non-compliance with government retention mandates. Investing in a true write-once media format ensures compliance with federal regulations while providing support for internal business requirements.
Write-Once-Read-Many (WORM) media such as BD-R media physically prevents overwriting and, due to its long-term storage capability, it also eliminates security risks commonly associated with data migration or loss due to data/media degeneration.
Hard disks and tape, even good-quality WORM tape with a 10+ year durability rating, tend to demagnetize over time, leading to data loss. In contrast, optical storage media will last for decades.
Regardless of the storage technology chosen, to avoid a future situation in which the data you need is stored in a file format that is no longer supported, it’s a good idea to archive not only the data, but also copies of the applications (and necessary license keys) and the drive used to create the data.
- Not Weighing the Drawbacks to Cloud Storage for Data Archiving. While it is true that cloud services can offer many advantages, including flexibility in both capacity and performance without the need for large up-front capital expenses, the requirements for storing each type of data (including primary, backup, disaster recovery, and archived data) are unique. Consequently, before choosing a cloud service provider, it’s imperative to verify that the service you select truly protects your data the way it needs to be protected.
Before choosing a cloud service provider, know how they will ensure the integrity of your data. Even with their best practices, however, data can become corrupted by simply migrating it to the cloud in the first place. Beyond that, cloud storage systems are still data centers with hardware and software, and are still vulnerable to data corruption. Be aware that ultimately, the responsibility (and liability) for that data falls on the company that owns the data, not the hosting provider.
You should also have a clear understanding of exactly where the archived data is being held, what type of media it is stored on and, if the provider migrates data between storage tiers, what the migration schedule is. You will also want to ensure there is an exit route in place in case you want to switch providers or manage your archive internally, or in case the provider goes out of business.
- Selecting a Data Archiving Solution that Doesn’t Ensure Built-in Scalability and a Growth Path to the Future. LTO, the latest tape technology, has been praised for offering an eight-generation growth path to the future; however, while the roadmap shows capacity and performance increases as well as a WORM feature, the lifespan has remained at about 10 years. Many LTO solutions use an automatic verify-after-write technology that doubles the number of end-to-end passes for each scheduled backup, reducing tape life by half.
Since its introduction in 2000, new LTO generations have been released on average every two years. Each generation writes data to cartridges of its own generation and of the immediately prior generation. As a result, organizations using this format must upgrade to new drives and migrate data from the old format to the new one roughly every two years.
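The cadence described above implies a predictable migration burden over a long retention period. A quick sketch, using the article’s two-year release interval and, as an example, the 30-year manufacturing retention requirement cited earlier (the worst case of one migration per generation is an assumption for illustration):

```python
# Worst-case count of drive upgrades / data migrations implied by a
# fixed retention period, given a new tape generation every ~2 years
# and write compatibility reaching only one generation back.
RELEASE_INTERVAL_YEARS = 2

def migrations_over(retention_years, interval=RELEASE_INTERVAL_YEARS):
    """One migration per generation released during the retention window."""
    return retention_years // interval

print(migrations_over(30))  # 15 migrations over a 30-year retention period
```

Even halving that estimate (migrating every other generation) still leaves several full-archive migrations, each with its own labor cost and risk of data corruption.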
Like any magnetic tape solution, LTO is also susceptible to degradation due to heat, humidity, dust, mishandling, electromagnetic forces and ordinary wear. In fact, one manufacturer recommends removing the media when not in use to reduce exposure to dust.
With each LTO drive costing about $10,000, LTO’s initial cost is substantially higher than that of an $80 1TB HDD, while the media cost is about the same as that of a Blu-ray disk.
In the past, both HDD and LTO tape offered higher capacities than Blu-ray; however, a single Blu-ray library with power-down technology can offer capacities that range from 1.2TB (per magazine) to hundreds of terabytes of archival storage. Blu-ray manufacturers also offer a roadmap to the future, with advancements this decade expected to deliver 10 times the capacity and five times the data throughput.
Status Quo is an Option, but is it the Best Approach for Your Organization?
In deciding whether to maintain the status quo for your data archiving approach, compare your current system with the five characteristics of a true, long-term data archiving solution:
- Low Power Consumption;
- Low TCO;
- Long-term Data Integrity;
- Long-term Data Availability with Easy Access; and
- Full Compliance with Regulatory Mandates.
Then, look at the benefits that go directly to your bottom line:
- Lower Utility Bills;
- Less Time Expended on Maintenance;
- Increased Productivity/Improved Customer Relations; and
- Enhanced Ability for Building Future Trend Analysis.
The need for a sustainable, true data archiving solution should be loud and clear.
Yasuhiro Tai is General Manager, Special Projects at AVC Networks Company, Panasonic Corporation.