Cloud Storage
How Hybrid Storage Solutions Address Critical Storage Challenges
By Joel Christner
Over the last two decades, we have watched Microsoft server platforms evolve from providing basic infrastructure services to become mission-critical platforms. Applications such as Microsoft Exchange Server have become central to the ability of the business to keep operating, and Microsoft SharePoint is starting to follow the same path. As more users rely upon these applications, the amount of storage capacity that must be managed to support them grows unabated. Most companies still struggle with storage management – not just the initial deployment of the projected amount of capacity, but all of the operational aspects that go along with it. Storage requires power, data center raised floor space, cooling, and most importantly, tender loving care from IT.
It’s no wonder that the amount of storage capacity required for these applications continues to grow at unparalleled rates. These applications allow users to collaborate and share information. With Exchange and SharePoint, data is now being stored on expensive disks within arrays in the heart of the data center. Given the cost of traditional storage, one can see why the economics of traditional data center storage don’t converge with the capacity and financial requirements of a Microsoft application.
With the onset of cloud computing, and more specifically cloud storage, organizations see potential in being able to reduce cost and complexity for these key applications. The promise of capacity on-demand – being able to pay only for what is used and for what is transferred – is appealing, especially given that the price of (most) public cloud storage services is pennies on the dollar compared to traditional on-premises data center storage. That raises the question: “Why haven’t more organizations adopted cloud storage for on-premises applications to alleviate the bulk of the cost associated with these growing applications?” The answer is that most organizations:
- Want to take advantage of the cloud, but don’t know how. How do I evaluate a cloud storage service, and how do I measure its effectiveness and applicability?
- Are concerned about the security of their data. What happens if my provider loses a hard disk drive? What happens if one of my employees goes rogue?
- Are rightfully worried about the availability of their data and how that impacts their day-to-day operations. What happens if a cloud storage service is offline for a period of time?
- Are concerned about being locked into their cloud storage service. Is this Hotel California (where you check out, but never leave)?
- Have legitimate concerns about application performance if the application storage is in the cloud. Will the cloud storage service satisfy my workloads?
Yet another concern is “how can I get my application to use cloud storage at all?” Most of today’s on-premises Microsoft applications – including Exchange and SharePoint – expect to use block protocols to speak to storage. Cloud storage services predominantly speak only file protocols (CIFS, NFS) or APIs (RESTful HTTP-based, or SOAP). Since these applications expect block access to storage, introducing a cloud storage system to the application is like trying to have a conversation in Mandarin when you only speak Portuguese.
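To make the mismatch concrete, here is a minimal sketch in Python of the kind of translation involved: a volume that accepts reads and writes of numbered, fixed-size blocks and maps each one onto a keyed object. The names ObjectStore and BlockVolume, the key layout, and the 4 KB block size are assumptions made for illustration; an in-memory dictionary stands in for the cloud service, and no particular product or provider API is implied.

```python
class ObjectStore:
    """Stand-in for a cloud object service: whole objects are stored by key."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = bytes(data)

    def get(self, key):
        # Returns None when no object exists under this key.
        return self._objects.get(key)


class BlockVolume:
    """Presents numbered, fixed-size blocks on top of an object store."""

    def __init__(self, store, name, block_size=4096):
        self.store = store
        self.name = name
        self.block_size = block_size

    def _key(self, block_number):
        # Hypothetical key layout: one object per block.
        return f"{self.name}/block-{block_number:012d}"

    def write_block(self, block_number, data):
        if len(data) != self.block_size:
            raise ValueError("a block write must be exactly one block long")
        self.store.put(self._key(block_number), data)

    def read_block(self, block_number):
        data = self.store.get(self._key(block_number))
        # Blocks that were never written read back as zeros, as on a fresh disk.
        return data if data is not None else bytes(self.block_size)


store = ObjectStore()
volume = BlockVolume(store, "exchange-db")
volume.write_block(7, b"\x01" * 4096)
```

The application only ever sees block numbers, and the object store only ever sees keys. A real device would add caching, batching, and error handling on top of this mapping.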
A new class of solutions – hybrid storage solutions – is emerging to help address these exact challenges. A hybrid storage solution is one that:
- Is deployed in the customer’s data center
- Provides servers with access to storage using protocols that they understand
- Speaks cloud storage service protocols, and virtualizes cloud storage into usable data center capacity
- Overcomes performance limitations associated with cloud storage
- Addresses security concerns associated with cloud storage
These solutions typically include integrated storage, with either complicated caching policies or simple automated data tiering (there are fundamental differences between the two) in conjunction with application-awareness to tune the system behavior according to the application’s needs. In a sense, these solutions tend to operate like any one of the storage arrays that already exist in the data center. The difference is that these devices sit between the servers and the cloud, and allow users to control where the data is stored.
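The distinction between caching and tiering can be sketched in a few lines of Python. This is one common way to draw the line, stated here as an assumption rather than a description of any vendor's design: a cache keeps local copies of data whose home is the cloud, so a local write can be ahead of the cloud until it is flushed, while a tiered volume keeps each block in exactly one place and moves it between tiers. The class names are hypothetical and plain dictionaries stand in for both local disk and the cloud.

```python
class WriteBackCache:
    """Local copies of cloud data; writes are 'dirty' until flushed."""

    def __init__(self, cloud):
        self.cloud = cloud      # dict standing in for the cloud store
        self.local = {}
        self.dirty = set()      # keys the cloud has not received yet

    def write(self, key, data):
        self.local[key] = data
        self.dirty.add(key)

    def flush(self):
        for key in sorted(self.dirty):
            self.cloud[key] = self.local[key]
        self.dirty.clear()

    def read(self, key):
        if key not in self.local:
            self.local[key] = self.cloud[key]   # cache miss: fetch a copy
        return self.local[key]


class TieredVolume:
    """Each block lives in exactly one tier: local disk or the cloud."""

    def __init__(self, cloud):
        self.cloud = cloud
        self.local = {}

    def write(self, key, data):
        self.local[key] = data
        self.cloud.pop(key, None)   # a new version supersedes any cloud copy

    def demote(self, key):
        # Move (not copy) a cold block from the local tier to the cloud tier.
        self.cloud[key] = self.local.pop(key)

    def read(self, key):
        if key in self.local:
            return self.local[key]
        return self.cloud[key]


cache_cloud = {}
cache = WriteBackCache(cache_cloud)
cache.write("blk1", b"new mail")
dirty_before_flush = set(cache.dirty)
cache.flush()

tier_cloud = {}
tiered = TieredVolume(tier_cloud)
tiered.write("blk1", b"old mail")
tiered.demote("blk1")
```

In the cache, the window between write and flush is where the device holds data the cloud does not. In the tiered volume there is no such window, but the local tier is the only copy of whatever has not been demoted.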
Going back to Exchange and SharePoint, these solutions help address the immediate requirement for adding capacity inexpensively. Aside from the cost of the solution hardware, the balance of the cost is based on how much capacity the users utilize in the cloud and how much data they transfer (rates vary from provider to provider). Many of these solutions also provide primary storage deduplication, which not only helps improve performance when using the cloud, but also minimizes the cloud cost (since only deduplicated data would ever be read from or written to the cloud). This also helps address some of the concerns with Exchange and SharePoint regarding storage efficiency. Redundant email attachments are stored in a more space-efficient manner, meaning lower storage capacity consumption. This is especially important because Microsoft removed the Single Instance Store for attachments in Exchange 2010, and because versioning and extended recycle bins in SharePoint keep multiple copies of similar content.
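As a rough illustration of why deduplication reduces both the capacity consumed and the data transferred, the Python sketch below splits data into fixed-size chunks, fingerprints each chunk with SHA-256, and uploads a chunk only if its fingerprint has not been seen before. The DedupStore class, the fixed 4 KB chunk size, and the dictionary standing in for the cloud are all assumptions for illustration; real products may chunk and fingerprint differently.

```python
import hashlib


class DedupStore:
    """Stores each unique chunk once and tracks how many bytes were uploaded."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.cloud_chunks = {}      # fingerprint -> chunk bytes
        self.bytes_uploaded = 0

    def put(self, data):
        """Store data and return its recipe: the ordered chunk fingerprints."""
        recipe = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            if fingerprint not in self.cloud_chunks:
                # Only chunks the cloud has never seen are transferred.
                self.cloud_chunks[fingerprint] = chunk
                self.bytes_uploaded += len(chunk)
            recipe.append(fingerprint)
        return recipe

    def get(self, recipe):
        """Rebuild the original data from its recipe."""
        return b"".join(self.cloud_chunks[fp] for fp in recipe)


# A 16 KB attachment made of four distinct 4 KB chunks.
attachment = b"".join(bytes([i]) * 4096 for i in range(4))

dedup = DedupStore()
recipe_alice = dedup.put(attachment)   # first mailbox: all four chunks upload
recipe_bob = dedup.put(attachment)     # second mailbox: nothing new to upload
```

Two mailboxes hold the same attachment, yet only one copy is stored and transferred. Because cloud charges follow capacity used and data moved, the second copy costs essentially nothing beyond its recipe.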
Further, many of these solutions allow users to control the encryption key, and it is never shared with the cloud provider. This increases confidence, as organizations don’t need to worry about what happens if someone gets hold of their data from the cloud, or if the cloud provider is requested to release the data – or hardware where data may reside – in support of investigation or litigation. When coupled with other security services provided by the cloud provider, including VPNs, role-based access control, multi-factor authentication, and so on, cloud storage can perform as well and be as secure as traditional on-premises storage for certain applications – such as Exchange and SharePoint.
In addition, these systems have application-awareness that helps to address specific storage-related issues that impact the ability of an organization to scale or manage an application. This can manifest as application-aware tiering, volume data location policies, and even plug-ins that integrate with frameworks provided by the application vendor.
Finally, these solutions can help simplify data protection. Having a virtually unlimited pool of storage sitting behind an on-premises hybrid storage appliance that provides data deduplication and encryption means that – with the right application hooks and validation with data protection software vendors – users can take not only crash-consistent but also application-consistent snapshots, and store them in a space-efficient, secure format on the cloud as an always-on backup. This means that instead of having to continually fetch tape from an off-site vault, users can simply use a cloud backup copy to perform a mailbox restore, object restore from a SharePoint site, or even recover the entire application.
While these solutions are still relatively nascent, many customers are enjoying tremendous results – including lower total cost of ownership, simplified backup and restore, better disaster recovery, and the ability to confidently scale SharePoint and Exchange without concern over performance, consistent experience, and cost. Having said that, anyone considering evaluating a hybrid storage solution in conjunction with cloud storage services should consider the following:
- Which cloud services does the solution work with? Does the vendor have established partnerships? What is the support process?
- Does the solution provide deduplication and compression?
- Does the solution allow users to use their own keys to encrypt data? Are they ever shared with the cloud provider?
- Does the solution integrate with frameworks for data protection? Which operating systems and frameworks does the vendor support, and which data protection platforms?
- Which applications does the solution support?
- Does the solution automatically adjust behavior based on application? Is there manual intervention or configuration required?
- If the solution provides caching as opposed to tiering, does the cache become ‘dirty’, i.e. where the device has data that the cloud does not?
- Does the solution provide integrated high-availability with no single point of failure, like the customer’s current storage arrays? If not, how is high-availability achieved?
Joel Christner is a chief scientist at StorSimple, where he is responsible for all aspects of product management, technical marketing, solution architecture, and competitive strategy. He has a Master of Science in Computer Science from Columbia University.

