Protecting E-Mail by Software Replication

By Jason Buffington

There are many reasons to protect Microsoft Exchange. In fact, one could probably devote an entire article to building the case for Exchange protection. Instead, let’s list a few “whys” and move on to “how”.


- It could be argued that no application touches as many parts of an organization as Exchange. From the shipping room to the executive boardroom, almost every job function has some level of dependency on e-mail. Hence, when the e-mail server is unavailable, the entire organization is affected.

- With regulations like Sarbanes-Oxley, as well as those pertaining to financial and healthcare institutions, retaining e-mail is becoming a professional and ethical responsibility. Other laws, such as E-SIGN, give electronic agreements the same validity as written contracts.

- And finally, while the above two examples are “internal”, most companies today also rely on e-mail to do business externally. From distributing information across time zones to coordinating a lunch location, e-mail is now often the most critical form of business communication.

So, the question becomes “How can I effectively and affordably protect Exchange?” Before considering solutions, one should first understand the difficulties around protecting Microsoft Exchange.

  • Exchange data is held in multiple directories with extremely large interdependent files. In even the simplest configurations, tens to hundreds of mailboxes can be stored in a single “information store” file.
  • Exchange data files are constantly in use and remain open by the application. Even if the files could be periodically closed, the 24x7 use of e-mail requires them to be accessible all of the time.
  • Together, the two facts above mean that traditional backup requires a “backup window” and specialized, typically expensive software (called backup agents) that can look inside the files.
  • And to make matters more complex, the current versions of Microsoft Exchange (2000 and 2003) are dependent on Windows Active Directory. This means that other, external information must also be protected in order to ensure the resilience of one’s e-mail system.

Collectively, it is safe to say that Microsoft Exchange is one of the most difficult applications to back up. For that reason, many IT administrators have started looking at alternatives for Microsoft Exchange protection and availability.

From a “protection” perspective, tape backup is assumed. However, once one measures the time and effort consumed by backup windows and tape restores, we are forced to concede that tape backup alone is insufficient. Tape backup typically occurs only nightly, so a failure could cost up to an entire day of data. In the case of e-mail, much of that lost data is unrecoverable. And during times of crisis and restoration, recovery from tape is generally measured in hours.

Some assume that the only other available technology is synchronous mirrored storage hardware. Instead of attempting to back up or protect the Exchange data from an application perspective (which brings all of the complexities mentioned earlier), some IT administrators simply protect the storage. By providing a second storage solution and allowing the storage fabric to maintain synchronization, the data can be protected.

The positive aspect of protecting the storage (and not the application) is that the solution becomes application independent. By protecting the storage, we can protect every application with the same functionality, without being limited by “agents for Exchange” or for any other application.

The negatives of synchronous storage revolve mostly around cost: the two storage arrays, plus the fabric, controllers and synchronization software. Then add the cost of a “storage manager” or other individual with specialized storage skills. On top of that, for any real distance one must also add the cost of bandwidth, which is considerable when pushing whole blocks around, because synchronous replication depends on a fast acknowledgment.

So the majority of us find ourselves stuck somewhere in between. We recognize that nightly tape backup is not adequate for protecting one of our most critical applications, but we can’t afford synchronous hardware. Perhaps this is why a continually growing number of companies are deploying host-based replication software.

  • In comparison to tape backup, which occurs only nightly, host-based replication software transmits changes to all the Exchange files in real time. The target copy is always only seconds behind.
  • It offers similar benefits to synchronous hardware in that it is application independent.

As it is a software-only solution, one might say that replication software “protects like synchronous disk, with costs comparable to or less than tape”. There is probably a little literary license on both sides of that phrase, but you get the idea.

Here’s how it works: Exchange (5.5, 2000 or 2003; the version doesn’t matter) talks to the O/S. The O/S interacts with the file system, which eventually hands the request to the storage hardware. Most replication software acts within the file system layer via a filter driver, which monitors and captures all change instructions to the files and replicates those changes to the target. Because the filter driver works at the file system level, completely divorced from the application, the replication has the “application independence” that was discussed earlier. Microsoft SQL Server, Oracle and file server home directories are protected the same way.

Then, while the production server continues to service storage requests, a copy of the file instruction is transmitted (or “replicated”) to the target server. Since the target server already has the original data file (from an initial mirror operation performed by the replication software), only the actual file changes need to be transmitted and applied to the target. That fact alone dramatically affects the cost of the solution. Hardware approaches tend to work at the block level (e.g., 64 KB blocks), so even if only a 12-byte string actually changes, the entire 64 KB block must be transmitted, which raises both bandwidth costs and latency. Byte-level replication achieves lower latency and does so at a significantly lower operating cost, because it uses less bandwidth.
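To make the byte-level idea concrete, here is a minimal sketch in Python. It is a toy model, not how any particular replication product is implemented: the `WriteOp` record, the helper functions and the in-memory "files" are all hypothetical names invented for illustration, and protocol overhead (headers, acknowledgments) is ignored. It replays one captured write on a target copy and compares the payload sent under byte-level and 64 KB block-level replication.

```python
from dataclasses import dataclass

BLOCK_SIZE = 64 * 1024  # 64 KB, the example block size used in the text


@dataclass(frozen=True)
class WriteOp:
    """One captured change instruction (hypothetical): write `data` at `offset`."""
    offset: int
    data: bytes


def apply_write(copy: bytearray, op: WriteOp) -> None:
    """Apply a write to an in-memory stand-in for a data file."""
    end = op.offset + len(op.data)
    if end > len(copy):
        copy.extend(b"\x00" * (end - len(copy)))  # the file grew
    copy[op.offset:end] = op.data


def byte_level_cost(op: WriteOp) -> int:
    """Payload bytes sent when only the changed bytes are replicated."""
    return len(op.data)


def block_level_cost(op: WriteOp, block_size: int = BLOCK_SIZE) -> int:
    """Payload bytes sent when every touched block is sent whole."""
    if not op.data:
        return 0
    first = op.offset // block_size
    last = (op.offset + len(op.data) - 1) // block_size
    return (last - first + 1) * block_size


# Initial mirror: source and target start out identical.
source = bytearray(256 * 1024)
target = bytearray(source)

# The application changes a 12-byte string on the source...
op = WriteOp(offset=100_000, data=b"Hello, world")
apply_write(source, op)
# ...and the same instruction is replayed on the target.
apply_write(target, op)

print(source == target)      # True
print(byte_level_cost(op))   # 12
print(block_level_cost(op))  # 65536
```

In this toy case the block-level transfer carries more than 5,000 times the payload of the byte-level one for the same 12-byte change. A write that straddles a block boundary would touch two blocks and double the block-level figure.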

Let’s wrap this up with a few summary ideas:

  • We can all agree that Exchange (or e-mail in general) is a critical component of most infrastructures. Its protection is important for productivity, as well as perhaps regulatory compliance and simply the perceived reliability of “the network” among users.
  • Exchange is complex to protect if you focus on the inside of the application (as a backup agent does). For some, protecting the physical storage is an option; however, that adds the significant cost and complexity of mirrored storage.
  • One answer to consider is software-level replication. Protecting files (whether they are Exchange, Microsoft SQL Server, Oracle or even home directories) is easier than protecting applications. Handling that protection with host-based software is significantly cheaper than a proprietary storage solution.

Ten years ago, most people would not have expected electronic mail to be so prevalent. But because people were open-minded about new solutions and aware of their benefits, e-mail is the standard for correspondence today. Similarly, one might not have expected software replication, rather than tape, to become a standard for data protection. Today, both are a reality.

Jason Buffington is a Certified Business Continuity Planner, a Microsoft MCT/MCSE, and the director of business continuity for NSI Software (Hoboken, NJ).

nsisoftware.com

SIDEBAR

Imagine a scenario where an Exchange server fails at 4 p.m. on Tuesday afternoon.

As it is late in the day, replacement components may not be available until Wednesday. The repairs take place Wednesday night, and Monday evening’s tape is restored. When the users come in on Thursday morning (after not having e-mail on Wednesday), they will find everything as it was Monday night.

This scenario is not Exchange specific. In fact, regardless of the application, whether it is e-mail, a database or home directories, tape backup and restore solutions must measure their data loss and recovery windows in “days”. This is because backup typically occurs only once per night. To achieve better results, we must look at non-tape strategies.
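The sidebar's timeline can be worked through as simple arithmetic. The sketch below is illustrative only: the calendar dates are arbitrary (chosen so the weekdays line up), and the 11 p.m. backup time and 8 a.m. return of service are assumptions, since the scenario gives only "Monday evening" and "Thursday morning".

```python
from datetime import datetime


def hours_between(start: datetime, end: datetime) -> float:
    """Elapsed time from start to end, in hours."""
    return (end - start).total_seconds() / 3600


# Arbitrary week; only the weekday and time of day matter.
last_backup = datetime(2005, 1, 3, 23, 0)  # Monday night (11 p.m. is assumed)
failure = datetime(2005, 1, 4, 16, 0)      # Tuesday, 4 p.m.
restored = datetime(2005, 1, 6, 8, 0)      # Thursday morning (8 a.m. is assumed)

# Everything written after the last tape backup is lost.
data_loss_hours = hours_between(last_backup, failure)
# Users have no e-mail from the failure until service returns.
downtime_hours = hours_between(failure, restored)

print(data_loss_hours)  # 17.0
print(downtime_hours)   # 40.0
```

Under these assumptions, users lose 17 hours of mail and go 40 hours without service. With a replicated target that is only seconds behind, the data-loss figure would shrink to roughly the replication lag.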