Restoring a personal archive from backup

Many articles discuss recovering one or more mailboxes from backup, but few explain how to recover personal archives (and no, depending on your backup solution, recovering the mailbox doesn’t necessarily recover the personal archive). I’d like to show you how to restore a personal archive using standard Exchange 2010 SP1 functionality and a backup, meaning we won’t use the dumpster or a lagged copy.

For our example we’ll need an archive-enabled mailbox:

image

Disaster strikes and you need to perform a full recovery of the personal archive. For completeness, I’ll briefly describe how to restore a backup and how to create and mount the recovery database.

First, restore the database and logs from backup (you do have a backup, right?) to an alternative location. In this example, the path of the restored database is <RecoveryDBPath> and the path of the restored logs is <RecoveryLogsPath>.

Second, create a recovery database using the following cmdlet:

New-MailboxDatabase -Recovery -Name RecoveryDB -Server <ServerID> -EdbFilePath <RecoveryDBPath> -LogFolderPath <RecoveryLogsPath>

Before you can mount the recovery database, you may need to bring it into a clean shutdown state. This means all logs need to be replayed, for which we use ESEUTIL in recovery mode (/r). Note that ESEUTIL’s /d switch expects the folder containing the database file rather than the .edb file itself. The command looks as follows, where <PREFIX> is the log file prefix used by the database, e.g. “E00”:

ESEUTIL /r <PREFIX> /l "<RecoveryLogsPath>" /d "<RecoveryDBPath>"

image
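To check whether the replay left the database in a clean state, you can dump the database header with ESEUTIL /mh and look at the State field. A minimal sketch, assuming the restored database file is named <DatabaseFile>.edb (a placeholder, not a name taken from this article):

```powershell
# Dump the header of the restored database and filter the output on the State field.
# <RecoveryDBPath> and <DatabaseFile> are placeholders for your own restore location.
eseutil /mh "<RecoveryDBPath>\<DatabaseFile>.edb" | Select-String "State:"
```

The database is ready to mount when the header reports Clean Shutdown; Dirty Shutdown means there are still logs to replay.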

Next, mount the database using the Exchange Management Shell as follows:

Mount-MailboxDatabase RecoveryDB
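Once the recovery database is mounted, it can be worth confirming that the archive you are after is actually in it. The following is a sketch rather than a required step; it assumes the IsArchiveMailbox property is exposed by Get-MailboxStatistics on your Exchange 2010 SP1 build, and <UserID> is a placeholder:

```powershell
# List the mailboxes held in the recovery database and flag which entries are personal archives.
# IsArchiveMailbox is assumed to be available on your build.
Get-MailboxStatistics -Database RecoveryDB | Format-Table DisplayName, MailboxGuid, IsArchiveMailbox -AutoSize

# Show the archive GUID of the user whose archive you want to restore.
Get-Mailbox <UserID> | Format-List Name, ArchiveGuid, ArchiveName
```

The MailboxGuid of the archive entry in the recovery database should match the user’s ArchiveGuid, which is the value passed as SourceStoreMailbox in the next step.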

Now it’s time to restore the personal archive, for which we’ll use the New-MailboxRestoreRequest cmdlet. The TargetIsArchive parameter specifies that the restored content should be stored in the personal archive associated with the target mailbox. The trick is to specify the mailbox’s ArchiveGuid as SourceStoreMailbox instead of the mailbox ID (yes, a SourceIsArchive option would be nice in the future, so we wouldn’t need to fetch the ArchiveGuid first). Given this information, use the following command to restore UserID’s personal archive:

Get-Mailbox <UserID> | % { New-MailboxRestoreRequest -SourceDatabase RecoveryDB -SourceStoreMailbox $_.ArchiveGuid -TargetMailbox $_.Identity -TargetIsArchive }

image

This fetches UserID’s mailbox first and passes its ArchiveGuid and Identity to the New-MailboxRestoreRequest cmdlet. Note that, unlike Restore-Mailbox, you can’t filter on subject, timeframe, etc. You can, however, optionally specify TargetRootFolder to restore content into a separate folder (otherwise content is merged into the existing folders, as you might expect).
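As an illustration of restoring into a separate folder: on New-MailboxRestoreRequest the parameter is named TargetRootFolder. The folder name "Restored Archive" below is only an example, and <UserID> is a placeholder:

```powershell
# Same restore request as above, but the content is placed under a separate top-level folder
# in the personal archive instead of being merged into the existing folders.
# "Restored Archive" is a hypothetical folder name; pick your own.
Get-Mailbox <UserID> | % { New-MailboxRestoreRequest -SourceDatabase RecoveryDB -SourceStoreMailbox $_.ArchiveGuid -TargetMailbox $_.Identity -TargetIsArchive -TargetRootFolder "Restored Archive" }
```

Restoring into a separate folder makes it easier to inspect the recovered items before moving them back into place.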

The restore request is queued and you can monitor progress using Get-MailboxRestoreRequest. When the restore has finished successfully, the status will be set to Completed.

image
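For more detail than the status alone, the requests can be piped to Get-MailboxRestoreRequestStatistics. A minimal sketch; the selected columns are just one reasonable choice:

```powershell
# Show name, status and progress percentage for every restore request.
Get-MailboxRestoreRequest | Get-MailboxRestoreRequestStatistics | Format-Table Name, Status, PercentComplete -AutoSize
```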

Now let’s take a look at the mailbox’s personal archive again:

image

Once you have verified that everything is restored, you can remove the completed restore request using Remove-MailboxRestoreRequest. For example, to remove all completed restore requests, combine Get-MailboxRestoreRequest with Remove-MailboxRestoreRequest as follows:

Get-MailboxRestoreRequest | where { $_.Status -eq "Completed" } | Remove-MailboxRestoreRequest -Confirm:$false

The above procedure is great for restoring a single personal archive, but you can also use it to recover multiple personal archives by piping a collection of mailboxes to New-MailboxRestoreRequest as shown above, e.g.

Get-Mailbox -Database <DatabaseID> | where { $_.Name -like "p*" } | % { New-MailboxRestoreRequest -SourceDatabase RecoveryDB -SourceStoreMailbox $_.ArchiveGuid -TargetMailbox $_.Identity -TargetIsArchive }

This selects all mailboxes on a single database (which makes sense, since the recovery database only contains the backup of a single database) and filters the selection on users with a Name starting with “p”. Those users’ personal archives will be restored from RecoveryDB.
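When restoring in bulk, you may want to skip mailboxes that have no personal archive, since they have nothing to restore. The variation below is a sketch that assumes such mailboxes report an all-zero ArchiveGuid, and that the archives in question live in the database that was restored:

```powershell
# Only create restore requests for mailboxes that actually have a personal archive.
# Assumption: mailboxes without an archive expose an empty (all-zero) ArchiveGuid.
Get-Mailbox -Database <DatabaseID> | where { $_.ArchiveGuid -ne [Guid]::Empty } | % { New-MailboxRestoreRequest -SourceDatabase RecoveryDB -SourceStoreMailbox $_.ArchiveGuid -TargetMailbox $_.Identity -TargetIsArchive }
```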

LUN design and Hardware VSS

I was asked why you need to design separate LUNs for Exchange database and log files when using a hardware-based Volume Shadow Copy Service (VSS) backup solution, as mentioned in this TechNet article:

To deploy a LUN architecture that only uses a single LUN per database, you must have a database availability group (DAG) that has two or more copies, and not be using a hardware-based Volume Shadow Copy Service (VSS) solution.

The reason for this requirement is that hardware VSS solutions operate at the hardware level, i.e. on the complete LUN. Therefore, if you put the Exchange database and log files on a single LUN, the snapshot will always cover the whole LUN. This restricts your recovery options, since by definition you can only restore that complete LUN, overwriting any log files created after the snapshot was taken. So, changes (log files) made after the snapshot are lost and you have no point-in-time recovery options.

For example, with the database and log files on a single LUN, suppose you create a full backup on Saturday 6:00. Then, disaster strikes on Monday. By definition, you can now only restore the database and log files as they were on Saturday 6:00; log files which were created after Saturday 6:00 are lost.

With the database and log files on separate LUNs, you can restore the database LUN, which leaves the LUN with the log files intact. Then, after restoring the database, you can start replaying log files.
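If you need to replay the logs manually, the same ESEUTIL recovery mode used for the recovery database applies here. This is a sketch under assumptions: <PREFIX>, <LogsLUNPath> and <DatabaseLUNPath> are placeholders, and your backup application may already perform this step as part of its restore job:

```powershell
# Replay the intact log files from the log LUN into the database restored on the database LUN.
# <PREFIX> is the log file prefix of the database, e.g. E00.
eseutil /r <PREFIX> /l "<LogsLUNPath>" /d "<DatabaseLUNPath>"
```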

So, keep this in mind when planning your Exchange LUNs in conjunction with the backup solution to be used. Note that the Mailbox Role Calculator supports this decision by letting you specify Hardware or Software VSS Backup/Restore as the Backup Methodology to be used.

If you’re interested in more background information on how VSS works, I suggest you check out this TechNet article.

Note: This blog has also been published on Exchange fellow Jaap Wesselius’ ExchangeLabs blog here.

JBOD versus RAID

After the arrival of Exchange 2010 and its DAG feature, many people (Microsoft included) suggested running Exchange on low-cost SATA disks in a JBOD (Just a Bunch Of Disks) configuration, provided you have at least 3 database copies. As you probably know, DAGs enable you to have multiple copies of mailbox databases running on multiple servers, with a configurable lag per copy.

This suggestion to use JBOD, as well as the discussion of going backupless or not, isn’t without controversy. For many years people have learned to put their (critical) data on redundant storage. With Exchange 2010 this dogma is said to have changed, because contrary to its predecessors, Exchange 2010 can happily run fully supported on low-cost SATA storage in JBOD configuration. The argument used is that because you have at least 3 copies on 3 different physical servers you can survive a single failed mailbox server (likely) but also two failed mailbox servers (unlikely?).

The first problem is underestimating the limits of relying on 3+ copies. Nobody expects the Spanish Inquisition, er, 3 failures, right? But what if your hardware (e.g. disks) comes from a faulty batch or contains buggy firmware, and you equipped all three of your physical servers with those parts (something burn-in tests should bring to light, but who does those these days)? I know of several occasions where drives died in pairs within hours of each other; you’d better make haste recovering your mailboxes then. Is this perhaps a reason to look at different vendors and different parts? After all, this is more or less the same reason many businesses require multi-vendor products where reliability and security are concerned, e.g. anti-virus products from different vendors. The idea is to spread the risk.

Also, being able to use JBOD (and go backupless) looks interesting on paper, but don’t forget that, as suggested, you need at least 3 physical servers (and no, don’t run them virtually on the same host). In the end, fewer servers with RAID may be the most cost-effective alternative when looking at the total picture, e.g. hardware, licenses (OS, Exchange, AS/AV agents, management software, etc.) and operational costs. Why run (and maintain) many servers when the benefits don’t outweigh the additional costs?

A third element of the JBOD versus RAID discussion is the time needed to return to the original situation. When one of the servers fails, you have to rebuild it (hope you have some spare parts lying around or a decent replacement service contract). After rebuilding (or restoring) the server, you need to reseed the database copies, which may take a long time depending on the size of the databases. Replacing a hard disk and rebuilding a RAID set is much quicker, much easier and less prone to error.
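For reference, reseeding a database copy on the rebuilt server could look like the sketch below. <DatabaseID> and <ServerID> are placeholders for the database and the server hosting the copy, and the exact steps depend on the state the copy is in:

```powershell
# Suspend replication for the copy that needs reseeding.
Suspend-MailboxDatabaseCopy -Identity "<DatabaseID>\<ServerID>" -Confirm:$false

# Reseed the copy from the active database, discarding any leftover files on the target.
Update-MailboxDatabaseCopy -Identity "<DatabaseID>\<ServerID>" -DeleteExistingFiles

# Check the copy status and queue lengths afterwards.
Get-MailboxDatabaseCopyStatus -Identity "<DatabaseID>\<ServerID>" | Format-List Name, Status, CopyQueueLength, ReplayQueueLength
```

With large databases the Update-MailboxDatabaseCopy step is where most of the time goes, which is exactly the recovery window discussed above.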

In the end, as always, the choice should be based on business requirements. Perhaps your business can do without e-mail for a few hours while IT is recovering services (can’t imagine, but you never know). In that case it’s nice to have a supported low-cost JBOD/SATA option. In my opinion, the benefits of a proper RAID setup outweigh the trouble you have to go through when repairing a JBOD-based solution. Depending on these requirements, and how deep your pockets are, I’d go for a combination of RAID and DAG, where RAID provides availability at the disk level and DAG provides availability at the server level (same data center) or resilience (multi data center, i.e. disaster recovery).

Oh, and one other thing: if you must use SATA, use proper “Enterprise class” SATA disks; they’re made to run 24×7.