With the introduction of Exchange 2010 at the end of 2009, Exchange Server gained a native feature for which organizations previously needed third-party products. The feature I am talking about is Exchange’s Personal Archives, Online Archives, or In-Place Archiving as it is called nowadays.
Archives were introduced at a time when Office 365 was in its early days and many organizations were running Exchange on-premises with mailbox quotas, as bandwidth and storage were limited or relatively expensive. It was up to end users to make sure their mailbox remained within its limits, either by removing old or large items, or by moving them out of their mailbox to those pesky .PST files.
Archives introduced benefits such as lowering disk footprint by taking infrequently used items out of the primary mailbox (which then could only synchronize in full) to the archive, which is basically an additional mailbox for long-term storage. Exchange’s built-in Messaging Records Management (MRM) through retention policies and tags can be used for automatic moving of older items to the archive.
Archives also come with a few downsides, especially in the early days. The most notable are perhaps clients not supporting archives at all, or searches not spanning both mailbox and archive. Also, and this is not to be underestimated, end users do not always grasp the concept of archives and the impact on the tasks and tools they use. It’s not uncommon to see people panicking about “missing data” in service tickets, only to discover after some digging that their “missing data” was moved to their archive by the company retention policy.
In recent years, I have seen archives become less relevant, with organizations adopting the large mailbox concept in favor of lean and mean mailboxes with archives. There are still exceptions of course, usually in the form of substantial (and usually shared) mailboxes. For those, staying with Exchange Online archives, and auto-expanding archives when needed, is usually still an option due to the different type of mailbox interaction, or to stay clear of Exchange’s storage limitations or the size of Outlook for Desktop’s offline cache files before issues arise. The maximum number of items per folder is one such limit, although these limits have been raised or done away with in recent years. Non-stubbing third-party archive solutions that take data out of Exchange can also be an option.
Switching to the large mailbox concept creates a problem for those organizations that have already enabled in-place archives for their end users: how to get the data back from those archives to the primary mailbox. While retention policies can move data from the primary mailbox to the archive, there is no such thing as a reverse retention policy. Also, not every organization would like to instruct end users to unarchive this content themselves, as it is prone to failure, blocks Outlook for Desktop from doing anything else, and might result in abandoned operations, which limit future actions as moves are still happening in the background.
When investigating a possible solution, I found that there is no other way to accomplish this than to programmatically move contents from the in-place archive to the primary mailbox. While there is an ‘archive’ operation for mailbox items (which moves them to the assigned Archive folder, not the in-place archive), there is no single API call to perform this task. Also, the solution would have to use Exchange Web Services, as a limitation in Microsoft Graph makes it incapable of moving messages between multiple mailboxes.
Note: If I overlooked something in this area, please let me know.
To help organizations accomplish this task, I wrote a PowerShell script which requires the following:
- Exchange Server 2013 SP1 or later, or Exchange Online.
- Exchange Web Services (EWS) Managed API 2.21 or later (how to, NuGet package exchange.webservices.managed.api).
- When using OAuth, the MSAL library is required (NuGet package Microsoft.Identity.Client). Also, you need to have registered an App in Azure Active Directory; the Tenant ID, Application ID and certificate or secret are what you need to provide the script with to operate successfully.
- As an alternative to installing the NuGet packages, you can also store the DLLs in the same folder as the script.
Note: Untested with Primary mailboxes on-premises and Exchange Online Archives.
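To illustrate the option of keeping the DLLs next to the script, here is a minimal sketch of how such a lookup could work. It is not taken from Invoke-Unarchive itself: the helper name Find-LocalAssembly is hypothetical, and I am assuming the standard file names Microsoft.Exchange.WebServices.dll and Microsoft.Identity.Client.dll.

```powershell
# Hypothetical helper (not part of Invoke-Unarchive): return the full path of
# an assembly if it exists in the given folder, otherwise $null.
function Find-LocalAssembly {
    param(
        [Parameter(Mandatory)] [string]$FileName,
        [Parameter(Mandatory)] [string]$SearchRoot
    )
    $candidate = Join-Path -Path $SearchRoot -ChildPath $FileName
    if (Test-Path -LiteralPath $candidate -PathType Leaf) {
        return (Resolve-Path -LiteralPath $candidate).Path
    }
    return $null
}

# $PSScriptRoot is empty in an interactive session, so fall back to the current folder.
$root = if ($PSScriptRoot) { $PSScriptRoot } else { (Get-Location).Path }

# Assumed file names of the EWS Managed API and MSAL assemblies.
foreach ($dll in 'Microsoft.Exchange.WebServices.dll', 'Microsoft.Identity.Client.dll') {
    $path = Find-LocalAssembly -FileName $dll -SearchRoot $root
    if ($path) {
        Add-Type -Path $path
        Write-Verbose "Loaded $dll from $path"
    }
    else {
        Write-Warning "$dll not found in $root; install the NuGet package instead."
    }
}
```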
The script Invoke-Unarchive will perform the following tasks:
- Invoke-Unarchive will move contents from the in-place archive back to the primary mailbox.
- The most efficient operation will be chosen per folder:
- Folders present in archive but not in primary mailbox will be moved in one operation.
- Folders present in archive and primary mailbox are merged. Items in those folders are moved in batches.
- The same steps are repeated recursively per folder for the whole archive.
- If, after moving, a folder in the archive is empty, and it is not a non-removable well-known folder, it will be removed.
- Optionally, Invoke-Unarchive can also move contents stored in the Recoverable Items from the archive to the primary mailbox.
- Invoke-Unarchive will handle throttling, either by honoring the returned back-off period or by adding delays between operations.
- Moving items is asynchronous, and Invoke-Unarchive needs to wait for Exchange to complete the previous move to folder X before it can move the next set of items to folder X.
Do not forget to reassign retention policies causing archival, or you might have to run the script again at a later moment.
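The move-or-merge decision described above can be sketched with plain in-memory objects. This is only an illustration of the logic, not the actual code of Invoke-Unarchive: New-TestFolder and Get-UnarchivePlan are hypothetical names, the batch size of 100 is an arbitrary assumption, and no EWS calls are made. Removing emptied folders and waiting for asynchronous moves are left out.

```powershell
# In-memory stand-in for a mailbox folder; a real run would use EWS folder objects.
function New-TestFolder {
    param([string]$Name, [int]$ItemCount = 0, [object[]]$Children = @())
    [pscustomobject]@{ Name = $Name; ItemCount = $ItemCount; Children = $Children }
}

# Hypothetical planner: lists the operations needed to empty $ArchiveFolder
# into $PrimaryFolder, without performing them.
function Get-UnarchivePlan {
    param(
        [Parameter(Mandatory)] $ArchiveFolder,
        [Parameter(Mandatory)] $PrimaryFolder,
        [int]$BatchSize = 100,
        [string]$Path = ''
    )
    $label = if ($Path) { $Path } else { '/' }

    # The folder exists on both sides, so its items are moved in batches.
    for ($offset = 0; $offset -lt $ArchiveFolder.ItemCount; $offset += $BatchSize) {
        $size = [Math]::Min($BatchSize, $ArchiveFolder.ItemCount - $offset)
        [pscustomobject]@{ Action = 'MoveItems'; Path = $label; Items = $size }
    }

    foreach ($child in $ArchiveFolder.Children) {
        $childPath = "$Path/$($child.Name)"
        $existing = $PrimaryFolder.Children |
            Where-Object { $_.Name -eq $child.Name } |
            Select-Object -First 1

        if ($null -eq $existing) {
            # Not present in the primary mailbox: move the whole folder at once.
            [pscustomobject]@{ Action = 'MoveFolder'; Path = $childPath; Items = $child.ItemCount }
        }
        else {
            # Present on both sides: merge, repeating the same steps one level down.
            Get-UnarchivePlan -ArchiveFolder $child -PrimaryFolder $existing -BatchSize $BatchSize -Path $childPath
        }
    }
}

# Example: Inbox exists on both sides, 'Projects 2015' only in the archive.
$archive = New-TestFolder -Name 'Root' -Children @(
    (New-TestFolder -Name 'Inbox' -ItemCount 250),
    (New-TestFolder -Name 'Projects 2015' -ItemCount 40)
)
$primary = New-TestFolder -Name 'Root' -Children @(
    (New-TestFolder -Name 'Inbox' -ItemCount 10)
)
$plan = @(Get-UnarchivePlan -ArchiveFolder $archive -PrimaryFolder $primary -BatchSize 100)
$plan | Format-Table -AutoSize
```

In this example the Inbox is merged in three batches (100, 100 and 50 items), while the folder that only exists in the archive is moved in a single operation.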
The parameters to call Invoke-Unarchive.ps1 are:
- Identity to specify one or more mailboxes to unarchive items for.
- Server to specify the FQDN of the Client Access Server to use. When omitted, Autodiscover will be used.
- IncludeRecoverableItems to instruct the script to process deletions stored in the Recoverable Items as well.
- Impersonation to use impersonation when accessing the mailbox. When using modern authentication (OAuth), impersonation is mandatory.
- Force to force moving of items without prompting.
- NoProgressBar to suppress the progress bar.
- TrustAll to accept all certificates including self-signed certificates.
- TenantId specifies the ID of the Tenant when using a mailbox hosted in Exchange Online.
- ClientId to specify the Application ID of the registered application in Azure Active Directory.
- Credentials to specify the Basic Authentication credentials for on-premises usage or against Exchange Online when OAuth is not an option.
- CertificateThumbprint is the thumbprint of the certificate to use for OAuth. The certificate with the public key needs to be stored with the registered application for authentication. The certificate with the private key should be present in the local certificate store.
- CertificateFile and CertificatePassword to specify the file of the certificate to use. The file should contain the private key, and the password to unlock the file can be specified using CertificatePassword.
- Secret can be used to specify the secret to authenticate using the registered application.
Note that Credentials, CertificateThumbprint, CertificateFile + CertificatePassword and Secret are mutually exclusive.
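In PowerShell, this kind of mutual exclusivity is typically expressed with parameter sets. The sketch below shows the idea; it is not the actual param block of Invoke-Unarchive. The function name, the set names and the parameter types (plain strings, for brevity) are assumptions of mine.

```powershell
# Hypothetical function showing how parameter sets keep the authentication
# options mutually exclusive. It only reports which set was selected.
function Test-AuthParameterSet {
    [CmdletBinding(DefaultParameterSetName = 'BasicAuth')]
    param(
        [Parameter(Mandatory)]
        [string[]]$Identity,

        [Parameter(Mandatory, ParameterSetName = 'BasicAuth')]
        [System.Management.Automation.PSCredential]$Credentials,

        [Parameter(Mandatory, ParameterSetName = 'OAuthCertThumb')]
        [string]$CertificateThumbprint,

        [Parameter(Mandatory, ParameterSetName = 'OAuthCertFile')]
        [string]$CertificateFile,

        [Parameter(ParameterSetName = 'OAuthCertFile')]
        [string]$CertificatePassword,

        # A plain string keeps the sketch short; a SecureString would be the safer choice.
        [Parameter(Mandatory, ParameterSetName = 'OAuthSecret')]
        [string]$Secret,

        # TenantId and ClientId are required for every OAuth variant.
        [Parameter(Mandatory, ParameterSetName = 'OAuthCertThumb')]
        [Parameter(Mandatory, ParameterSetName = 'OAuthCertFile')]
        [Parameter(Mandatory, ParameterSetName = 'OAuthSecret')]
        [string]$TenantId,

        [Parameter(Mandatory, ParameterSetName = 'OAuthCertThumb')]
        [Parameter(Mandatory, ParameterSetName = 'OAuthCertFile')]
        [Parameter(Mandatory, ParameterSetName = 'OAuthSecret')]
        [string]$ClientId
    )
    $PSCmdlet.ParameterSetName
}

# Returns the name of the set that PowerShell resolved from the arguments.
Test-AuthParameterSet -Identity 'user@contoso.com' -Secret 'dummy' -TenantId 'tenant' -ClientId 'app'
```

With this layout, combining for example Secret and CertificateThumbprint in one call fails during parameter binding, before any of the function's own code runs.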
The example below shows a run against a test mailbox using modern authentication (OAuth). The common parameter Verbose is used to display additional output.
.\Invoke-Unarchive.ps1 -Identity email@example.com -Server outlook.office365.com -Impersonation -Secret <Secret> -TenantId <Tenant> -ClientId <AppId> -Verbose
You can find the script on GitHub here.
The EWS operation, especially moving items, is not necessarily slow, but against Exchange Online processing large archives can take a considerable amount of time due to throttling. When moving a significant number of items using Outlook for Desktop, you will likely run into Outlook abandoning the operation, after which you need to wait for Exchange to finish pending moves before you can continue with this task. Using the script, you can take this unarchiving task away from end users by running the operation in the background in one or multiple runs.
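To give an idea of what honoring a back-off period looks like, here is a minimal retry sketch. It is an illustration only, not the script's actual implementation. Invoke-WithBackOff and FakeBusyException are hypothetical names; the fake exception mimics the BackOffMilliseconds property that the EWS Managed API exposes on its ServerBusyException, so the example runs without Exchange.

```powershell
# Stand-in for a throttling error; only the BackOffMilliseconds property matters here.
class FakeBusyException : System.Exception {
    [int]$BackOffMilliseconds
    FakeBusyException([int]$ms) : base('Server busy') {
        $this.BackOffMilliseconds = $ms
    }
}

# Hypothetical wrapper: runs $Operation and, when it fails with an exception
# carrying BackOffMilliseconds, waits that long before trying again.
function Invoke-WithBackOff {
    param(
        [Parameter(Mandatory)] [scriptblock]$Operation,
        [int]$MaxAttempts = 5
    )
    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        try {
            return (& $Operation)
        }
        catch {
            $backOff = $_.Exception.PSObject.Properties['BackOffMilliseconds']
            # Rethrow anything that is not a throttling error, or when out of attempts.
            if ($null -eq $backOff -or $attempt -eq $MaxAttempts) { throw }
            $delay = [int]$backOff.Value
            Write-Verbose "Throttled on attempt $attempt, waiting $delay ms"
            Start-Sleep -Milliseconds $delay
        }
    }
}

# Example: the operation is throttled twice and succeeds on the third attempt.
$state = @{ Calls = 0 }
$result = Invoke-WithBackOff -Operation {
    $state.Calls++
    if ($state.Calls -lt 3) { throw [FakeBusyException]::new(50) }
    'moved'
}
```

Errors that do not carry a back-off value are passed on unchanged, so genuine failures are not retried.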