Removing Duplicate Items from a Mailbox


powershellLatest version: 2.41, April 18th, 2023

For those involved with Exchange migration projects or managing Exchange environments, at some point you probably have experienced the situation where people ended up with duplicate items in their mailbox. Duplicate items can be caused by many things, but most common are:

  • Synchronization tools or plug-in. Entries from the mailbox are treated as new entries and as a consequence are added to the mailbox when synchronizing information back to the mailbox, creating duplicates. In the past, I’ve seen this happening with Nokia PC Suite and Google Apps Sync for example;
  • Importing existing data. Accidental import from – for example – a PST file to a mailbox  can lead to duplicate entries.

image

When looking for a solution, you’ll probably encounter MSKB299349, “How to remove duplicate imported items in Outlook”. This article describes a manual procedure to remove duplicates entries from your calendar, contacts, inbox or other folders. Not a very helpful and labor intensive.

When continuing your search, you’ll find lots (I mean lots!) of tools and Outlook add-ins, like Vaita’s DIR or MAPILab’s Duplicate Remover. Not all this software is free (some even require payment per duplicate removal of appointments, contacts or e-mail) and some might not even work (MAPI-based tools may not work against Exchange 2013).

When you finally have selected a tool, in most cases they require installation of a piece of software and someone to perform the removal process using the tool or Outlook with add-in. When you’re an Apple shop you’ll require different tools, unless you’re running a Windows desktop somewhere (I’ll just pretend I didn’t hear you saying ‘Why don’t you install the tool on the Exchange server’).

Wouldn’t it be nice if you’d have a PowerShell script you can conveniently run from any workstation (or server) with PowerShell installed, removing those duplicate items from a user’s mailbox remotely? If the answer is yes, the Remove-DuplicateItems.ps1 script may be something for you.

Requirements
Using the Remove-DuplicateItems.p1 script requires Exchange Web Services (EWS) Managed API and for OAuth authentication the Microsoft Authentication Library (MSAL) libraries. You can install these packages from NuGet, or place their DLL’s in the same folder as the script. For an example of how to install EWS.Managed.Api from NuGet, see this article; for MSAL follow the same process but with the package titled ‘Microsoft.Identity.Client’.

Also take notice that since you’ll be processing user mailboxes, you’ll need to have full mailbox access or impersonation permissions when using Basic Authentication; the latter is preferred. For details on how to configure impersonation for Exchange On-Premises or Office 365, see this blog post. Using a registered app with OAuth is always through Impersonation.

Usage
The script Remove-DuplicateItems.ps1 uses the following syntax:

Remove-DuplicateItems.ps1 [[-Identity] ] [[-Type] ] [-Retain ] [-Server ] [-Impersonation] [-DeleteMode ] [-Credentials ] [-Mode ] [-MailboxOnly] [-ArchiveOnly] [-IncludeFolders <String[]>] [-ExcludeFolders <String[]>] [-PriorityFolders <String[]>] [-NoSize] [-CleanupMode] [-NoProgressBar] [-Force] [-WhatIf] [-Confirm] [-Secret] [-CertificateThumbprint] [-CertificateFile] [-CertificatePassword] [-TenantId] [-ClientId] [-TrustAll] [-ExchangeSchema <String>] [-NoSCP]

A quick walk-through on the parameters and switches:

  • Identity is the e-mail address or name of the mailbox to process. If name is used, it is matched against cn/SAMAccountname/email address of local AD.
  • Type determines what folders are checked for duplicates. Valid options are Mail, Calendar, Contacts, Tasks, Notes or All (Default).
  • Retain determines which item to retain by comparing last modification times. Valid options are Newest (default) or Oldest.
  • Server is the name of the Client Access Server to access for Exchange Web Services. When omitted, the script will attempt to use Autodiscover.
  • When the Impersonation switch is specified, impersonation will be used for mailbox access, otherwise the current user context will be used.
  • DeleteMode specifies how to remove messages. Possible values are HardDelete (permanently deleted), SoftDelete (use dumpster, default) or MoveToDeletedItems (move to Deleted Items folder).
  • Mode determines how items are matched. Options are Quick, which uses PidTagSearchKey and is the default mode, or Full which uses a predefined set of attributes to match items, depending on the item class:
ItemClass Criteria
Contacts File As, First Name, Last Name, Company Name, Business Phone, Mobile Phone, Home Phone, Size
Distribution List FileAs, Number of Members, Size
Calendar Subject, Location, Start & End Date, Size
Task Subject, Start Date, Due Date, Status, Size
Note Contents, Color, Size
Mail Subject, Internet Message ID, DateTimeSent, DateTimeReceived, Sender, Size
Other Subject, DateTimeReceived
  • MailboxOnly specifies you only want to process the primary mailbox of specified users. You als need to use this parameter  when running against mailboxes on Exchange Server 2007.
  • ArchiveOnly specifies you only want to process personal archives of specified users.
  • IncludeFolders specifies one or more names of folder(s) to include, e.g. ‘Projects’. You can use wildcards around or at the end to include folders containing or starting with this string, e.g. ‘Projects*’ or ‘*Project*’. To match folders and subfolders, add a trailing \*, e.g. Projects\*. This will include folders named Projects and all subfolders. To match from the top of the structure, prepend using ‘\’. Matching is case-insensitive.
  • ExcludeFolders specifies one or more folder(s) to exclude. Usage of wildcards and well-known folders identical to IncludeFolders.
    Note that ExcludeFolders criteria overrule IncludeFolders when matching folders.
  • CleanupMode specifies to cleanup duplicates per folder (Folder, default), the whole mailbox (Mailbox), or multiple mailboxes (MultiMailbox, identities specified using Identity). The first unique item encountered will be retained. For Mailbox-level cleanup, PriorityFolders can be used to give priority to retaining items in specified folders before those found in other folders.
  • PriorityFolders specifies which folders have priority over other folders, identifying items in these folders first when using MailboxWide mode. Usage of wildcards and well-known folders is identical to IncludeFolders.
  • NoSize tells script to not use size to match items in Full mode.
  • NoProgressBar prevents displaying a progress bar as folders and items are being processed.
  • Report reports individual items detected as duplicate. Can be used together with WhatIf to perform pre-analysis.
  • TrustAll can be used to accept all certificates, e.g. self-signed certificates or when accessing Exchange using endpoint with a different certificate.
  • ExchangeSchema can be used to specify the Exchange schema to use when connecting to Exchange server or Exchange Online. Defaults to Exchange2013_SP1 or Exchange2016 when -Server is specified and is ‘outlook.office365.com’ (Exchange Online endpoint).
  • NoSCP to skip SCP lookups in Active Directory for Autodiscover.

For authentication, the following parameters are available:

  • Credentials specifies credentials to use for Basic Authentication.
  • TenantId specifies the identity of the Tenant (OAuth)
  • ClientId specifies the Id of the registered application (OAuth).
  • CertificateThumbprint specifies the thumbprint of the certificate from personal store to use for authentication (OAuth).
  • CertificateFile specifies the external certificate file (pfx) to use for authentication (OAuth). This certificate needs to contain a private key; the registered application needs to contain the certificate’s public key.
  • CertificatePassword optionally specifies the password to use with the certificate file (OAuth).
  • Secret specifies the secret to use with the application (OAuth).

Few notes:

  • When MoveToDeletedItems is specified, the Deleted Items folder will be skipped;
  • When Type is omitted or set to All, all folders are scanned, including folders like Conversation History, RSS Feeds, etc.;
  • When Quick mode is used and PidTagSearchKey is missing or inaccessible, search will fall back to Full mode;
  • For more info on PidTagSearchKey, see http://msdn.microsoft.com/en-us/library/cc815908.aspx. Note that PidTagSearchKey will have duplicate values for copied objects.
  • You need to specify MailboxOnly when running against mailboxes on Exchange Server 2007 as the Exchange 2010 personal archive options in EWSare not support in Exchange 2007 mode.

Well-Known Folders
For IncludeFolders, ExcludeFolders and PriorityFolders, you can also use well-known folders using this format: #WellKnownFolderName#, e.g. #Inbox#. Supported are #Calendar#, #Contacts#, #Inbox#, #Notes#, #SentItems#, #Tasks#, #JunkEmail# and #DeletedItems#. The script uses the currently configured Well-Known Folder of the mailbox to be processed.

Patterns
Here are some examples of using pattern matching in IncludeFolders, ExcludeFolders or PriorityFolders, based on the following tree structure:

+ TopFolderA
  + FolderA
    + SubFolderA
    + SubFolderB
  + FolderB
+ TopFolderB

The following filters will match folders from the above structure:

Filter Matches
FolderA \TopFolderA\FolderA, \TopFolderB\FolderA
Folder* \TopFolderA\FolderA, \TopFolderA\FolderB, \TopFolderA\FolderA\SubFolderA, \TopFolderA\FolderA\SubFolderB
FolderA\*Folder* \TopFolderA\FolderA\SubFolderA, \TopFolderA\FolderA\SubFolderB
\*FolderA\* \TopFolderA, \TopFolderA\FolderA, \TopFolderA\FolderB, \TopFolderA\FolderA\SubFolderA, \TopFolderA\FolderA\SubFolderB, \TopFolderB\FolderA
\*\FolderA \TopFolderA\FolderA, \TopFolderB\FolderA

Usage
So, suppose you want to remove  duplicate Appointments from the calendar of mailbox migtester1 using attribute matching, moving duplicate items to the DeletedItems, using Impersonation and you want to generate extra output using Verbose. In such case, you could use the following cmdlet:

Remove-DuplicateItems.ps1 -Identity migtester1 -Type Calendar -Impersonation -DeleteMode MoveToDeletedItems -Mode Full -Verbose

image

Alternative, you can use an e-mail address and specify credentials.  This allows the script to run against mailboxes in Office 365, for example:

Remove-DuplicateItems.ps1 -Identity olrik@office365tenant.com -Type Mail -DeleteMode MoveToDeletedItems -Mode Full -Credentials (Get-Credential) -Retain Oldest

A more complex example using IncludeFolders, ExcludeFolders and PriorityFolders:

$Credentials= Get-Credential
 .\Remove-DuplicateItems.ps1 -Mailbox olrik@office365tenant.com -Server outlook.office365.com -Credentials $Credentials -IncludeFolders '#Inbox#\*','\Projects\*' -ExcludeFolders 'Keep Out' -PriorityFolders '*Important*' -CleanupMode Mailbox

This will remove duplicate items from the specified mailbox in Office365, using the following options:

  • Fixed Server FQDN – bypassing AutoDiscover.
  • Limits operation against the Well-Known Inbox folder, top Projects folder, and all of their subfolders.
  • Excluding any folder named Keep Out.
  • Duplicates are checked over the whole mailbox.
  • Priority is given to folders containing the word Important, causing items in
    those folders to be kept over items in other folders when duplicates are found.

In case you want to process multiple mailboxes, you can use a CSV file which needs to contain the Identity field. An example of how the CSV could look:

Identity
francis
philip

The cmdlet could then be something like:

Import-CSV users.csv1 | Remove-DuplicateItems.ps1 ..

Download
The script is available on GitHub here.

Feedback
Feedback is welcomed through the comments. If you got scripting suggestions or questions, do not hesitate using the contact form.

Exchange 2010 SP3 Rollup 1


Exchange 2010 LogoToday the Exchange Team released Rollup 1 for Exchange Server 2010 Service Pack 3 (KB2803727). This update raises Exchange 2010 version number to 14.3.146.0.

Here’s a list of fixes contained in this Rollup:

  • 2561346 Mailbox storage limit error when a delegate uses the manager’s mailbox to send an email message in an Exchange Server 2010 environment
  • 2729954 Can’t send voice message to a selected non-primary email address in an Exchange Server 2010 environment
  • 2750846 Information Store service crashes when you mount public folder databases on an Exchange Server 2010 server
  • 2751628 Event ID 9682 does not record the folder name when you delete a public folder in an Exchange Server 2010 environment
  • 2756460 You cannot open a mailbox that is located in a different site by using Outlook Anywhere in an Exchange Server 2010 environment
  • 2763065 Move request log is logged when you move a mailbox in an Exchange Server 2010 SP2 environment
  • 2777742 Address Book service crashes on an Exchange Server 2010 Client Access server when a server has been running for 25 days or more
  • 2781488 RPC_S_SERVER_UNAVAILABLE (0x6BA) error code when you use a MAPI or CDO-based application in an Exchange Server 2010 environment
  • 2782683 Email message that a user sends by using the “Send As” or “Send On Behalf” permission is saved only in the Sent Items folder of the sender in an Exchange Server 2010 environment
  • 2784210 Ethical wall does not function as expected when the ReportToOriginatorEnabled property is disabled in an Exchange Server 2003 and Exchange Server 2010 coexistence environment
  • 2793348 Read receipt is sent unexpectedly when you view an email message by using Outlook Web App
  • 2796490 Microsoft Exchange Information Store service crashes on an Exchange Server 2010 Mailbox server
  • 2802569 Mailbox synchronization fails on an Exchange ActiveSync device in an Exchange Server 2010 environment
  • 2803132 Delayed email message delivery on a BlackBerry mobile device after you install Update Rollup 4 for Exchange Server 2010 SP2
  • 2806602 EdgeTransport.exe process crashes on an Exchange Server 2010 Hub Transport server
  • 2814723 Server loses network connectivity and UDP ports are used up on an Exchange Server 2010 server
  • 2814847 Rapid growth in transaction logs, CPU use, and memory consumption in Exchange Server 2010 when a user syncs a mailbox by using an iOS 6.1 or 6.1.1-based device
  • 2816934 Error code 0X800CCC13 when an additional POP3 or IMAP account is used to send an email message and Outlook online mode is used to connect to an Exchange Server 2010 environment
  • 2817140 Exchange Replication service crashes intermittently in an Exchange Server 2010 environment
  • 2817852 Cyrillic characters are displayed as question marks in the “To” field of message items in the Sent Items folder in an Exchange 2010 environment
  • 2818456 Attachments are missing from an embedded message in an Exchange Server 2010 SP2 environment
  • 2822208 Unable to soft delete some messages after installing Exchange 2010 SP2 RU6 or SP3
  • 2826066 VSAPI-based antivirus software causes delayed response in an Exchange Server 2010 environment
  • 2827037 Copy of an item is created in the Version subfolder in an Exchange Server 2010 environment
  • 2833888 No items are displayed in Outlook after you install Exchange Server 2010 SP3 or Update Rollup 6 for Exchange Server 2010 SP2
  • 2840099 ArgumentOutOfRangeException exception when an EWS application creates a new MIME email in an Exchange Server 2010 environment

As of Service Pack 2 Rollup 4, its no longer required to disable/re-enable ForeFront Protection for Exchange using the fscutility to be able to install the Rollup properly. However, if you want to remain in control, you can disable ForeFront before installing the Rollup using fscutility /disable and re-enable it afterwards using fscutility /enable.

If you want to speed up the update process for systems without internet access, you can follow the procedure described here to disable publisher’s certificate revocation checking.

If you got a DAG and want to properly update the DAG members, check the instructions here.

As with any Hotfix, Rollup or Service Pack, I’d recommend to thoroughly test this rollup in a test and acceptance environment first, prior to implementing it in production.

You can download Exchange 2010 SP3 Rollup 1 here.

Exchange Virtual Conference


Exchange 2010 LogoOn January 1st, 2013 The UC Architects felllow Paul Cunningham asked around to see who wanted to participate in a “virtual conference about Exchange Server, kind of a mini-MEC but purely online, 100% free, essentially a series of webinars/screencasts about Exchange Server 2010/2013”.

This Monday saw the start of the (inaugural) Exchange Virtual Conference. Paul gathered enough well-known people from the Exchange community to have a nice line-up of speakers and interesting sessions. Unfortunately, I couldn’t commit at that time nor find time after that to create a session, so I had to pass.

Here’s an overview of the sessions, which are pre-recorded and are 45-60 minutes (lunch break material):

Note: List will be updated with links to sessions as they become available.

 

Removing Messages by Message Class


powershellLast version: 2.00, February 27th, 2021

Recently, I was asked if it is possible to remove stub items. The reason was they were going to transition to a newer version of Exchange and they wouldn’t be using the archiving solution in the new environment. When required, vendor tooling would be used to search through the existing archives.

In such cases it makes sense to remove the stubs from the mailbox, which are shortcut messages that points to a copy of the original message in the archive solution. The new environment won’t contain the required Outlook plugins or extensions to retrieve the original message from the archive using the stub, making the stub lead to a partial or empty message.

To identify stubs, one can filter on an attribute of each item, MessageClass. This attribute defines which kind of item it is (in fact, determines what form Outlook should use in order to present or process the information). Examples of MessageClass definitions are IPM.Note (regular e-mail messages), IPM.Note.EnterpriseVault.Shortcut (message archived by Enterprise Vault) or IPM.ixos-archive (message archived by Opentext/IXOS LiveLink E-Mail Archive).

To identify stubs from Outlook, add the Message Class field to your Outlook view, e.g.:

StubsOutlook

When you want to remove the stubs using Outlook, you can utilize the Advanced Find function of Outlook, but that is a very labor intensive, tedious and non-centralized per-mailbox procedure:

SearchFromOutlook

Requirements
Using the script requires Exchange Web Services (EWS) Managed API and for OAuth authentication the Microsoft Authentication Library (MSAL) libraries. You can install these packages from NuGet, or place their DLL’s in the same folder as the script. For an example of how to install EWS.Managed.Api from NuGet, see this article; for MSAL follow the same process but with the package titled ‘Microsoft.Identity.Client’.

Also take notice that since you’ll be processing user mailboxes, you’ll need to have full mailbox access or impersonation permissions when using Basic Authentication; the latter is preferred. For details on how to configure impersonation for Exchange On-Premises or Office 365, see this blog post. Using a registered app with OAuth is always through Impersonation.

Usage
The script Remove-MessagesClassItems.ps1 uses the following syntax:

Remove-MessageClassItems.ps1 [-Identity]  [-MessageClass] [-Type] [-Server ] [-Impersonation] [-DeleteMode ] [-Type] [-Before ] [-MailboxOnly] [-ArchiveOnly] [-IncludeFolders] -ExcludeFolders [-NoProgressBar] [-Force] [-WhatIf] [-Confirm] [-Secret] [-CertificateThumbprint] [-CertificateFile] [-CertificatePassword] [-TenantId] [-ClientId] [-TrustAll]

A quick walk-through on the parameters and switches:

  • Identity is the name or e-mail address of the mailbox.
  • MessageClass specifies the Message Class to remove, for example IPM.Note.EnterpriseVault.Shortcut (EnterpriseVault). You can use wildcards around or at the end to include folders containing or starting with this string, e.g. ‘IPM.ixos*’ or ‘*EnterpriseVault*’. Matching is always case-insensitive.
  • Server is the name of the Client Access Server to access for Exchange Web Services. When omitted, the script will attempt to use Autodiscover.
  • Type determines what folder class to process. Options are Mail, Calendar, Contacts, Tasks, Notes or All (Default).
  • Switch Impersonation specifies if impersonation will be used for mailbox access, otherwise the current user context will be used.
  • DeleteMode specifies how to remove messages. Possible values are HardDelete (permanently deleted), SoftDelete (use dumpster, default) or MoveToDeletedItems (move to Deleted Items folder). Note that the Deleted Items folder will be processed, unless MoveToDeletedItems is used.
  • Before can be used to only remove items received before specified date.
  • MailboxOnly specifies you only want to process the primary mailbox of specified users. You als need to use this parameter  when running against mailboxes on Exchange Server 2007.
  • ArchiveOnly specifies you only want to process personal archives of specified users.
  • IncludeFolders specifies one or more names of folder(s) to include, e.g. ‘Projects’. You can use wildcards around or at the end to include folders containing or starting with this string, e.g. ‘Projects*’ or ‘*Project*’. To match folders and subfolders, add a trailing \*, e.g. Projects\*. This will include folders named Projects and all subfolders. To match from the top of the structure, prepend using ‘\’. Matching is case-insensitive.
  • ExcludeFolders specifies one or more folder(s) to exclude. Usage of wildcards and well-known folders identical to IncludeFolders.
    Note that ExcludeFolders criteria overrule IncludeFolders when matching folders.
  • NoProgressBar prevents displaying a progress bar as folders and items are being processed.
  • ReplaceClass specifies that instead of removing the item, its PR_MESSAGE_CLASS class property will be modified to this value. For example, can be used in conjunction with MessageClass to modify any IPM.Note items pending Evault archival back to regular items, using: -MessageClass IPM.Note.EnterpriseVault.PendingArchive -ReplaceClass IPM.Note
  • Report reports individual items detected as duplicate. Can be used together with WhatIf to perform pre-analysis.
  • TrustAll can be used to accept all certificates, e.g. self-signed certificates or when accessing Exchange using endpoint with a different certificate.

For authentication, the following parameters are available:

  • Credentials specifies credentials to use for Basic Authentication.
  • TenantId specifies the identity of the Tenant (OAuth)
  • ClientId specifies the Id of the registered application (OAuth).
  • CertificateThumbprint specifies the thumbprint of the certificate from personal store to use for authentication (OAuth).
  • CertificateFile specifies the external certificate file (pfx) to use for authentication (OAuth). This certificate needs to contain a private key; the registered application needs to contain the certificate’s public key.
  • CertificatePassword optionally specifies the password to use with the certificate file (OAuth).
  • Secret specifies the secret to use with the application (OAuth).

Well-Known Folders
For IncludeFolders, ExcludeFolders, you can also use well-known folders using this format: #WellKnownFolderName#, e.g. #Inbox#. Supported are #Calendar#, #Contacts#, #Inbox#, #Notes#, #SentItems#, #Tasks#, #JunkEmail# and #DeletedItems#. The script uses the currently configured Well-Known Folder of the mailbox to be processed.

Patterns
Here are some examples of using pattern matching in IncludeFolders or ExcludeFolders, based on the following tree structure:

+ TopFolderA
  + FolderA
    + SubFolderA
    + SubFolderB
  + FolderB
+ TopFolderB

The following filters will match folders from the above structure:

Filter Matches
FolderA \TopFolderA\FolderA, \TopFolderB\FolderA
Folder* \TopFolderA\FolderA, \TopFolderA\FolderB, \TopFolderA\FolderA\SubFolderA, \TopFolderA\FolderA\SubFolderB
FolderA\*Folder* \TopFolderA\FolderA\SubFolderA, \TopFolderA\FolderA\SubFolderB
\*FolderA\* \TopFolderA, \TopFolderA\FolderA, \TopFolderA\FolderB, \TopFolderA\FolderA\SubFolderA, \TopFolderA\FolderA\SubFolderB, \TopFolderB\FolderA
\*\FolderA \TopFolderA\FolderA, \TopFolderB\FolderA

Example
Suppose you want to remove  IPM.Note.EnterpriseVault.Shortcut items from the mailbox of user1 and personal archive when enabled, moving the items to the DeletedItems by Impersonation. In such case, you could use the following cmdlet:

Remove-MessageClassItems.ps1 -Identity user1 -MessageClass IPM.Note.EnterpriseVault.Shortcut -DeleteMode MoveToDeletedItems -Impersonation –Verbose 

Note: Screenshot shows Mailbox parameter, which is per 1.52 renamed to Identity

SampleOutput

Note: By default, Remove-MessageClassItems will only search IPF.Note class folders (i.e. containing mail items), so you’ll only see those being processed. If you want all folders scanned (also classless), use the ScanAllFolders switch.

The script also supports Office 365. For example, to remove all items with ‘Enterprise’ in their message class text, received before 1/1/2014, only from the primary mailbox, excluding the folder Personal, using Basic Authentication, you can use:

$Credentials= Get-Credential
Remove-MessageClassItems.ps1 -Identity olrik@office365tenant.com -DeleteMode MoveToDeletedItems -MessageClass *EnterpriseVault* -Before 1/1/2014 -MailboxOnly -ExcludeFolder Personal -Credentials $Credentials

In case you want to process multiple mailboxes, you can use a CSV file which needs to contain the Identity field. An example of how the CSV could look:

Identity
francis
philip

The cmdlet could then be something like:

Import-CSV users.csv1 | Remove-MessageClassItems.ps1 -MessageClass IPM.Note.EnterpriseVault.Shortcut -DeleteMode HardDelete 

Feedback
You’re feedback is welcomed through the comments; if you got scripting suggestions, please use the contact form.

Download
You can download the script from the GitHub here.

MEC 2014 announced!


A quick heads-up for the ones which didn’t receive the notification via e-mail or social media: Microsoft announced the next Microsoft Exchange Conference is going to be held in April 2014:

savethedate[1]So block your agendas and plan those budgets or submit requests to management for 2014 early!