Latest version: 2.41, April 18th, 2023
For those involved with Exchange migration projects or managing Exchange environments, at some point you have probably experienced the situation where people ended up with duplicate items in their mailbox. Duplicate items can be caused by many things, but the most common causes are:
- Synchronization tools or plug-ins. Entries from the mailbox are treated as new entries and, as a consequence, are added to the mailbox when synchronizing information back to the mailbox, creating duplicates. In the past, I’ve seen this happen with Nokia PC Suite and Google Apps Sync, for example;
- Importing existing data. Accidental import from – for example – a PST file to a mailbox can lead to duplicate entries.
When looking for a solution, you’ll probably encounter MSKB299349, “How to remove duplicate imported items in Outlook”. This article describes a manual procedure to remove duplicate entries from your calendar, contacts, inbox or other folders. Not very helpful, and labor intensive.
When continuing your search, you’ll find lots (I mean lots!) of tools and Outlook add-ins, like Vaita’s DIR or MAPILab’s Duplicate Remover. Not all this software is free (some even require payment per duplicate removal of appointments, contacts or e-mail) and some might not even work (MAPI-based tools may not work against Exchange 2013).
When you finally have selected a tool, in most cases they require installation of a piece of software and someone to perform the removal process using the tool or Outlook with add-in. When you’re an Apple shop you’ll require different tools, unless you’re running a Windows desktop somewhere (I’ll just pretend I didn’t hear you saying ‘Why don’t you install the tool on the Exchange server’).
Wouldn’t it be nice if you’d have a PowerShell script you can conveniently run from any workstation (or server) with PowerShell installed, removing those duplicate items from a user’s mailbox remotely? If the answer is yes, the Remove-DuplicateItems.ps1 script may be something for you.
Requirements
Using the Remove-DuplicateItems.ps1 script requires the Exchange Web Services (EWS) Managed API and, for OAuth authentication, the Microsoft Authentication Library (MSAL). You can install these packages from NuGet, or place their DLLs in the same folder as the script. For an example of how to install EWS.Managed.Api from NuGet, see this article; for MSAL follow the same process but with the package titled ‘Microsoft.Identity.Client’.
Also take notice that since you’ll be processing user mailboxes, you’ll need to have full mailbox access or impersonation permissions when using Basic Authentication; the latter is preferred. For details on how to configure impersonation for Exchange On-Premises or Office 365, see this blog post. Using a registered app with OAuth is always through Impersonation.
Usage
The script Remove-DuplicateItems.ps1 uses the following syntax:
Remove-DuplicateItems.ps1 [[-Identity] ] [[-Type] ] [-Retain ] [-Server ] [-Impersonation] [-DeleteMode ] [-Credentials ] [-Mode ] [-MailboxOnly] [-ArchiveOnly] [-IncludeFolders <String[]>] [-ExcludeFolders <String[]>] [-PriorityFolders <String[]>] [-NoSize] [-CleanupMode] [-NoProgressBar] [-Force] [-WhatIf] [-Confirm] [-Secret] [-CertificateThumbprint] [-CertificateFile] [-CertificatePassword] [-TenantId] [-ClientId] [-TrustAll] [-ExchangeSchema <String>] [-NoSCP]
A quick walk-through on the parameters and switches:
- Identity is the e-mail address or name of the mailbox to process. If name is used, it is matched against cn/SAMAccountname/email address of local AD.
- Type determines what folders are checked for duplicates. Valid options are Mail, Calendar, Contacts, Tasks, Notes or All (Default).
- Retain determines which item to retain by comparing last modification times. Valid options are Newest (default) or Oldest.
- Server is the name of the Client Access Server to access for Exchange Web Services. When omitted, the script will attempt to use Autodiscover.
- When the Impersonation switch is specified, impersonation will be used for mailbox access, otherwise the current user context will be used.
- DeleteMode specifies how to remove messages. Possible values are HardDelete (permanently deleted), SoftDelete (use dumpster, default) or MoveToDeletedItems (move to Deleted Items folder).
- Mode determines how items are matched. Options are Quick, which uses PidTagSearchKey and is the default mode, or Full which uses a predefined set of attributes to match items, depending on the item class:
| ItemClass | Criteria |
| Contacts | File As, First Name, Last Name, Company Name, Business Phone, Mobile Phone, Home Phone, Size |
| Distribution List | FileAs, Number of Members, Size |
| Calendar | Subject, Location, Start & End Date, Size |
| Task | Subject, Start Date, Due Date, Status, Size |
| Note | Contents, Color, Size |
| Mail | Subject, Internet Message ID, DateTimeSent, DateTimeReceived, Sender, Size |
| Other | Subject, DateTimeReceived |
- MailboxOnly specifies you only want to process the primary mailbox of specified users. You also need to use this parameter when running against mailboxes on Exchange Server 2007.
- ArchiveOnly specifies you only want to process personal archives of specified users.
- IncludeFolders specifies one or more names of folder(s) to include, e.g. ‘Projects’. You can use wildcards around or at the end to include folders containing or starting with this string, e.g. ‘Projects*’ or ‘*Project*’. To match folders and subfolders, add a trailing \*, e.g. Projects\*. This will include folders named Projects and all subfolders. To match from the top of the structure, prepend using ‘\’. Matching is case-insensitive.
- ExcludeFolders specifies one or more folder(s) to exclude. Usage of wildcards and well-known folders is identical to IncludeFolders.
Note that ExcludeFolders criteria overrule IncludeFolders when matching folders.
- CleanupMode specifies whether to clean up duplicates per folder (Folder, default), across the whole mailbox (Mailbox), or across multiple mailboxes (MultiMailbox, identities specified using Identity). The first unique item encountered will be retained. For Mailbox-level cleanup, PriorityFolders can be used to give priority to retaining items in specified folders before those found in other folders.
- PriorityFolders specifies which folders have priority over other folders, identifying items in these folders first when using Mailbox-level cleanup. Usage of wildcards and well-known folders is identical to IncludeFolders.
- NoSize tells the script not to use size to match items in Full mode.
- NoProgressBar prevents displaying a progress bar as folders and items are being processed.
- Report reports individual items detected as duplicate. Can be used together with WhatIf to perform pre-analysis.
- TrustAll can be used to accept all certificates, e.g. self-signed certificates or when accessing Exchange using endpoint with a different certificate.
- ExchangeSchema can be used to specify the Exchange schema to use when connecting to Exchange server or Exchange Online. Defaults to Exchange2013_SP1, or to Exchange2016 when -Server is set to ‘outlook.office365.com’ (the Exchange Online endpoint).
- NoSCP skips SCP lookups in Active Directory for Autodiscover.
For authentication, the following parameters are available:
- Credentials specifies credentials to use for Basic Authentication.
- TenantId specifies the identity of the Tenant (OAuth).
- ClientId specifies the Id of the registered application (OAuth).
- CertificateThumbprint specifies the thumbprint of the certificate from personal store to use for authentication (OAuth).
- CertificateFile specifies the external certificate file (pfx) to use for authentication (OAuth). This certificate needs to contain a private key; the registered application needs to contain the certificate’s public key.
- CertificatePassword optionally specifies the password to use with the certificate file (OAuth).
- Secret specifies the secret to use with the application (OAuth).
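As a rough sketch of how the OAuth parameters fit together, the following call authenticates with a client secret and performs a pre-analysis only. The tenant ID, application ID and mailbox address are placeholders, and the example assumes the registered app already has the required permissions.

```powershell
# Placeholders: replace the tenant ID, application (client) ID and mailbox with your own.
$TenantId = '<tenant id>'
$ClientId = '<application id>'

# Prompt for the client secret; use the secret's Value, not its Secret ID.
$Secret = Read-Host 'Secret' -AsSecureString

# Pre-analysis run: -Report lists detected duplicates, -WhatIf prevents actual removal.
.\Remove-DuplicateItems.ps1 -Identity 'user@domain.com' -Server outlook.office365.com -TenantId $TenantId -ClientId $ClientId -Secret $Secret -Report -WhatIf -Verbose
```

Once the reported items look right, the same call without Report and WhatIf would perform the actual cleanup.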
A few notes:
- When MoveToDeletedItems is specified, the Deleted Items folder will be skipped;
- When Type is omitted or set to All, all folders are scanned, including folders like Conversation History, RSS Feeds, etc.;
- When Quick mode is used and PidTagSearchKey is missing or inaccessible, search will fall back to Full mode;
- For more info on PidTagSearchKey, see http://msdn.microsoft.com/en-us/library/cc815908.aspx. Note that PidTagSearchKey will have duplicate values for copied objects.
- You need to specify MailboxOnly when running against mailboxes on Exchange Server 2007, as the Exchange 2010 personal archive options in EWS are not supported in Exchange 2007 mode.
Well-Known Folders
For IncludeFolders, ExcludeFolders and PriorityFolders, you can also use well-known folders using this format: #WellKnownFolderName#, e.g. #Inbox#. Supported are #Calendar#, #Contacts#, #Inbox#, #Notes#, #SentItems#, #Tasks#, #JunkEmail# and #DeletedItems#. The script uses the currently configured Well-Known Folder of the mailbox to be processed.
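To illustrate the notation, here is a minimal sketch that limits a pre-analysis run to two well-known folders. The mailbox name migtester1 is only an example, and the use of impersonation is an assumption about the environment.

```powershell
# Only look at the Inbox and Sent Items folders, whatever they are named in this mailbox.
# -Report lists detected duplicates; -WhatIf prevents actual removal.
.\Remove-DuplicateItems.ps1 -Identity migtester1 -Impersonation -IncludeFolders '#Inbox#', '#SentItems#' -Report -WhatIf
```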
Patterns
Here are some examples of using pattern matching in IncludeFolders, ExcludeFolders or PriorityFolders, based on the following tree structure:
+ TopFolderA
+ FolderA
+ SubFolderA
+ SubFolderB
+ FolderB
+ TopFolderB
The following filters will match folders from the above structure:
| Filter | Matches |
| FolderA | \TopFolderA\FolderA, \TopFolderB\FolderA |
| Folder* | \TopFolderA\FolderA, \TopFolderA\FolderB, \TopFolderA\FolderA\SubFolderA, \TopFolderA\FolderA\SubFolderB |
| FolderA\*Folder* | \TopFolderA\FolderA\SubFolderA, \TopFolderA\FolderA\SubFolderB |
| \*FolderA\* | \TopFolderA, \TopFolderA\FolderA, \TopFolderA\FolderB, \TopFolderA\FolderA\SubFolderA, \TopFolderA\FolderA\SubFolderB, \TopFolderB\FolderA |
| \*\FolderA | \TopFolderA\FolderA, \TopFolderB\FolderA |
Usage
So, suppose you want to remove duplicate appointments from the calendar of mailbox migtester1 using attribute matching, moving duplicate items to Deleted Items, using Impersonation, and you want to generate extra output using Verbose. In that case, you could use the following cmdlet:
Remove-DuplicateItems.ps1 -Identity migtester1 -Type Calendar -Impersonation -DeleteMode MoveToDeletedItems -Mode Full -Verbose
Alternatively, you can use an e-mail address and specify credentials. This allows the script to run against mailboxes in Office 365, for example:
Remove-DuplicateItems.ps1 -Identity olrik@office365tenant.com -Type Mail -DeleteMode MoveToDeletedItems -Mode Full -Credentials (Get-Credential) -Retain Oldest
A more complex example using IncludeFolders, ExcludeFolders and PriorityFolders:
$Credentials= Get-Credential
.\Remove-DuplicateItems.ps1 -Mailbox olrik@office365tenant.com -Server outlook.office365.com -Credentials $Credentials -IncludeFolders '#Inbox#\*','\Projects\*' -ExcludeFolders 'Keep Out' -PriorityFolders '*Important*' -CleanupMode Mailbox
This will remove duplicate items from the specified mailbox in Office365, using the following options:
- Fixed Server FQDN – bypassing AutoDiscover.
- Limits the operation to the Well-Known Inbox folder, the top Projects folder, and all of their subfolders.
- Excluding any folder named Keep Out.
- Duplicates are checked over the whole mailbox.
- Priority is given to folders containing the word Important, causing items in those folders to be kept over items in other folders when duplicates are found.
In case you want to process multiple mailboxes, you can use a CSV file which needs to contain the Identity field. An example of how the CSV could look:
Identity
francis
philip
The cmdlet could then be something like:
Import-CSV users.csv | Remove-DuplicateItems.ps1 ..
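The CSV approach can be sketched as follows. This example only builds and reads the file, using a temporary path chosen for illustration; the final pipeline is left as a comment because its parameters depend on your environment, and it presumably relies on the Identity column binding to the Identity parameter.

```powershell
# Build a small CSV with the required Identity column (the names are examples).
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'users.csv'
@('Identity', 'francis', 'philip') | Set-Content -Path $csvPath

# Each row becomes an object with an Identity property.
$users = Import-Csv -Path $csvPath
$users | ForEach-Object { $_.Identity }

# Assumed follow-up: run a pre-analysis for every mailbox in the file, e.g.
# $users | .\Remove-DuplicateItems.ps1 -Type Mail -Impersonation -Report -WhatIf
```

Checking the imported rows before piping them to the script makes it easier to spot a missing header or an empty line in the file.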
Download
The script is available on GitHub here.
Feedback
Feedback is welcomed through the comments. If you have scripting suggestions or questions, do not hesitate to use the contact form.
I am having this issue, please help:
PS C:\Temp> .\Remove-DuplicateItems.ps1 -report -whatif -Mailbox “user@domain.com” -Server outlook.office365.com -Verbose -Type Mail -TenantId 709c9565-631d-451b-9a36-d6fac20c3884 -ClientId 2ec3541e-3619-4e76-8bc0-6fce34e3a419 -CertificateThumbprint 807CD32CD1665C550FE42DABF1435489843DA8C8
VERBOSE: Module Microsoft.Exchange.WebServices v2.2.1.0 already loaded
VERBOSE: Module Microsoft.Identity.Client v4.58.0.0 already loaded
VERBOSE: Will use certificate 807CD32CD1665C550FE42DABF1435489843DA8C8, issued by CN=EXOv3 and expiring 11/29/2024 5:00:00 PM
VERBOSE: Authentication token acquired
VERBOSE: Cleanup Mode: Folder
Processing mailbox mike.rogers@allspan.com (mike.rogers@allspan.com)
VERBOSE: Using user@domain.com for impersonation
VERBOSE: Using Exchange Web Services URL https://outlook.office365.com/EWS/Exchange.asmx
VERBOSE: Constructing folder matching rules
WARNING: Cannot bind to MsgFolderRoot: Exception calling “Bind” with “2” argument(s): “The request failed. The remote server returned an error: (403) Forbidden.”
WARNING: Cannot bind to ArchiveMsgFolderRoot: Exception calling “Bind” with “2” argument(s): “The request failed. The remote server returned an error: (403) Forbidden.”
VERBOSE: Processing user@domain.com finished
Is the registered app lacking API permissions? (It needs Exchange.ManageAsApp.)
Are Authentication Policies or Conditional Access blocking access? Have a look at the Azure sign-in logs.
or this
PS C:\temp> .\Remove-DuplicateItems.ps1 -report -whatif -Mailbox “user@domain.com” -Server outlook.office365.com -Verbose -Type Mail -TenantId 709c9565-631d-451b-9a36-d6fac20c3884 -ClientId 2ec3541e-3619-4e76-8bc0-6fce34e3a419 -Secret $secret
VERBOSE: Module Microsoft.Exchange.WebServices v2.2.1.0 already loaded
VERBOSE: Module Microsoft.Identity.Client v4.58.0.0 already loaded
VERBOSE: Will use provided secret to authenticate
VERBOSE: Authentication token acquired
VERBOSE: Cleanup Mode: Folder
Processing mailbox user@domain.com (user@domain.com)
VERBOSE: Using user@domain.com for impersonation
VERBOSE: Using Exchange Web Services URL https://outlook.office365.com/EWS/Exchange.asmx
VERBOSE: Constructing folder matching rules
WARNING: Cannot bind to MsgFolderRoot: Exception calling “Bind” with “2” argument(s): “Credentials are required to make a service request.”
WARNING: Cannot bind to ArchiveMsgFolderRoot: Exception calling “Bind” with “2” argument(s): “Credentials are required to make a service request.”
VERBOSE: Processing user@domain.com finished
Hi !
I’m having this issue :
But suggested workaround didn’t work for me 😞
What are Microsoft.Exchange.WebServices and Microsoft.Identity.Client required/tested versions ?
Many thanks!
Hey Michel.
Looks like a great thing, but I’m running into problems using the script to remove duplicates from EXO mailboxes.
After getting 401 errors and enabling ApplicationImpersonation, I’m still getting the 401 error.
Finally, I created an app to use OAuth, but all I’m getting is the following message (some content replaced with X 🙂):
VERBOSE: Module Microsoft.Exchange.WebServices v2.2.1.0 already loaded
VERBOSE: Suppressed Warning Unknown category for ‘NuGet’::’GetDynamicOptions’: ‘Provider’
VERBOSE: Suppressed Warning Unknown category for ‘NuGet’::’GetDynamicOptions’: ‘Provider’
VERBOSE: Module Microsoft.Identity.Client v4.53.0.0 already loaded
VERBOSE: Will use provided secret to authenticate
VERBOSE: Authentication token acquired
VERBOSE: Cleanup Mode: Folder
Processing mailbox m.XXXXXX@XXXXXX-technologie.de (m.XXXXXX@XXXXXX-technologie.de)
VERBOSE: Using m.XXXXXX@XXXXXX-technologie.de for impersonation
VERBOSE: Using Exchange Web Services URL https://outlook.office365.com/EWS/Exchange.asmx
VERBOSE: Constructing folder matching rules
WARNING: Cannot bind to MsgFolderRoot: Exception calling “Bind” with “2” argument(s): “Credentials are required to make a service request.”
WARNING: Cannot bind to ArchiveMsgFolderRoot: Exception calling “Bind” with “2” argument(s): “Credentials are required to make a service request.”
VERBOSE: Processing m.XXXXXX@XXXXXX-technologie.de finished
Called the script with this command:
$Secret= Read-Host ‘Secret’ -AsSecureString
.\Remove-DuplicateItems.ps1 -Server outlook.office365.com -Identity m.XXXXXX@XXXXXX-technologie.de -TenantId ‘XXXXXX-ddb4-4497-9e0b-7a7c87d1d586’ -ClientId ‘XXXXXX-e8e1-4bd2-8243-31d843c2c973’ -Secret $secret -verbose
Hi, I have a problem. When I call the script like this:
Remove-DuplicateItems.ps1 -Identity xxxxx -impersonation -Type Mail -DeleteMode MoveToDeletedItems -Mode Full -Verbose
the script asks me for: UseDefaultCredentials
Same here. Doesn’t matter how I try to launch the script. Same exact thing pops up every time and it wont accept any answer.
Hello, I’m running into what I believe is a rate limiting issue in M365. When running it on a large mailbox in our organization (115 GB split between Inbox and In-Place Archive), it processes numerous folders before eventually spitting out the following error, which continues for the remaining folders:
VERBOSE: Cleaning unique list (Finished Folder)
WARNING: Error performing operation FindItems without Search options in FOLDERNAME. Error: Exception calling “FindItems” with “1” argument(s): “The request failed. The remote server returned an error: (401) Unauthorized.”
The end report shows this:
3544 items processed and 69 removed in 00:22:53 – average 155 items/min
VERBOSE: Processing EMAILADDRESS finished
If I take one of those folders throwing a 401 error and use it in -includefolders, it correctly finds and processes both the inbox and In-Place archive version of those folders.
I assume this is some sort of rate limit/timeout. I increased $script:SleepTimerMin to 10000 (10 seconds) and decreased the $MaxFolderBatchSize to 10, but this didn’t seem to affect when this triggered for the mailbox I’m testing on. This is also the case whether whatif is $True or $False
Do you have any suggestions on how to alleviate this error? I’m hoping to use this script as hands-off as possible as we’re restoring email from an external archiving system and it’s creating duplicates for anything that’s in the In-Place Archive (though can only restore to the mailbox, hence why I need to search both, as the M365 ManagedFolderAssistant does not operate predictably or reliably when started manually)
Still have to incorporate code for token-refreshing to prevent long running jobs from not re-authenticating when needed.
Great script. This seems to be an issue in our organization as well: a 140 GB+ online archive eventually times out while the script is running. If token refreshing is the issue, I wonder if it would be easier to keep track of progress and just continue where it left off when the script is called again?
I have a new tenant that requires Powershell 7 and the newer Exchange Online Management module.
I created an Entra app with the full permissions granted, and tenant id/client id / secret .
I tried following this post for o365 specific instructions:
https://www.rootmanager.com/tech-notes/using-remove-duplicateitems-script-with-microsoft-office-365.html
I ran into the following issues.
– Line 1560 of the code errors out with :
Problem loading module Microsoft.Identity.Client: Assembly with same name is already loaded
Commenting out that line allows the script to continue to run, but I then get credential errors:
Cannot bind to MsgFolderRoot: Exception calling “Bind” with “2” argument(s): “Credentials are required to make a service request.”
I’m sure commenting out line 1560 is NOT the right thing to do, and likely the cause of my subsequent credential error, but I am unsure on how to proceed.
Thanks in advance.
For your second error message, be sure that the -ClientId you provide to the script is the Application/client ID of the App you registered, not the Application/client ID of the tenant
I am setting:
-tenantID switch is the tenant, and the -clientid is the application id from the registered created app.
Hi, I have a problem when I try to run, maybe it’s for the language? I’m trying to remove duplicates from an office365 mailbox:
Run only scripts that you trust. While scripts from the Internet can be useful, this script can potentially
harm your computer. If you trust this script, use the Unblock-File cmdlet to allow the script to run without this
warning message. Do you want to run C:\tmp\Remove-DuplicateItems.ps1?
[N] Do not run [Z] Run once [U] Suspend [?] Help (default is “N”): z
At C:\tmp\Remove-DuplicateItems.ps1:41 char:60
~Unexpected token ‘:”en”‘ in expression or statement.
At C:\tmp\Remove-DuplicateItems.ps1:41 char:65
Missing argument in parameter list.
At C:\tmp\Remove-DuplicateItems.ps1:41 char:542
The ‘<‘ operator is reserved for future use.
At C:\tmp\Remove-DuplicateItems.ps1:229 char:84
Unexpected token ‘:’ in expression or statement.
At C:\tmp\Remove-DuplicateItems.ps1:229 char:88
Don’t know what you downloaded, but it doesn’t look like a PowerShell script.
Fetch it directly from GitHub via https://raw.githubusercontent.com/michelderooij/Remove-DuplicateItems/master/Remove-DuplicateItems.ps1
Sorry, you’re totally right. I clicked the file on GitHub and used “download link”, which is not the correct way. Now I have the same problem as others:
Problem loading module Microsoft.Exchange.WebServices: Could not load file or assembly ‘file:///C:\tmp\Microsoft.Exchange.WebServices.dll’ or one of its dependencies. The module was expected to contain an assembly manifest.
I unlocked it, but let me try some other things, many thanks.
OK, fixed by doing two things: downloading the DLL the proper way (not right-click and download link, but click on the file and then click on the download button), and running PowerShell with administrator rights.
I configured my tenant following this guide:
https://www.rootmanager.com/tech-notes/using-remove-duplicateitems-script-with-microsoft-office-365.html
Now the script runs OK, but says: Credentials are required to make a service request.
and I run the script as follows:
$secret = ConvertTo-SecureString -String “xxxxxxxx” -AsPlainText -Force
.\Remove-DuplicateItems.ps1 -Identity user@domain.com -Server outlook.office365.com -TenantId xxxxxx -ClientId xxxxxxxxx -Secret $secret -Report -Verbose -WhatIf
Now working 100%. The problem, as another user said, is that the secret is not the Secret ID; the correct value is in the Value column of the secret. Many thanks to all!
Throw( ‘Problem initializing Exchange Web Services using sche …
|
~~~~~~~~~~~~~| Problem initializing Exchange Web Services using schema Exchange2016 and TimeZone India Standard Time.
I’ve been waiting for the last 3 weeks; could anyone help me?
Odd, [System.TimeZoneInfo]::FindSystemTimeZoneById( ‘India Standard Time’) is OK.
Are you specifying -Server outlook.office365.com ?
I’m still seeing quite a lot of duplicates. Sometimes the message sizes aren’t the same. Other times one e-mail is flagged or replied to, and the other are not, but they’re definitely duplicates that may have been threaded together.
Other times, there are two exact same size, subject, message ID, but those don’t get picked up either. Any suggestions? I tried mode, nosize, etc to no avail.
The -Mode Full should match attributes; however, not all attributes are visible. Therefore you might think items are duplicates while technically they are not. You could edit the attributes considered to your liking. There is already a Body-only comparison mode that I got some requests for in the past. However, it’s still a dangerous operation, so I recommend you edit to your liking and use the Report switch first to see what it would do.
Hi,
I’m trying to run your script on several mailboxes with a csv file but it doesn’t work, I get a “Specified mailbox not found” error.
Here’s the command line I use:
Import-CSV users.csv | .\Remove-DuplicateItems.ps1 -Server outlook.office365.com -Type All -Retain Newest -Mode Full -Impersonation -IncludeFolders “*” -TenantId ‘***’ -ClientId ‘***’ -Secret $Secret -TrustAll -Verbose
My csv file looks like this:
Identity
blablabla
blablabla2.blablabla
blablabla3.blablabla
blablabla4
Can you please help me ?
Those “blabla” are proper e-mail addresses?
Hello, first off, great script, thanks for it.
Should this script be able to remove duplicates found in mailbox vs archive mailbox?
If so what switch is needed to make that work?
CleanupMode MailboxArchive should do the trick; note that mailbox is processed before archive, so any item in the mailbox is retained over duplicates in the archive. Hope this helps.
Hello, this script seemed to be the perfect solution for me. Thank you for your great work. I have one tiny issue with unlimited archives, which would be perfect if there is a quick solution.
When using unlimited archives, the original folder gets a suffix like inbox_2019 when the archive expands. The script doesn’t mention these folders, not even in verbose mode, so duplicates cannot be removed. An example of these folders is documented by Microsoft: Learn about auto-expanding archiving | Microsoft Learn
I have used the parameter -CleanupMode Mailbox and of course -ArchiveOnly. Any idea would be great?
Hello, I have a customer with over 700k emails in his mailbox and I believe there are many duplicates. I have tried running this script from Azure powershell unsuccessfully even though I copied the dlls and other files into the temporary storage on azure. I also tried it locally. What are the exact components on a Windows 11 powershell 5.1 PC I need to install to successfully run this script. -Thanks
Those are the parts, as well as creating an app in Entra with proper permissions, and cert/secret for OAuth authentication.
First I want to say thank you so much for this incredible script. It has been such a lifesaver while trying to salvage a botched MigrationWiz project. I’ve re-imported and deduplicated hundreds of GBs worth of data now, but I’m running into a bit of a strange problem. One of the larger mailboxes in particular the script has stopped actually deleting the duplicates it finds.
We’ve already removed hundreds of thousands of emails from this mailbox, but it seems like it’s hit a point where it finds around 180k duplicates, works through deleting around 30k of them before the access token expires, but the item count in that folder never goes down. We run the script again, it finds the same 180k duplicates, makes it through roughly 30k deletes, but again nothing is actually removed from the mailbox. The delete mode flag is set to hard delete. No errors are reported by the script (that we can see at least). Several runs over several days have all produced the same result.
Any ideas what could be happening or how to troubleshoot it?
The delete operation is asynchronous, so it might take a while before you see items getting removed (from view in Outlook, for example).
That said, removal is slow (throttled) and if you run it again that delete might still be in progress, in which case the folder is still locked, leading to other issues. I still have a work item to have a look at this as well as add the token refresh to prevent exiting because of this expiration.
First have to say this is a awesome solution, however ever since Microsoft required OAuth for their o365 stuff it has become a pain in the rear.
Still getting bind errors against O365.
Running Powershell 5.1
Example Session Error:
.\Remove-DuplicateItems.ps1 -Identity demo1@testxxxxx.com -Server outlook.office365.com -TenantId ‘<id>’ -ClientId ‘<id>’ -Secret $secret -Report -Verbose -WhatIf
VERBOSE: Module Microsoft.Exchange.WebServices v2.2.1.0 already loaded
VERBOSE: Module Microsoft.Identity.Client v4.25.0.0 already loaded
VERBOSE: Will use provided secret to authenticate
VERBOSE: Authentication token acquired
VERBOSE: Cleanup Mode: Folder
Processing mailbox demo1@testxxxxx.com (demo1@testxxxxx.com)
VERBOSE: Using demo1@testxxxxx.com for impersonation
VERBOSE: Using Exchange Web Services URL https://outlook.office365.com/EWS/Exchange.asmx
VERBOSE: Constructing folder matching rules
WARNING: Cannot bind to MsgFolderRoot: Exception calling “Bind” with “2” argument(s): “Credentials are required to make a service request.”
WARNING: Cannot bind to ArchiveMsgFolderRoot: Exception calling “Bind” with “2” argument(s): “Credentials are required to make a service request.”
VERBOSE: Processing demo1@testxxxxx.com finished
Reading the comments and responses above, the binding error may have something to do with the Managed API. I followed your guide to use your DLL, and when trying to install the other PowerShell Exchange.WebServices.Managed.Api package from the NuGet link, PowerShell can’t find it.
PS C:\> Register-PackageSource -provider NuGet -name nugetRepository -location https://www.nuget.org/api/v2
Name ProviderName IsTrusted Location
---- ------------ --------- --------
nugetRepository NuGet False https://www.nuget.org/api/v2
PS C:\> Install-Package Exchange.WebServices.Managed.Api
Install-Package : No match was found for the specified search criteria and package name ‘Exchange.WebServices.Managed.Api’. Try Get-PackageSource to see all available registered package sources.
At line:1 char:1
Install-Package Exchange.WebServices.Managed.Api
~~~~~~~~~~~~
Thus not sure what else it could be.
The token is acquired, so I’d suspect access blockage (eg conditional access), lacking permissions (EWS, thus app permission Exchange.ManageAsApp) or authentication (no policy specified blocking EWS access?). I’d also look in the sign-in logs to look for possible hints.
Hello, I have Exchange 2019 on-premises. I’m using this script, but I get this a lot:
WARNING: EWS operation failed, server busy – will retry later
WARNING: Previous EWS operation failed, adjusted sleep timer to 51100ms
Is there a way to optimize this?
Thanks, great script!!
For on-premises, it depends. You need to remove (or raise the limits of) the throttling policy, e.g.
New-ThrottlingPolicy Unthrottled
Set-ThrottlingPolicy Unthrottled -RCAMaxConcurrency $null -EWSMaxConcurrency $null -EWSMaxSubscriptions $null -CPAMaxConcurrency $null -EwsCutoffBalance $null -EwsMaxBurst $null -EwsRechargeRate $null
Set-ThrottlingPolicyAssociation -Identity <mailbox> -ThrottlingPolicy Unthrottled
Note that throttling is there to prevent operations like this from taking down a server or affecting service to (other) users. Setting EWSMaxConcurrency to a higher number (default is 27) is therefore preferred over setting it to unlimited ($null).
Hi, and thanks for a script that can save us working with IT endless time on manual work. I have tested the script and it finds quite a lot of duplicates, but not all. I run it like this:
$secret = ConvertTo-SecureString “xxxxxxxxxxxxxxxxxxxx” -AsPlainText -Force
.Remove-DuplicateItems.ps1 -Server outlook.office365.com -identity nn@domain.no -Type Mail -DeleteMode MoveToDeletedItems -verbose -Retain Oldest -Mode Full -NoSize -TenantId xxxxxxxxxxxxxxxxxxxxxxx -ClientId xxxxxxxxxxxxxxxxxxxxxxx -Secret $Secret
Now, our problem right now is that we had a full restore from a backupsolution to some users with existing emails in some folders. The restored items seems to have removed the InternetMessageId, so these are not considered a duplicate. I have tried to comment out the line in the script where this attribute is checked (I think):
switch ($Item.ItemClass) {
‘IPM.Note’ {
# if ($Item.InternetMessageId) { [void]$keyElem.Add( $Item.InternetMessageId)}
But that doesn’t seem to do the trick either. But my knowledge of PS scripting is quite limited as well :)
I also tried the -Type Body but then I get the error message:
WARNING: Error performing operation FindItems without Search options in Innboks. Error: Exception calling “FindItems” with “1” argument(s): “The property Body can’t be used in FindItem requests.”
VERBOSE: Cleaning unique list (Finished Folder)
As mentioned above, with Full mode it works.
When using a third-party tool in Outlook itself that only looks at sender, receiver, subject and text (body), it finds the duplicates, but it would be lovely to be able to do this “server side” like your script.
Any thoughts on how to move forward here?
I read that you are planning on “working on version where you can customize the fields used for matching”. That would be awesome:)
Commenting out that line should remove it from the equation when using attribute-level matching. Any chance the restore also modified some other attributes (which could, for example, change the item size and thus the match on that value as well)?
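To illustrate what attribute-level matching means in general, here is a minimal sketch that builds a key from a chosen set of properties and groups items by it. It is not the script's actual code: Get-DuplicateKey, the property list and the sample objects are hypothetical, and plain objects stand in for EWS items.

```powershell
# Hypothetical helper, not part of Remove-DuplicateItems.ps1.
# Builds a hash over the selected properties; items with the same hash
# are treated as duplicates of each other.
function Get-DuplicateKey {
    param(
        [Parameter(Mandatory)] $Item,
        [string[]] $Property = @('Subject', 'Sender', 'DateTimeSent')
    )
    # Missing or empty properties become empty strings.
    $parts = foreach ($name in $Property) { [string]$Item.$name }
    $bytes = [System.Text.Encoding]::UTF8.GetBytes($parts -join '|')
    $sha = [System.Security.Cryptography.SHA256]::Create()
    try { [System.BitConverter]::ToString($sha.ComputeHash($bytes)) -replace '-' }
    finally { $sha.Dispose() }
}

# Sample data: the second item mimics a restored copy without InternetMessageId.
$items = @(
    [pscustomobject]@{ Subject = 'Report'; Sender = 'a@contoso.com'; DateTimeSent = '2023-04-18T09:00:00Z'; InternetMessageId = '<1@contoso.com>' }
    [pscustomobject]@{ Subject = 'Report'; Sender = 'a@contoso.com'; DateTimeSent = '2023-04-18T09:00:00Z'; InternetMessageId = $null }
    [pscustomobject]@{ Subject = 'Other';  Sender = 'b@contoso.com'; DateTimeSent = '2023-04-18T10:00:00Z'; InternetMessageId = '<2@contoso.com>' }
)

# Groups with more than one member are duplicate sets.
$duplicates = $items | Group-Object { Get-DuplicateKey -Item $_ } | Where-Object Count -gt 1
```

Because InternetMessageId is not part of the default property list here, the restored copy lands in the same group as the original; adding it to -Property would split them again.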
Thank you for your reply! I have not seen any other modifications on the duplicates from the restore, and I use the -NoSize switch to get away with the potential difference in size, as I understand is the meaning of that switch.
The strange thing is that the mode Body doesn’t seem to work either? At least there is an error:
WARNING: Error performing operation FindItems without Search options in Innboks. Error: Exception calling “FindItems” with “1” argument(s): “The property Body can’t be used in FindItem requests.”
VERBOSE: Cleaning unique list (Finished Folder)
I understand that it might be a somewhat aggressive approach to finding duplicates, but I only use this script in a testing environment for now.
Hi Erik, Should be fixed in 2.45 now – see https://github.com/michelderooij/Remove-DuplicateItems/issues/10
Trying to run against O365 and keep getting this error:
Join-Path : Cannot bind argument to parameter ‘Path’ because it is an empty string.
At C:\remove-DuplicateItems.ps1:842 char:60
$mysecure = Read-Host -AsSecureString
c:\remove-DuplicateItems.ps1 -Identity kdavissmall@bvapllc.com -Type Mail -Server outlook.office365.com -TenantId 'xxxx' -ClientId 'xxxx' -Secret $mysecure
Any advice?
The “install path” of either Microsoft.Exchange.WebServices or Microsoft.Identity.Client package is empty, hence the error – which shouldn’t happen, but here we are.
Could you run with “-Verbose” and post the output showing which module it is, as well as how you installed it?
Dear Michel,
Thank you for the -verbose suggestion, it worked.
But after that I got below error message:
Autodiscover failed: Exception calling “AutodiscoverUrl” with “2” argument(s): “The Autodiscover service couldn’t be located.”
Any advice please?
Set up Autodiscover properly, or use -Server to manually specify the hostname to talk to (for O365, it’s outlook.office365.com).
Hi Michel,
I am having an issue when passing a list of folders to include (IncludeFolders). One folder has a hyphen in its name, and the script seems to stop looking at the folders that come after it in the list. There are no errors, but I wondered if there is a workaround?
Cheers.
Is the hyphen in the folder name or in your search pattern? And are you sure it’s a hyphen (-), not an em dash (—) or en dash (–)?
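One quick way to check is to list the code points of the characters in the name. This is a small sketch; Get-CodePoint is a hypothetical helper and the folder name is made up:

```powershell
# Hypothetical helper: lists each character with its Unicode code point.
function Get-CodePoint {
    param([Parameter(Mandatory)][string] $Text)
    foreach ($ch in $Text.ToCharArray()) {
        [pscustomobject]@{ Char = $ch; CodePoint = 'U+{0:X4}' -f [int]$ch }
    }
}

# Made-up folder name containing an en dash (U+2013) instead of a hyphen (U+002D).
$name = 'Projects ' + [char]0x2013 + ' 2023'

# Show only en dashes (U+2013) and em dashes (U+2014).
$dashes = Get-CodePoint -Text $name | Where-Object { $_.CodePoint -in 'U+2013', 'U+2014' }
$dashes
```

If this returns anything for the real folder name, the pattern needs to contain that same character rather than a plain hyphen.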
I don’t think this was the issue, but thanks for replying. It was the setting for more than 100 folders that I needed to tweak.
I can see that if there are 22 items in a folder, it detects 44. Any reason why this would be?
Thanks for catching that – counted items double (‘# items processed’). Fixed (2.46)
Thank you Michel – that has resolved it.
Hi Michel,
I am just trying to at least get connected to a test tenant to become more familiar with your script.
Unfortunately I get an error:
.\Remove-DuplicateItems.ps1 -Identity AdeleV@M365x32648601.OnMicrosoft.com -Type Mail -Report -ClientId "XXX" -TenantId "XXX" -Secret (ConvertTo-SecureString "[SecureString]" -AsPlainText -Force) -verbose -whatif
Error:
VERBOSE: Module Microsoft.Exchange.WebServices v2.2.1.0 already loaded
VERBOSE: Module Microsoft.Identity.Client v4.53.0.0 already loaded
VERBOSE: Will use provided secret to authenticate
Exception calling “Create” with “1” argument(s): “Could not load file or assembly ‘Microsoft.IdentityModel.Abstractions, Version=6.22.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35’ or one of its dependencies. The system cannot find the file specified.”
At D:\OneDrive\OneDrive – Rewion\scripts\Scripte\Remove-DuplicateItems.ps1:1653 char:13
+ ~~~~~~~~~~~~~
D:\OneDrive\OneDrive – Rewion\scripts\Scripte\Remove-DuplicateItems.ps1 : Problem acquiring token: You cannot call a method on a null-valued expression.
At line:1 char:1
+ .\Remove-DuplicateItems.ps1 -Identity AdeleV@M365x32648601.OnMicrosof …
+ ~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [Write-Error], WriteErrorException
+ FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,Remove-DuplicateItems.ps1
Any idea why this happens?
Thank you,
Hi Sebastian, did you run the script from a “plain” PowerShell session, i.e. not one with Exchange Online Management loaded and not an Exchange (on-premises) Management Shell session?
The script might barf because those Exchange modules have already loaded some of the modules the script tries to load. This may seem good, but they depend on or use other versions of those modules.
Also, when using the DLL versions of the modules, run from an elevated session, as loading them might otherwise be prohibited in your environment.
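To see whether the current session already has conflicting assemblies loaded, something like the sketch below can help. Get-LoadedAssemblyVersion is a hypothetical helper and the name pattern is only an assumption about which assemblies are relevant:

```powershell
# Hypothetical helper: lists assemblies already loaded in this session
# whose name matches a pattern, with their version and file location.
function Get-LoadedAssemblyVersion {
    param([string] $Pattern = '^Microsoft\.(Identity|IdentityModel|Exchange\.WebServices)')
    [AppDomain]::CurrentDomain.GetAssemblies() |
        Where-Object { $_.GetName().Name -match $Pattern } |
        ForEach-Object {
            $n = $_.GetName()
            [pscustomobject]@{ Name = $n.Name; Version = $n.Version; Location = $_.Location }
        }
}

# Run this before starting the script; any output means the session is not "plain".
Get-LoadedAssemblyVersion
```

If it lists anything, starting a fresh session with powershell.exe -NoProfile (or pwsh -NoProfile) and running the script there avoids inheriting those versions.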
Hi Michel,
just updated to the most recent exchangeonlinemanagement module 3.7.2.
Still have the same error:
.\Remove-DuplicateItems.ps1 -Identity AdeleV@M365x32648601.OnMicrosoft.com -IncludeFolders “inbox” -Type Mail -Retain Newest -DeleteMode MoveToDeletedItems -Report -WhatIf -ClientId “XXX” -TenantId “XXX” -Secret (ConvertTo-SecureString “XXX” -AsPlainText -Force) -verbose
VERBOSE: Module Microsoft.Exchange.WebServices v2.2.1.0 already loaded
VERBOSE: Module Microsoft.Identity.Client v4.53.0.0 already loaded
VERBOSE: Will use provided secret to authenticate
Exception calling “Create” with “1” argument(s): “Could not load file or assembly ‘Microsoft.IdentityModel.Abstractions, Version=6.22.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35’ or one of its
dependencies. The system cannot find the file specified.”
At D:\OneDrive\OneDrive\scripts\Scripte\Remove-DuplicateItems.ps1:1653 char:13
+ $App= [Microsoft.Identity.Client.ConfidentialClientApplic …
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [], MethodInvocationException
+ FullyQualifiedErrorId : FileNotFoundException
D:\OneDrive\OneDrive\scripts\Scripte\Remove-DuplicateItems.ps1 : Problem acquiring token: You cannot call a method on a null-valued expression.
At line:1 char:1
+ .\Remove-DuplicateItems.ps1 -Identity AdeleV@M365x32648601.OnMicrosof …
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [Write-Error], WriteErrorException
+ FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,Remove-DuplicateItems.ps1
Used PowerShell ISE as well as PowerShell 5.
With PS7 I get these things:
.\Remove-DuplicateItems.ps1 -Identity AdeleV@M365x32648601.OnMicrosoft.com -IncludeFolders “inbox” -Type Mail -Retain Newest -DeleteMode MoveToDeletedItems -Report -WhatIf -ClientId “XXX” -TenantId “XXX” -Secret (ConvertTo-SecureString “XXX” -AsPlainText -Force) -verbose
VERBOSE: Loading module D:\OneDrive\OneDrive\scripts\Scripte\Microsoft.Exchange.WebServices.dll
VERBOSE: Loading module from path ‘D:\OneDrive\OneDrive\scripts\Scripte\Microsoft.Exchange.WebServices.dll’.
VERBOSE: Module Microsoft.Exchange.WebServices v2.2.1.0 loaded
VERBOSE: Acquiring providers for assembly: C:\program files\windowsapps\microsoft.powershell_7.5.0.0_x64__8wekyb3d8bbwe\Modules\PackageManagement\coreclr\netstandard2.0\Microsoft.PackageManagement.MetaProvider.PowerShell.dll
VERBOSE: Acquiring providers for assembly: C:\program files\windowsapps\microsoft.powershell_7.5.0.0_x64__8wekyb3d8bbwe\Modules\PackageManagement\coreclr\netstandard2.0\Microsoft.PackageManagement.NuGetProvider.dll
VERBOSE: Acquiring providers for assembly: C:\program files\windowsapps\microsoft.powershell_7.5.0.0_x64__8wekyb3d8bbwe\Modules\PackageManagement\coreclr\netstandard2.0\Microsoft.PackageManagement.ArchiverProviders.dll
VERBOSE: Acquiring providers for assembly: C:\program files\windowsapps\microsoft.powershell_7.5.0.0_x64__8wekyb3d8bbwe\Modules\PackageManagement\coreclr\netstandard2.0\Microsoft.PackageManagement.CoreProviders.dll
VERBOSE: Suppressed Warning Unknown category for ‘NuGet’::’GetDynamicOptions’: ‘Provider’
VERBOSE: Loading module D:\OneDrive\OneDrive\scripts\Scripte\Microsoft.Identity.Client.dll
Import-Module: D:\OneDrive\OneDrive\scripts\Scripte\Remove-DuplicateItems.ps1:862
Line |
862 | … Import-Module -Name $absoluteFileName -Global -Force
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| The specified module ‘D:\OneDrive\OneDrive\scripts\Scripte\Microsoft.Identity.Client.dll’ was not loaded because no valid module file was found in any module directory.
Write-Error: D:\OneDrive\OneDrive\scripts\Scripte\Remove-DuplicateItems.ps1:1577
Line |
1577 | Import-ModuleDLL -Name ‘Microsoft.Identity.Client’ -FileName ‘Mic …
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| Problem loading module Microsoft.Identity.Client:
I guess it is intended to be used with PS5.
The script is intended to be used without EXOM; loading EXOM beforehand causes all sorts of dependency issues. It works with 5.x and 7.x.
Hi Michel,
sry for the late reply. Just found some time to get back to this topic.
With the help of this page I was able to get it running with PS7.
Thx for the work you did and your patience.
Forgot the page:
https://www.rootmanager.com/tech-notes/using-remove-duplicateitems-script-with-microsoft-office-365.html
The script finishes with the below warning.
VERBOSE: Looking up EWS URL using Autodiscover for user1@evrpslab.com
VERBOSE: Using EWS endpoint https://ex1.evrpslab.com/EWS/Exchange.asmx
VERBOSE: Constructing folder matching rules
WARNING: Cannot bind to MsgFolderRoot: Cannot find an overload for “Bind” and the argument count: “2”.
WARNING: Cannot bind to ArchiveMsgFolderRoot: Cannot find an overload for “Bind” and the argument count: “2”.
VERBOSE: Processing user1@evrpslab.com finished
I am not sure if it has removed any duplicates. I have Exchange 2019 CU11.
I get the error below when running the script:
VERBOSE: Module Microsoft.Exchange.WebServices v15.0.0.0 already loaded
VERBOSE: Module Microsoft.Identity.Client v4.74.0.0 already loaded
VERBOSE: Using Default Credentials
VERBOSE: Cleanup Mode: Folder
Processing mailbox shared-test@mail.xyz.com (shared-test@mail.xyz.com)
VERBOSE: Using shared-test@mt.gov.sa for impersonation
VERBOSE: Using Exchange Web Services URL
VERBOSE: Constructing folder matching rules
WARNING: Cannot bind to MsgFolderRoot: Exception calling “Bind” with “2” argument(s): “The request failed. The remote server returned an error: (401) Unauthorized.”
WARNING: Cannot bind to ArchiveMsgFolderRoot: Exception calling “Bind” with “2” argument(s): “The request failed. The remote server returned an error: (401) Unauthorized.”
VERBOSE: Processing smailbox a finished
When running the script, it reports that it is using default credentials. What value do we need to provide for the default credentials?
It also gives the error below:
WARNING: Cannot bind to MsgFolderRoot: Exception calling “Bind” with “2” argument(s): “The request failed. The remote server returned an error: (401) Unauthorized.”
WARNING: Cannot bind to ArchiveMsgFolderRoot: Exception calling “Bind” with “2” argument(s): “The request failed. The remote server returned an error: (401) Unauthorized.”
VERBOSE: finished
Hello!
I have this issue with the newest version of the script, please help, I have ~95000 duplicates.
WARNING: Error performing operation RemoveItems with . Error: Exception calling “DeleteItems” with “5” argument(s):
“The request failed. The remote server returned the following error: (401) Unauthorized.”
VERBOSE: Cleaning unique list (Finished Folder)
479542 items processed and 94743 removed in 01:06:23 – average 7222 items/min
But it has not removed anything.
Thanks in advance!
Csabi
401 indicates an authentication issue. Check if Exchange Web Services is allowed and that you are using an approved (consented to) app to authenticate (Basic Authentication is no longer an option for Exchange Online). Also check whether conditional access rules might be blocking your access.
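As a starting point for the first check, the EWS-related settings can be inspected at organization and mailbox level. This sketch assumes a separate session connected with the Exchange Online Management module (not the session in which the script runs), and user@contoso.com is a placeholder:

```powershell
# Assumption: a separate PowerShell session already connected via Connect-ExchangeOnline.
# 'user@contoso.com' is a placeholder for the mailbox being processed.

# Organization-wide EWS settings.
Get-OrganizationConfig | Format-List EwsEnabled, EwsApplicationAccessPolicy, EwsAllowList, EwsBlockList

# Per-mailbox EWS settings.
Get-CASMailbox -Identity 'user@contoso.com' | Format-List EwsEnabled, EwsApplicationAccessPolicy, EwsAllowList, EwsBlockList
```

An EwsEnabled value of False, or an access policy that does not cover the calling application, would be worth ruling out before looking at the app registration or conditional access.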