Can’t Create Mailboxes in Remote Sites


Recently I got an e-mail from someone who had problems creating mailboxes in a new environment. When trying to create a mailbox, he received a message stating, “Load balancing failed to find a valid mailbox database.” Apparently, the Mailbox Resources Management Agent (a Cmdlet Extension Agent) could not find an eligible mailbox database candidate.

image

The MRMA uses the following selection process when picking a candidate for mailbox creation or moving:

  1. Create a list of all mailbox databases;
  2. Remove databases marked for exclusion;
  3. Remove databases out of the management scope;
  4. Remove databases from remote (AD) sites;
  5. Pick a random online, healthy database from the list.

This person had a DAG, two mailbox databases (MDB1, MDB2) and two sites (AMS and LON).

We first checked the more or less obvious, namely whether the databases were excluded from the provisioning process, so we entered Get-MailboxDatabase | fl *FromProvisioning:

image
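For reference, the full check looks something like this; should a database turn out to be excluded, you can re-enable it for provisioning with Set-MailboxDatabase (MDB1 is just an example name here):

Get-MailboxDatabase | Format-Table Name,IsExcludedFromProvisioning,IsSuspendedFromProvisioning
Set-MailboxDatabase -Identity MDB1 -IsExcludedFromProvisioning $false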

Databases seemed enabled for provisioning. We then checked the status of the active database copies:

image

The copies looked healthy, but we noticed all databases were mounted in the remote site (derived from the server names starting with LON; we were working from AMS). Looking back at the database selection process, this explained why provisioning probably failed, and since the active copies should be moved back to the preferred site AMS anyway, we moved them back:

image
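For completeness, checking the copy status and moving the active copies back can be done along these lines (the server name AMS-MBX1 is illustrative, not taken from this environment):

Get-MailboxDatabase | Get-MailboxDatabaseCopyStatus
Move-ActiveMailboxDatabase MDB1 -ActivateOnServer AMS-MBX1
Move-ActiveMailboxDatabase MDB2 -ActivateOnServer AMS-MBX1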

Moving the active database copies back to the site from which we were running our cmdlets solved things.

Note that we could have discovered the issue by using the Verbose parameter with the cmdlet. For example, New-Mailbox in conjunction with Verbose will show the selection process. The following screenshot shows an unsuccessful selection process, considering the available databases:

image

This screenshot shows a successful selection process.

image
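If you want to watch the selection process yourself, simply append the Verbose switch to the provisioning cmdlet; a minimal sketch (the name, UPN and password prompt are illustrative):

New-Mailbox -Name TestUser -UserPrincipalName testuser@contoso.com -Password (Read-Host -AsSecureString "Password") -Verbose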

More information on automatic mailbox distribution and controlling its behavior can be found here.

Slow Mailbox (Migration) Throughput and HP NIC Drivers


A small post on an issue I recently encountered while preparing for a mailbox migration. The system was an HP ProLiant DL370 G6, prepared and configured with the OS and Exchange by the customer’s IT department.

Things looked OK and we were going to perform some test migrations to get a throughput estimate for this configuration, to help us organize the migration batches. To our dismay, speeds were way lower than we were used to seeing with similar configurations; mailboxes were migrated at an average speed of 7-8 MB/min, where we normally saw something in the 50-120 MB/min range.

A quick look at Performance Monitor didn’t show anything out of the ordinary, except for very low downstream network throughput. With the servers (the original Exchange server as well as the new one) being on the same subnet and physically next to each other, the networking components were also not deemed suspect (other servers on the same switch were not experiencing this issue).

I then tried something simple: copying about 100 MB of files from the source Exchange server to the new one. It went at a ridiculously slow 60-80 KB/s. Copying those same files from the new server to the source server was instant. I verified this against a vacant server on the same switch; copying from and to the source Exchange server on that server was instant, both up- and downstream.

So, if SMB was having trouble getting packets across, that could explain the slow mailbox migration speeds. Attention shifted to the networking configuration of the new Exchange server, which was equipped with an HP NC375i Integrated Quad Port Multifunction Gigabit Server Adapter. I checked the driver version of one of the NC375i’s instances through Network Connections > Properties (of instance) > Configure > Driver (tab). It reported QLogic Corp. driver 4.7.17.926 (qlxgnd64.sys) was used.
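If you prefer the command line over clicking through adapter properties, you can get the same driver information from WMI; a quick sketch, trimmed to the relevant columns:

Get-WmiObject Win32_PnPSignedDriver -Filter "DeviceClass='NET'" | Select-Object DeviceName,DriverVersion,DriverDate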

After some searching on HP’s support site I discovered an advisory which could apply to my situation as it applies to the same qlxgnd64.sys driver version 4.7.17.926: c03734205, “Advisory: HP NC Network Adapters – Certain HP NC-Series Network Adapters May Experience Very Slow Bandwidth During Large File Transfers on Windows Server 2008 and Windows Server 2008 R2”.

The advisory gives the option to either keep the driver and disable Large Receive Offload (LRO), or to upgrade to driver version 4.7.18.131. We chose the latter:

image

After upgrading the driver, we moved a mailbox and throughput speeds were within the expected range again, as we found out when producing a quick stats report using the following cmdlet (Exchange 2010):

Get-MoveRequest | Where { $_.Status -eq "Completed" } | Get-MoveRequestStatistics | Select DisplayName,TotalMailboxSize,TotalMailboxItemCount,@{n="Speed MB/min"; e={ [int]($_.BytesTransferred.ToMB() / $_.TotalInProgressDuration.TotalMinutes) }}

image

In my opinion, it’s another fine example of the value of testing and validating your configuration, and any amendments you make to it, before putting it into production, and of being cautious with what I call “blindly updating” system components such as drivers or driver packs (e.g. HP’s SPP, or Service Pack for ProLiant).

If you don’t have the luxury of a test and acceptance environment, then just as with Service Packs, Rollups and Cumulative Updates, observe a waiting period and check the vendor’s support site for any reported issues before implementing updates yourself; according to this discussion on the HP support forum, the issue with the 4.7.17.926 QLogic driver had existed for quite some time.

Configuring Anti-Affinity in Failover Clusters


Many customers nowadays run a virtualized Exchange environment, utilizing Database Availability Groups, load-balanced Client Access Servers and the works. However, I also see environments where it is left up to the hypervisor of choice to decide where virtual machines end up after a (planned) failover. This goes for Exchange servers, but also for redundant infrastructure components like Domain Controllers or Lync Front-End servers, for example.

So, leaving it at the defaults is not a good idea when you want to achieve the maximum availability potential. Think about what will happen if redundant roles are located on the same host and that host goes down. What you want to do is prevent hosts from becoming a single point of failure, something which can be accomplished by using a feature called anti-affinity. This will distribute virtual machines over as many hosts as possible. Where affinity means having a preference for something, as in processor affinity for processes, anti-affinity can be regarded as repulsion, like in magnetism.

image

For VMware, you can utilize DRS anti-affinity rules; I’ll describe how you can configure anti-affinity in Hyper-V clusters using the AntiAffinityClassNames property (which, by the way, has existed since Windows Server 2003). And yes, property means it’s not accessible from Failover Cluster Manager, but I’ve created a small PowerShell script which lets you configure the AntiAffinityClassNames property (pre-Server 2012 you could also use cluster.exe to configure this property).

Note: For readability, when you see virtual machine(s), read cluster group(s); in Microsoft failover clustering, a clustered virtual machine role is a cluster group.

Now, before we get to the script, first something on how AntiAffinityClassNames works. The AntiAffinityClassNames property may contain multiple unique strings which you can make up yourself. I’d recommend creating logical names based on the underlying services, like ExchangeDAG or ExchangeCAS. When a virtual machine is moved, the process is as follows (a raw sketch of setting the property by hand follows the list):

  1. When defined, the cluster tries to locate the next preferred node using the preferred owner list;
  2. The cluster checks whether the designated node hosts a virtual machine with a matching element in its AntiAffinityClassNames property; if not, the designated node is selected; if it does, move on to the next available preferred owner and repeat step 2;
  3. If the list is exhausted (i.e. only anti-affined hosts), the anti-affinity attribute is ignored and the preferred owner list is checked again, ignoring anti-affinity (“last resort”).
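For reference, this is essentially what the script automates; a minimal sketch of setting the property directly using the FailoverClusters module (the cluster, group and class names are illustrative):

Import-Module FailoverClusters
$class = New-Object System.Collections.Specialized.StringCollection
$class.Add('ExchangeDAG') | Out-Null
(Get-ClusterGroup -Cluster Cluster1 -Name ex1).AntiAffinityClassNames = $class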

Traces of anti-affinity influencing failover behavior can be found in the cluster log:

00000648.00000d54::2013/07/22-10:40:33.162 INFO  [RCM] group ex2 should fail back from node 2 to node 3 now due anti-affinity

Usage
Now on to the script, Configure-AntiAffinity.ps1. The syntax is as follows:

Configure-AntiAffinity.ps1 [-Cluster] <String> [-Groups] <Array> [-Class] <String> [[-Overwrite]] [[-Clear]] [<CommonParameters>]

A small explanation of the available parameters:

  • Cluster is used to specify which cluster you want to configure (mandatory);
  • Groups specifies which Cluster Groups (Virtual Machines) you want to configure Anti-Affinity for (mandatory);
  • Class specifies which name you want to use for configuring Anti-Affinity (optional, AntiAffinityClassName);
  • When Overwrite is specified, all existing Anti-Affinity class names will be overwritten by Class for the specified Groups, otherwise Class will be added (default);
  • When Clear is specified, all existing Anti-Affinity class names will be removed for the specified Groups;
  • The Verbose parameter is supported.

So, for example, assume you have a Hyper-V cluster named Cluster1, consisting of 3+ nodes and running 3 virtualized Exchange servers hosting a 3-node DAG (ex1, ex2 and ex3), and you want to configure anti-affinity for these virtual machines using the label PRODEX. You could then use the script as follows:

Configure-AntiAffinity.ps1 -Cluster Cluster1 -Groups ex1,ex2,ex3 -Class PRODEX -Verbose

To clear anti-affinity you could use:

Configure-AntiAffinity.ps1 -Cluster Cluster1 -Groups ex1,ex2,ex3 -Clear
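To verify the result, you can read the property back; something along these lines should do (assuming the FailoverClusters module is loaded):

Get-ClusterGroup -Cluster Cluster1 | Select-Object Name,AntiAffinityClassNames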

Here’s a screenshot of the script creating anti-affinity, adding additional anti-affinity class names, and clearing anti-affinity settings:

image

Feedback
Feedback is welcomed through the comments. If you have scripting suggestions or questions, do not hesitate to use the contact form.

Download
You can download the script from the TechNet Gallery here.


Exchange 2013 CU1 Help File


A quick post, as the Exchange 2013 Cumulative Update 1 Help (.CHM) file for offline usage has been released on the Microsoft Download Center.

The offline help files are convenient if you’re on the road or in a location (yes, that happens sometimes) without an internet connection.

You can download the Exchange 2013 Cumulative Update 1 .CHM Help file, dated April 4th, 2013, for on-premises and hybrid deployments here.

Exchange 2013 Help Files Updated


A quick post, as the Exchange 2013 Help (.CHM) files on the Microsoft Download Center have been updated. The offline help files are convenient if you’re on the road or in a location without an internet connection.

You can download the updated files, dated January 18th, 2013, for on-premises and hybrid deployments of Exchange 2013 here.

On another note, there’s a new Office Visio 2013 stencil for Exchange 2013, covering on-premises and hybrid deployments. You can download it here.