So long RPC/HTTP, Hello MAPI/HTTP

Microsoft published three sessions from the Redmond Interoperability Protocols Plugfest 2013 on Channel 9 covering the MAPI over HTTP (MAPI/HTTP) protocol, which appears to be scheduled to arrive with Exchange 2013 Service Pack 1.

This protocol is set to (over time!) replace the RPC/HTTP protocol we all know. RPC/HTTP, or Outlook Anywhere, is used by Outlook to communicate with Exchange Server and is most often seen with clients working remotely. With Exchange Server 2013, support for direct MAPI (RPC over TCP) connections was dropped and RPC/HTTP became the only protocol. With Exchange 2013 SP1 it seems we will receive an alternative which is set to replace RPC/HTTP: MAPI/HTTP.

Of course, the information is preliminary and subject to change as Exchange 2013 SP1 hasn’t been released yet, but it doesn’t hurt to get familiar with the planned changes. It also remains to be seen how quickly organizations will adopt this new protocol, which I’m pretty sure we will soon see supported by Office 365.

MapiHttp in Exchange 2013 SP1
Joe Warren, Principal SDE at Microsoft, delivers a presentation covering the MapiHttp protocol implementation in Exchange 2013 SP1. Topics: What is MAPI-HTTP?, Why do MAPI-HTTP?, Goal of MAPI-HTTP, Why not rebuild with EWS?, Existing RPC-HTTP, New MAPI-HTTP, What does a MAPI-HTTP request look like?, What does a MAPI-HTTP response look like?, Session Context, Request Types, Sequencing & Protocol Failures. Click here.

Outlook 2013 Client Protocols
Shri Vidhya Alagesan, SDE at Microsoft, presents the Outlook 2013 client protocols from a client’s perspective. Topics: Client side view of Outlook-Exchange MAPI-HTTP protocol using WinHTTP, Error Handling & RPC vs. MAPI-HTTP with sub-topics of Architecture Overview, Outlook & WinHttp, Cookies, Connection Status Dialog, Timeout, Pause/Resume & Protocol Switching. Click here.

Exchange 2013 Protocols
Andrew Davidoff, Senior Software Development Engineer in Test at Microsoft, presents the Exchange 2013 protocol families and important protocol updates for Exchange 2013. Click here.

Apart from these sessions on the protocol changes announced for Exchange Server 2013 SP1, Microsoft also published some other interesting Exchange-related sessions:

Exchange 2013 Web Services Overview
Harvey Rook, Principal Development Lead, and Naveen Chand, Senior Program Manager Lead, deliver a presentation on Exchange Web Services best practices. Click here.

Exchange RPC and EWS Protocol Test Suites
Jigar Mehta, Software Development Engineer in Test, provides an in-depth overview of the test suite packages for the Exchange RPC and Exchange Web Services protocols. Click here.

Exchange and potential Packet Loss on VMware

Yesterday, I noticed a VMware knowledgebase article, updated on November 14th, which could be worth taking notice of when you’re running Exchange – or any other application – in a virtualized environment based on VMware technology.

VMware’s KB article 2039495 mentions that in VMware ESXi 4.x and 5.x, very high traffic bursts may cause the VMXNET3 driver to start dropping packets in the Guest OS. This has been observed on Windows Server 2008 R2 running Exchange 2010 with – as VMware puts it – a high number of Exchange users. What the article fails to mention is the configuration used by customers experiencing the issue. It might for example be valuable to know if a DAG was used, if the traffic (MAPI, replication) was split over multiple NICs or if it occurred with iSCSI storage. I wouldn’t be surprised if the issue occurs in other high traffic situations as well, e.g. seeding. Luckily, Exchange is capable of handling certain hiccups, so customers might not even be aware of the issue.

After some more digging I found another article, KB 1010071, which mentions a packet drop issue with VMware Guests known since ESX 3. This article explains a bit more why the issue occurs in the first place, being the network driver running out of receive buffers, causing the packets to be dropped between the Virtual Switch and the Guest OS driver.

One could argue about the impact of a few lost packets. However, as traffic increases the (potential) number of lost packets increases. Each lost packet results in retransmission of unacknowledged packets, which impacts overall throughput causing increased latencies.
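To make the receive buffer exhaustion described in KB 1010071 a bit more tangible, here is a toy model in Python. It is a sketch under simplifying assumptions and not how the VMXNET3 driver actually works: the function name, the tick-based timing and all numbers are made up for illustration.

```python
def simulate_rx_ring(burst_sizes, ring_size, drain_per_tick):
    """Toy model of a fixed-size receive ring.

    Each tick a burst of packets arrives; packets that do not fit in the
    free part of the ring are dropped. The guest then reads (drains) at
    most drain_per_tick packets. Returns (delivered, dropped).
    """
    queued = 0
    delivered = 0
    dropped = 0
    for burst in burst_sizes:
        free = ring_size - queued
        accepted = min(burst, free)
        dropped += burst - accepted      # no free buffers left: packet is lost
        queued += accepted
        drained = min(queued, drain_per_tick)
        queued -= drained
        delivered += drained
    delivered += queued                  # whatever is still queued is read eventually
    return delivered, dropped


bursts = [100, 900, 100]                 # hypothetical traffic with one large burst
print(simulate_rx_ring(bursts, ring_size=512, drain_per_tick=300))   # (712, 388)
print(simulate_rx_ring(bursts, ring_size=4096, drain_per_tick=300))  # (1100, 0)
```

In this toy model the 900-packet burst overflows a 512-entry ring, while a 4096-entry ring absorbs the same burst. That is the intuition behind increasing the ring and buffer sizes; it does not tell you which values are right for a real workload.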

VMware’s temporary solution to this problem is:

  1. Open up the Windows guest;
  2. Open the properties of the VMXNET3 NIC;
  3. On the Advanced tab, increase the Small Rx Buffers or Rx Ring #1 Size;
  4. What KB1010071 mentions and KB2039495 doesn’t, is that when using jumbo frames – which is not uncommon, e.g. for replication – you might need to adjust the Rx Ring #2 Size and Large Rx Buffers values.

Now I say temporary, because VMware’s solution of course isn’t a real solution; it’s only meant to – in their own words – reduce packet drops. Also, the KB1010071 article states you should “determine an appropriate setting by experimenting with different buffer sizes”. That doesn’t sound like a permanent, reassuring solution for a virtualization environment running business critical applications now, does it?

All things considered, I’d recommend configuring these parameters to their maximum setting, preferably at installation time, unless anyone knows of a reason not to. In addition, this is another case for the best practice to split MAPI and replication traffic on Exchange using multiple NICs.

Finally, I already learnt of two other applications experiencing the issue. Therefore I think the problem is not Exchange 2010 specific, as KB2039495 might imply. If you have similar experiences, or have seen differences between GbE and 10GbE, please use the comments to share.

Load balancing Exchange 2010 using a KEMP Virtual LoadMaster

In an earlier blog, I mentioned the requirement for an external load balancer when co-locating Exchange server roles, because Failover Clustering and Network Load Balancing (NLB) are mutually exclusive. However, there are also situations in which a load balancer is a better solution than Windows built-in NLB, mainly because there are some things NLB can’t do or doesn’t do well, like:

  • Service awareness: NLB distributes clients over member nodes, even over nodes of which required services, like IIS or RPC Client Access Service, are not responding;
  • Experience: Clients need to reconnect after adding or removing nodes;
  • Scalability: it’s not recommended to scale NLB beyond 8 nodes;
  • Affinity (also known as persistence or sticky sessions): NLB can only do Source IP affinity, i.e. distribute clients based on their IP address, while load balancers can utilize cookies or SSL session IDs.

Note: You can read why affinity is important, and why Source IP affinity can sometimes be bad, in one of my earlier blogs on load balancing Exchange ActiveSync here.

To show you that setting up a load balancer doesn’t have to be rocket science, I’ll demonstrate how to implement a load balancer for Exchange 2010 using a KEMP Virtual LoadMaster (VLM); setting up other load balancers should be similar, hardware appliances included, but keep in mind implementations vary by vendor, so check the product documentation as well. However, the basics are the same; you only need to understand what you’re trying to achieve.

Note: The KEMP VLM used for this article runs on Hyper-V, but there are virtual load balancers for other hypervisors as well.

The setup we’re going to work with is roughly as follows:

Kemp-HA-Setup-v1

In the sample environment, I’ve installed two Exchange 2010 servers, L12EX1 and L12EX2; both hold the Client Access, Hub Transport and Mailbox server roles. The domain name used is litware.com, and we have no site nor subnet definitions, so everything is located in the default Active Directory site, Default-First-Site-Name. Clients will access Exchange services (HTTPS, MAPI) using a single FQDN, outlook.litware.com.

The Exchange servers are located in a dedicated subnet, so we’ll use a so-called two-armed setup (2 NICs); one NIC will connect the VLM to the subnet where the Exchange servers are located; the other one will be used for client access. In order to have the VLM work transparently, we configure the VLM as default gateway on the CAS servers. The result is that the CAS servers will see the original client IP addresses instead of the VLM’s address, which is not only helpful in log files, but is also needed for throttling or, for example, when limiting SMTP connections to Receive Connectors based on IP addresses.

Note: This article doesn’t describe implementing SSL offloading; for more information on SSL offloading and how to configure it, check this Technet article. Also, this article doesn’t go into any built-in ability of load balancers to mirror or create standby copies, meant to prevent the load balancer from becoming a Single Point Of Failure (SPOF) or improve Availability level.

We’ll start off by downloading the KEMP Virtual Loadmaster here. After downloading, extract the contents and import the VM in Hyper-V. After firing it up, it will use DHCP or 192.168.0.1 if DHCP is unavailable. You can check the console to see what IP address is used:

image

Now, before we can configure the VLM, we need to perform the initial setup:

  • Use the console to log in using the administrator account or connect with a browser to the VLM’s IP address;
  • If you haven’t got an activation key, you can apply for a trial key;
  • Complete licensing of the VLM;
  • Configure VLM network interfaces;
  • Import and configure the certificate.

Note: Make sure you set the MAC addresses of your NICs to static. The access code used in the licensing process is based on the MAC address, so with a dynamic MAC address the license will be invalidated if you migrate the VLM to a different host.

Note: We’re going to load balance services over port 443 and the administrative web interface uses that port as well, so configure the GUI on a different IP address or port.

Next, we need to create a Client Access Server Array. Note that creating a CAS Array before creating or moving mailboxes is best practice, as it prevents having to reconfigure Outlook MAPI profiles when clients have already connected (unless you want to perform mailbox move tricks to force MAPI reconfiguration). Basically, the steps to perform are:

  • Create a DNS record with the FQDN clients are going to use to connect. In our example, the FQDN used is outlook.litware.com using IP address 172.16.10.100;
  • Create a CAS Array object using New-ClientAccessArray, e.g. New-ClientAccessArray -Name outlook-default -Fqdn outlook.litware.com -Site Default-First-Site-Name

image

  • As per best practice, we’re fixing the RPC (59531) and Address Book (59532) ports by setting the following registry keys on each CAS server and restarting the related MSExchangeRPC and MSExchangeAB services:

HKLM\System\CurrentControlSet\Services\MSExchangeRPC\ParametersSystem\TCP/IP Port = 0xe88b (59531) (REG_DWORD)

HKLM\System\CurrentControlSet\Services\MSExchangeAB\Parameters\RpcTcpPort = "59532" (REG_SZ)

You can verify Exchange is listening on these ports using netstat -an | find "5953".

image

  • Finally, we need to configure the mailbox databases with the new RPC endpoint using Set-MailboxDatabase in conjunction with the RpcClientAccessServer parameter: Get-MailboxDatabase | Set-MailboxDatabase -RpcClientAccessServer outlook.litware.com

Note: For more information on creating CAS Arrays, check here.
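The netstat check above runs on the CAS server itself. To see whether the fixed ports are also reachable from a client's point of view, a small TCP probe can help. The following Python sketch is only an illustration: the function names are hypothetical, and a successful TCP connect only proves that something is listening on the port, not that the Exchange service behind it is healthy.

```python
import socket

# Ports used in this article: RPC endpoint mapper plus the two fixed ports.
RPC_PORT = 0xE88B            # the REG_DWORD value from above, 59531 in decimal
ADDRESS_BOOK_PORT = 59532
ENDPOINT_MAPPER_PORT = 135


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_exchange_ports(host, ports=(ENDPOINT_MAPPER_PORT, RPC_PORT, ADDRESS_BOOK_PORT)):
    """Probe each port and return a {port: reachable} dictionary."""
    return {port: port_is_open(host, port) for port in ports}


if __name__ == "__main__":
    # outlook.litware.com is the sample FQDN used in this article.
    for port, reachable in check_exchange_ports("outlook.litware.com").items():
        print(port, "open" if reachable else "closed or filtered")
```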

After creating the CAS array, fixing the ports on Exchange and reconfiguring the RPC endpoint configuration on mailbox databases, configure the Exchange URLs to match the new client endpoint FQDN, outlook.litware.com. To do so, use cmdlets like Set-OWAVirtualDirectory -InternalURL https://outlook.litware.com/owa or Set-WebServicesVirtualDirectory -InternalURL https://outlook.litware.com/EWS/Exchange.asmx. In addition to InternalURL, set the ExternalURL as well depending on your setup, e.g. HTTPS services may be load balanced at the reverse proxy.

Now we’re ready to configure the VLM. We start off by creating Virtual Services, which are a combination of IP address and port. Each Virtual Service has its own characteristics, like persistence and scheduling (the distribution mechanism), and can have its own certificate, its own set of real (backend) servers and related service monitors.

We decided to use a single IP address for the various Exchange services, so we only need to configure a single Virtual Service for each port, via Virtual Services > Add New:

image

In the next screen you need to configure the Virtual Service settings like persistence and scheduling, as well as configure the real servers, i.e. the backend servers actually providing the service. You can also configure how the service health on the real server is monitored, i.e. is the service up or down. If a service on a real server is considered down, the load balancer won’t send clients to that server for that particular Virtual Service.
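The effect of service monitoring on distribution can be sketched in a few lines of Python. This is a simplified illustration of round robin scheduling that skips servers marked as down; it is not KEMP's implementation, and the function name and the health dictionary are assumptions made for the example.

```python
def round_robin_healthy(servers, healthy, requests):
    """Assign each request to the next server in turn, skipping servers
    whose health check failed. Returns the list of chosen servers;
    None means no healthy server was available for that request."""
    targets = []
    position = 0
    for _ in range(requests):
        # Try every server at most once per request.
        for _ in range(len(servers)):
            candidate = servers[position % len(servers)]
            position += 1
            if healthy.get(candidate, False):
                targets.append(candidate)
                break
        else:
            targets.append(None)
    return targets


servers = ["L12EX1", "L12EX2"]
print(round_robin_healthy(servers, {"L12EX1": True, "L12EX2": True}, 4))
# ['L12EX1', 'L12EX2', 'L12EX1', 'L12EX2']
print(round_robin_healthy(servers, {"L12EX1": True, "L12EX2": False}, 3))
# ['L12EX1', 'L12EX1', 'L12EX1']
```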

Note: The overview below is taken from a non-SSL offloading (SSL acceleration) configuration; when enabled, it will show additional options on the certificate to use.

image

Note: When using “Least Connection” scheduling as recommended in the KEMP documentation, be advised that a client traffic storm can occur after a Real Server comes online. The reason is that it starts without connections, so all new clients will be directed to this server. Other products have mechanisms in place to prevent this by throttling traffic, gradually increasing the connections; F5 calls this feature Slow Ramp Timeout in their F5 BIG-IP Local Traffic Manager products.
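The traffic storm is easy to reproduce in a small simulation. The Python sketch below models plain least connection selection without any ramp-up protection; the function name and connection counts are hypothetical and the model ignores connections that close in the meantime.

```python
def least_connection_assign(connections, new_clients):
    """Send each new client to the server with the fewest connections.

    connections: {server: current connection count}
    Returns {server: number of new clients it received}.
    """
    current = dict(connections)
    assigned = {name: 0 for name in connections}
    for _ in range(new_clients):
        target = min(current, key=current.get)   # fewest connections wins
        current[target] += 1
        assigned[target] += 1
    return assigned


# L12EX2 has just come back online with zero connections.
print(least_connection_assign({"L12EX1": 500, "L12EX2": 0}, 200))
# {'L12EX1': 0, 'L12EX2': 200}
```

All 200 new clients land on the server that just came online, which is exactly the situation a slow ramp feature is meant to soften.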

When configuring the Virtual Service, click Add New to add a Real Server to the Virtual Service.

image

A suggestion on how to configure the Virtual Services:

Virtual Address | Port  | Service Name   | Persistence | Scheduling
172.16.10.101   | 443   | Exchange-HTTPS | Super HTTP  | Round Robin
172.16.10.101   | 59531 | Exchange-RPC   | Source IP   | Round Robin
172.16.10.101   | 59532 | Exchange-AB    | Source IP   | Round Robin
172.16.10.101   | 135   | Exchange-EPM   | Source IP   | Round Robin

Note: When required, you can also load balance inbound SMTP traffic using ports 25/587, IMAP4 (ports 143/993) and POP (110/995) using no persistence.

Note: Using Source IP can result in an unbalanced distribution of client load, when SNAT devices come into play. For an example scenario, see my earlier article on Load balancing, ActiveSync and Affinity.
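Why SNAT skews Source IP persistence can be shown with a minimal Python sketch. It assumes the load balancer picks a server purely from the source address using a simple modulo; real products use their own algorithms, so treat the function names and the selection rule as illustrative assumptions.

```python
import ipaddress
from collections import Counter


def pick_server_by_source_ip(client_ip, servers):
    """Map a client to a server based only on its source IP address."""
    index = int(ipaddress.ip_address(client_ip)) % len(servers)
    return servers[index]


def distribution(client_ips, servers):
    """Count how many clients end up on each server."""
    return Counter(pick_server_by_source_ip(ip, servers) for ip in client_ips)


servers = ["L12EX1", "L12EX2"]

# Ten clients connecting directly, each with its own address.
direct = ["10.0.0.%d" % n for n in range(1, 11)]
print(distribution(direct, servers))

# The same ten clients behind a SNAT device: one shared source address.
behind_snat = ["203.0.113.5"] * 10
print(distribution(behind_snat, servers))
```

With individual addresses the clients are spread over both servers; behind the SNAT device all ten share one source address and therefore end up on the same server.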

And that’s basically it. When you want to channel specific HTTP services (Outlook Web App, Exchange ActiveSync, Autodiscover etc.) you can appoint different FQDNs for each service and configure different FQDN/IP addresses per service in DNS, after which you can configure separate Virtual Services with more specific options. For example, you can not only configure specific persistence or scheduling settings per Virtual Service, but also Real Server checks (depending on the protocol). Instead of checking if a Real Server responds on port 443, you can check if the server responds on a specific URL, e.g. https://<server>/owa.

image

Another bonus of using a load balancer, depending on functionality of the product used of course, is that you can (temporarily) disable a real server from the VLM. After doing this, clients won’t be directed to the corresponding Exchange server, which is very useful when you want to perform maintenance.

image

In this article we quickly went through setting up a KEMP VLM to load balance Exchange 2010 services. However, the article is based on certain decisions regarding the configuration, which can differ from organization to organization. For more information on deploying KEMP VLM and its possibilities, check out the KEMP Virtual LoadMaster Deployment Guide here.

Most vendors, like KEMP, provide template functionality, which enables you to quickly set up the load balancer using preconfigured settings; make sure you inspect those settings afterwards (i.e. know what you’re doing). You can download KEMP templates here. Unfortunately, these files are in binary format so you can’t edit them nor can you export Virtual Services, otherwise I could have provided you with the template for the above settings.

Be advised that I am in no way connected to KEMP and this article hasn’t been sponsored or commissioned by KEMP Technologies, apart from KEMP providing an NFR license for writing and testing purposes.

Exchange can’t start due to misconfigured AD sites

Recently, a customer had issues with their Exchange server, which didn’t start properly after rebooting. After checking the event log, I noticed it was full of messages generated by all services. The most interesting events were the ones generated by MSExchange ADAccess:

MSExchange ADAccess, EventID 2141
Process STORE.EXE (PID=2996). Topology discovery failed, error 0x8007077f

MSExchange ADAccess, EventID 2142

Process MSEXCHANGEADTOPOLOGYSERVICE.EXE (PID=1760). Topology discovery failed, error 0x8007077f

Also, event 2080, “Exchange Active Directory Provider has discovered the following servers with the following characteristics”, which normally logs the results of the Active Directory discovery process every 15 minutes, was missing.

Note that because the system could start the Microsoft Exchange Active Directory Topology service (until it failed and was restarted by dependent services), Exchange’s other services were also triggered, leading to services restarting almost indefinitely, as configured in their corresponding service recovery actions.

Now, I had connected to a domain controller using an RDP session from my client, and I was able to connect to port 389 (LDAP) from Exchange using LDP, so communications looked ok. Then, I switched to Active Directory Sites and Services:

image

As you can see from the shot, here was a potential cause of the problem. First, there was a site without domain controllers. Second, there were no subnets defined. So, in this situation, it is undetermined in which site Exchange is located.

When the system can’t determine to which site a computer belongs, the function DsGetSiteName, used to retrieve the current site, returns error 1919 (0x77f, ERROR_NO_SITENAME). Consequently, the Exchange Active Directory discovery process fails and eventually Exchange fails. You can inspect the currently discovered site using nltest /dsgetsite or by having a peek in the registry at HKLM\System\CurrentControlSet\Services\Netlogon\Parameters\DynamicSiteName.
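The relation between the 0x8007077f in the events and error 1919 follows from the standard HRESULT layout: the high bit signals failure, the next bits hold the facility and the low 16 bits hold the original error code. A short Python sketch (the function name is just for illustration):

```python
FACILITY_WIN32 = 7   # facility used when a Win32 error is wrapped in an HRESULT


def decode_hresult(hresult):
    """Split an HRESULT into (is_failure, facility, code)."""
    is_failure = bool((hresult >> 31) & 0x1)   # severity bit
    facility = (hresult >> 16) & 0x7FF         # 11-bit facility field
    code = hresult & 0xFFFF                    # low 16 bits: the error code
    return is_failure, facility, code


failure, facility, code = decode_hresult(0x8007077F)
print(failure, facility, code)   # True 7 1919
```

Facility 7 means the code is a plain Win32 error, and 1919 (0x77f) is ERROR_NO_SITENAME.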

Now, to solve the situation we have three options:

  1. Making the site association static using a registry key, which isn’t a best practice. If you must, set registry key HKLM\SYSTEM\CurrentControlSet\Services\Netlogon\Parameters\SiteName (REG_SZ) to the desired site name;
  2. Adding proper subnet definitions;
  3. Removing the empty site definition.

It turned out the empty site was a placeholder for a future site, so we went with the option of adding proper subnet definitions. After adding subnet definitions, as you normally should when working with multiple sites, including the scopes where the Exchange servers and domain controllers were located, and associating them with the main site, things started working again.
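The role of subnet definitions can be illustrated with a simplified Python model of the lookup: a computer's IP address is matched against the defined subnets, and the most specific match determines the site. This is a sketch of the idea only; it ignores any fallback behaviour Active Directory may apply, and the subnets and site names below are hypothetical.

```python
import ipaddress


def find_site(host_ip, subnet_to_site):
    """Return the site of the most specific subnet containing host_ip,
    or None when no subnet definition matches."""
    address = ipaddress.ip_address(host_ip)
    matches = []
    for subnet, site in subnet_to_site.items():
        network = ipaddress.ip_network(subnet)
        if address in network:
            matches.append((network.prefixlen, site))
    if not matches:
        return None                      # comparable to "no site name available"
    return max(matches)[1]               # longest prefix wins


# No subnets defined, as in the customer's environment: site is undetermined.
print(find_site("10.1.2.3", {}))                              # None

# After adding a subnet definition associated with the main site.
print(find_site("10.1.2.3", {"10.1.0.0/16": "Main-Site"}))    # Main-Site
```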

Note that the NetLogon service determines site association membership at startup and every 15 minutes. The Microsoft Exchange Active Directory Topology service maintains this information by caching it in the msExchServerSite attribute of the Exchange server object, in order to reduce load on Active Directory and DNS. Therefore, you might need to wait or restart the Microsoft Exchange Active Directory Topology service if you want to renew site association membership.

TechEd North America 2012 sessions

With the TechEd North America 2012 event still running, recordings and slide decks of finished sessions are becoming available online. Here’s an overview of the Exchange-related sessions:


Visio of Exchange 2010 SP1 Network Ports Diagram v0.31

By popular demand, I’ve put the Visio file of the Exchange 2010 SP1 Network Ports Diagram online. The original post in PDF format dates from April 5th.

If you have any comments or additions worth sharing, do not hesitate to write ‘em down in the comments or send me an e-mail. When used, crediting or a reference is appreciated.

The Visio document can be downloaded from here.

Exchange 2010 SP1 Network Ports Diagram v0.31

It took a while, but I’ve updated the Exchange 2010 SP1 Network Ports diagram I first published in December. Note that the updated version is based on SP1, which shows in, for example, the way to change the Address Book service port.

For this version, I’ve included clients, 3rd party SMTP elements, UM and OCS/Lync components and a small list of how to change ports or fix dynamic port settings.

You can download the diagram here. If you have feedback, use the comments or send me an e-mail. Otherwise, feel free to use it; crediting or a reference is appreciated.

Update: Small correction, the 135/TCP RPC endpoint mapper from Outlook to Client Access Server was missing (Thanks Maarten).

Update (13Aug11): The Visio can be downloaded through here.

Exchange 2010 Endpoint Mapper Issue & Firewall

While upgrading one of my existing Exchange 2010 lab machines from RTM to SP1, I encountered the following error message during the upgrade:

Error:
The following error was generated when "$error.Clear();
          if (!(get-service MSExchangeADTopology* | where {$_.name -eq "MSExchangeADTopology"}))
          {
            install-ADTopologyService
          }
        " was run: "There are no more endpoints available from the endpoint mapper. (Exception from HRESULT: 0x800706D9)".
There are no more endpoints available from the endpoint mapper. (Exception from HRESULT: 0x800706D9)

The message appeared at the stage of upgrading the Unified Messaging components. I had a look at the ExchangeSetup.log file and it contained the following information:

[08/27/2010 10:08:13.0948] [2] Beginning processing install-UMService
[08/27/2010 10:08:14.0011] [2] [WARNING] An unexpected error has occurred and a Watson dump is being generated: There are no more endpoints available from the endpoint mapper. (Exception from HRESULT: 0x800706D9)
[08/27/2010 10:08:14.0027] [2] [ERROR] There are no more endpoints available from the endpoint mapper. (Exception from HRESULT: 0x800706D9)
[08/27/2010 10:08:15.0823] [1] The following 1 error(s) occurred during task execution:
[08/27/2010 10:08:15.0823] [1] 0.  ErrorRecord: There are no more endpoints available from the endpoint mapper. (Exception from HRESULT: 0x800706D9)
[08/27/2010 10:08:15.0823] [1] 0.  ErrorRecord: System.Runtime.InteropServices.COMException (0x800706D9): There are no more endpoints available from the endpoint mapper. (Exception from HRESULT: 0x800706D9)
at Interop.NetFw.INetFwRules.Add(NetFwRule rule)
at Microsoft.Exchange.Security.WindowsFirewall.ExchangeFirewallRule.Add()
at Microsoft.Exchange.Configuration.Tasks.ManageService.Install()
at Microsoft.Exchange.Management.Tasks.UM.InstallUMService.InternalProcessRecord()
at Microsoft.Exchange.Configuration.Tasks.Task.ProcessRecord()
at System.Management.Automation.CommandProcessor.ProcessRecord()

It seems the error occurs while trying to add a firewall rule, as indicated by Interop.NetFw.INetFwRules.Add (INetFwRules is the rules collection of the built-in Windows Firewall).

I had a quick check with the firewall settings on the machine and it turned out the Windows Firewall was disabled. I figured that perhaps adding the rules failed because setup couldn’t communicate with the firewall service.

I enabled the Windows Firewall and this time the upgrade process went fine:

[08/27/2010 10:23:10.0988] [2] Beginning processing install-UMService
[08/27/2010 10:23:11.0145] [2] Ending processing install-UMService

 

Exchange 2010 & Outlook 2003 Notifications

Update (13 apr 2011): Rollup 3 for Exchange 2010 SP1 contains UDP support. To enable it, apply RU3 and set HKLM\SYSTEM\CurrentControlSet\Services\MSExchangeRPC\ParametersSystem\EnablePushNotifications to 1 (REG_DWORD). More information in support article kb2009942.

New e-mail notifications from Exchange to Outlook, we receive them all the time. Most of us never look at the technique, because in most cases this works so there’s no need. But what if it doesn’t or you are experiencing delays? With Exchange 2010 this situation is more likely to occur than with earlier versions of Exchange, because many people are still using Outlook 2003 or earlier clients.  To understand why this happens, you need to understand how these notifications work (or should I say worked).

Note: To improve readability, you should read “Outlook 2003 or earlier versions in online mode” when it reads “Outlook 2003” from here on, unless stated otherwise.

When Outlook 2003 connects to Exchange, it tries to register itself to receive notifications. If registration is successful, Outlook 2003 tells Exchange on what port it expects (UDP) packets to arrive, which by default is in the port range 1024-65535. When sending notifications, the Exchange server will also open a dynamic port in this range and connect to the registered client port. After receiving the notification, Outlook 2003 will retrieve the message, display it in the appropriate folder, make a sound, show a systray icon, change your cursor, etc. When the registration for new mail notifications fails, Outlook 2003 will use a polling mechanism to check for changes.

Now, with Exchange 2010 this behavior has changed because Exchange 2010 does not send this kind of notification to Outlook 2003 (i.e. UDP notifications were removed). Therefore, Outlook 2003 will revert to polling, which by default is set to 1 minute. This means that in the worst case users will be notified of new e-mail after approximately 1 minute, where (sort of) real-time feedback is expected. To make things worse in terms of user experience, this also means delays in visible feedback on any folder updates, e.g. e-mail seems to stay in the outbox, deleted items not being deleted, moved items not being moved, etc.

The related knowledge base article (kb2009942) mentions two solutions. One solution is merely a workaround and involves increasing the polling frequency. It requires applying Exchange 2010 Rollup 1 on the CAS server and configuring the following registry key on that CAS server:

HKLM\SYSTEM\CurrentControlSet\Services\MSExchangeRPC\ParametersSystem\Maximum Polling Frequency (DWORD, range 5000-120000)

The reason for performing this step on the CAS server is that Exchange 2010 will determine the polling frequency, not the client. The setting will work immediately, but clients need to reconnect in order for the new value to become effective. Note that setting this value lower than 5000 has no effect because Outlook 2003’s minimum poll rate is 5000.
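To get a feeling for what a given polling value means for users, here is a small Python sketch. It assumes the registry value is expressed in milliseconds and that changes arrive at a random moment within the polling interval, so the average wait is half the interval; both the function name and that averaging assumption are mine, not taken from the KB article.

```python
MIN_POLL_MS = 5000       # Outlook 2003's minimum poll rate mentioned above
DEFAULT_POLL_MS = 60000  # the default polling interval of 1 minute


def notification_delay_ms(configured_ms):
    """Return the effective polling interval plus the worst case and
    average delay before a change becomes visible, all in milliseconds."""
    effective = max(configured_ms, MIN_POLL_MS)   # lower values have no effect
    return {
        "effective": effective,
        "worst_case": effective,
        "average": effective / 2,
    }


print(notification_delay_ms(DEFAULT_POLL_MS))
# {'effective': 60000, 'worst_case': 60000, 'average': 30000.0}
print(notification_delay_ms(1000))
# {'effective': 5000, 'worst_case': 5000, 'average': 2500.0}
```

Keep in mind that a shorter interval means more polling requests against the CAS server, so the lowest value is not automatically the best choice.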

Another solution is to enable cached mode for Outlook 2003 clients. This will not solve the delay in receiving new e-mail notifications, but it will solve the most annoying issue, being the delay in visual feedback. In cached mode users won’t notice the delay because they’re working with a local copy of their mailbox. Any changes (sends, deletes, moves) will happen in the local cached file (OST), and Outlook will update their Exchange mailbox in the background.

The article fails to mention the third solution: upgrade! The reason Outlook 2007 doesn’t have this issue is that Outlook 2007 (and later) support a third method: asynchronous (push notification). And as you’ve probably guessed, Exchange 2010 (and Exchange 2007) supports this method as well.