Exchange 2013 – The things they don’t tell you

(This post is a work in progress. I'm still migrating things at the moment, and I have the joy of moving to the new public folder architecture in a few weeks, so I'm sure this post will grow after that!)

So the migration is all but done:
One final server to remove (once we track down all the services using it for SMTP relay) but that has been a mostly painless experience. 

  • Your Content Index will fail and you'll fix it every so often by deleting the index and restarting the search services, but there is a proper fix:
  • Don't remove the self-signed certificate; it's used by the Back End services. After I had everything running I was getting a bit annoyed that the self-signed cert was still sitting there, still bound to IIS and SMTP. After a lot of reading I found the following KB article, which basically says "leave it alone":
    http://support.microsoft.com/kb/2779694
  • Make sure your receive connectors are scoped so voicemail does not start failing with Event ID 1446, MSExchange Unified Messaging:
    "The Microsoft Exchange Unified Messaging service on the Mailbox server failed to process the message with header file "FILENAME" within "11" minutes. The server will continue to process and deliver the message, but the "MSExchangeUMAvailability: % of Messages Successfully Processed Over the Last Hour" performance counter will be decreased."
    http://exchangepro.dk/2014/05/06/voicemail-not-delivered-to-mailbox/
  • Play on phone from the full fat Outlook 2010 client was failing for users still homed on the Exchange 2010 servers but working fine for users homed on Exchange 2013. Play on phone comes up with the following error:



    Strangely, playing the same voicemail via OWA works fine.
    I've done no further troubleshooting as the users will all be moved to 2013 within the week and we have the above workaround in place. If anyone wants to jump in with a suggestion in the next few days I'll be glad to hear it.
  • Running the Public Folder upgrade script (Export-MailPublicFoldersForMigration.ps1) gives warnings about Property errors:

    "WARNING: The object Domain.local/Microsoft Exchange System Objects/Meeting Room 5 (10) has been corrupted, and it's in an inconsistent state. The following validation errors happened:
    WARNING: Property expression "Board Room" isn't valid. Valid values are: Strings formed with characters from A to Z (uppercase or lowercase), digits from 0 to 9, !, #, $, %, &, ', *, +, -, /, =, ?, ^, _, `, {, |, } or ~. One or more periods may be embedded in an alias, but each period should be preceded and followed by at least one of the other characters. Unicode characters from U+00A1 to U+00FF are also valid in an alias, but they will be mapped to a best-fit US-ASCII string in the e-mail address, which is generated from such an alias."


    For me this was caused by some of the Public Folders not having valid alias entries. It didn't cause a problem (as far as I can see) in all the time we have been running Exchange 2010, but it needed to be fixed while moving to 2013 and its new PF architecture.

    I had an additional problem: I couldn't open the item for editing in the Public Folder Management Console until I applied the following fix (the error was about "no existing publicfolderproxyinformation"; sorry, I didn't take a screenshot before fixing it!): https://workinghardinit.wordpress.com/2010/11/04/exchange-2010-public-folder-worries-at-customer-no-existing-publicfolderproxyinformation-matches-the-following-identity/ It turns out to be a problem with the homeMDB property of the Microsoft System Attendant!

    Editing the item to remove the space from the alias then fixed the issue.

  • While migrating the Public Folders we got to one of the final stages and the process appeared to hang.
    Running Get-PublicFolderMigrationRequest | Get-PublicFolderMigrationRequestStatistics gave the response "StalledDueToMailboxLock" and the message:
    Informational: The request has been temporarily postponed because the mailbox is locked. The Microsoft Exchange Mailbox Replication service will attempt to continue processing the request after 26/01/2015 17:21:07.

    The solution was to restart the Information Store Service on the Legacy Mailbox server and wait 10 minutes for the process to continue.
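
The alias rule quoted in the Public Folder warning above is easy to check in bulk before running the export script. Here is a minimal sketch in Python, assuming you already have the alias values as plain strings (how you export them is up to you). The function names is_valid_alias and suggest_alias are made up for illustration, and the pattern is only my reading of the warning text, not Exchange's own validator.

```python
import re

# One "ordinary" alias character, as listed in the warning text:
# A-Z, a-z, 0-9, the listed punctuation, and U+00A1 to U+00FF.
_ATOM = r"[A-Za-z0-9!#$%&'*+\-/=?^_`{|}~\u00A1-\u00FF]"

# Periods are allowed only between runs of ordinary characters,
# so an alias cannot start or end with a period or contain "..".
_ALIAS_RE = re.compile(rf"{_ATOM}+(?:\.{_ATOM}+)*")


def is_valid_alias(alias: str) -> bool:
    """Return True if the alias satisfies the rule quoted in the warning."""
    return _ALIAS_RE.fullmatch(alias) is not None


def suggest_alias(name: str) -> str:
    """Hypothetical helper: drop all whitespace, as I did by hand."""
    return "".join(name.split())


if __name__ == "__main__":
    for alias in ["Board Room", "BoardRoom", "meeting.room5", ".room"]:
        print(f"{alias!r}: valid={is_valid_alias(alias)}")
```

Anything that comes back invalid is a candidate for the same manual edit described above. Note that suggest_alias only removes whitespace, so check its result with is_valid_alias before trusting it.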



And as a bonus feature - TL;DR: Exchange Unified Messaging certificates should simply carry the server's Fully Qualified Domain Name, with no additional SAN names.

For my pain simply read on...

I have three new Exchange 2013 servers:
  • FCH-XS13-01.fch.local
  • FCH-XS13-02.fch.local
  • FCH-XS13-03.fch.local
 
(and before someone complains that I’ve given away the names of the Exchange server in my organisation I counter with – “you’d get that simply by receiving an email from anyone in the organisation and looking at the headers” <sigh>)
 
When I created the certificates I used the brilliant Digicert Tool and bashed the FQDN in:
 
fch-xs13-01.fch.local
 
threw in the NetBIOS name for kicks
 
fch-xs13-01
 
sent the request off to my local CA, downloaded the response, imported it and applied it in the ECP (I know, no PowerShell, naughty me).
 

After restarting the two UM services:
 
 
 
No Voicemail – calls just hung up. And not simply for my test mailbox on the Exchange 2013 environment – this was affecting everyone, as the Exchange 2013 servers proxy the 2010 traffic.
 
Looking in the event log we see Event ID 36884:
 
The certificate received from the remote server does not contain the expected name. It is therefore not possible to determine whether we are connecting to the correct server. The server name we were expecting is fch-xs13-03.fch.local. The SSL connection request has failed. The attached data contains the server certificate.
 
 
Event ID 1649:
 
The Microsoft Exchange Unified Messaging Call Router service failed to exchange the required certificates with an IP gateway to enable Transport Layer Security (TLS). Please check that the gateway is configured to operate in the correct security mode. If the gateway is required to operate in TLS mode, check that the certificates being used are correct. More information: "A TLS failure occurred because the remote server disconnected while TLS negotiation was in progress. The error code = 0x80131500 and the message = Unknown error (0x80131500).". Remote certificate:  (). Remote end point: [::1]:11346. Local end point: [::1]:5061.
 
 
Event ID 1113:
 
The Client Access server failed to exchange the required certificates with an IP gateway to enable Transport Layer Security (TLS). Please check that the gateway is configured to operate in the correct security mode. If the gateway is required to operate in TLS mode, check that the certificates being used are correct. More information: "A TLS failure occurred because the remote server disconnected while TLS negotiation was in progress. The error code = 0x80131500 and the message = Unknown error (0x80131500).". Remote certificate:  (). Remote end point: [::1]:11169. Local end point: [::1]:5063.
 
 
 
On the Lync 2010 servers we see the following:
 
Event ID 14366
 
Multiple invalid incoming certificates. In the past 1 minutes the server received 1 invalid incoming certificates. The last one was from host 172.28.1.63. Cause: This can happen if a remote server presents an invalid certificate due to an incorrect configuration or an attacker. Resolution: No action needed unless the number of failures is large. Contact the administrator of the host sending the invalid certificate and resolve this problem.
 
So, I'm off on a hunt. And it takes me hours. Finally, after comparing everything I can think of, I decide to recreate the certificates (for what feels like the zillionth time) and while I'm doing so this comes to me:
 
The server name we were expecting is fch-xs13-03.fch.local
 
And the certificate I was giving out had two names, the Common Name and the Subject Alternative Name... so sometimes the service was seeing the NetBIOS SAN name - BINGO. I guess I've been spoiled by SAN names on Lync and Exchange, so I just thought "throw it in, what's the worst that'll happen?" It turns out, a lot!
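
If you want to sanity-check a certificate before assigning it to UM, the rule boils down to "the only name on it is the server FQDN". Here is a minimal sketch in Python, assuming you have already read the Common Name and any SAN entries off the certificate into a list of strings. The function names are made up for illustration, and this encodes only the lesson from my own outage, not an official Microsoft requirement check.

```python
def extra_certificate_names(server_fqdn, cert_names):
    """Return the names on the certificate that are not the server FQDN.

    The comparison is case-insensitive and ignores surrounding whitespace.
    """
    expected = server_fqdn.strip().lower()
    return [name for name in cert_names if name.strip().lower() != expected]


def only_fqdn_on_certificate(server_fqdn, cert_names):
    """True when the FQDN is present and nothing else is on the certificate."""
    expected = server_fqdn.strip().lower()
    has_fqdn = any(name.strip().lower() == expected for name in cert_names)
    return has_fqdn and not extra_certificate_names(server_fqdn, cert_names)


if __name__ == "__main__":
    # The certificate I originally issued: FQDN plus the NetBIOS name.
    names = ["fch-xs13-03.fch.local", "fch-xs13-03"]
    print(extra_certificate_names("fch-xs13-03.fch.local", names))
    print(only_fqdn_on_certificate("fch-xs13-03.fch.local", names))
```

For my original certificate the first call flags the NetBIOS name as the odd one out, which is exactly the entry that caused the trouble.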
