Exchange Server 2010 High Availability Deep Dive

UCC402. Exchange Server 2010 High Availability Deep Dive. Scott Schnoll Principal Technical Writer Microsoft Corporation. Agenda. Exchange Server 2010 High Availability Deep Dive Database Availability Group Networks Active Manager Best Copy Selection


Presentation Transcript


  1. UCC402: Exchange Server 2010 High Availability Deep Dive • Scott Schnoll • Principal Technical Writer • Microsoft Corporation

  2. Agenda • Exchange Server 2010 High Availability Deep Dive • Database Availability Group Networks • Active Manager • Best Copy Selection • Datacenter Activation Coordination Mode

  3. Exchange Server 2010 High Availability Deep Dive: Database Availability Group Networks

  4. DAG Networks • A DAG network is a logical collection of one or more subnets • There are two types of DAG networks • MAPI Network - connects DAG members to network resources (Active Directory, other Exchange servers, DNS, etc.) • Registered in DNS / DNS configured • Uses default gateway • Client for Microsoft Networks/File and Print Sharing enabled • Replication Network - used for/by continuous replication (log shipping and seeding) • Not registered in DNS / DNS not configured • No default gateway • Client for Microsoft Networks/File and Print Sharing disabled

  5. DAG Networks • All DAGs must have: • Exactly one MAPI network • Zero or more Replication networks • Separate network(s) on separate subnet(s) • When there are multiple Replication networks, least recently used (LRU) selection determines which one is used • DAG networks are automatically created when a Mailbox server is added to the DAG • Based on the cluster’s enumeration of networks, which uses subnets • One cluster network is created per subnet

  6. DAG Networks • Maximum round-trip latency between all DAG members must be 500 ms or less • Regardless of network latency, validate that the network between all DAG members is capable of satisfying your data protection and availability goals • May need to increase the number of databases or decrease the number of mailboxes per database to achieve goals
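The 500 ms limit applies to every pair of DAG members, so a validation pass has to look at all pairs rather than a single link. The following is a minimal sketch of such a check, assuming the round-trip times have already been measured by some other tool; the function name, the member names, and the shape of the `rtt_ms` mapping are hypothetical and not part of Exchange.

```python
from itertools import combinations

MAX_RTT_MS = 500  # limit stated on the slide


def pairs_over_limit(members, rtt_ms):
    """Return member pairs whose measured round-trip latency exceeds the limit.

    rtt_ms maps frozenset({a, b}) -> measured round-trip time in milliseconds.
    A pair with no measurement is reported too, since it cannot be validated.
    """
    bad = []
    for a, b in combinations(sorted(members), 2):
        rtt = rtt_ms.get(frozenset((a, b)))
        if rtt is None or rtt > MAX_RTT_MS:
            bad.append((a, b))
    return bad
```

Passing the latency check is only a precondition; the slide's second point still applies, so throughput against your data protection goals needs to be validated separately.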

  7. DAG Networks

  8. DAG Networks

  9. DAG Networks • Collapse DAG networks and disable replication on MAPI network:
     Set-DatabaseAvailabilityGroupNetwork -Identity DAG2\DAGNetwork01 -Subnets 192.168.0.0,192.168.1.0 -ReplicationEnabled:$false
     Set-DatabaseAvailabilityGroupNetwork -Identity DAG2\DAGNetwork02 -Subnets 10.0.0.0,10.0.1.0
     Remove-DatabaseAvailabilityGroupNetwork -Identity DAG2\DAGNetwork03
     Remove-DatabaseAvailabilityGroupNetwork -Identity DAG2\DAGNetwork04

  11. DAG Networks • All DAGs extended to multiple datacenters should have hotfix from KB 2550886 installed • Automatic detection occurs when members added to DAG • If NICs are added after server is member of DAG, you must perform discovery • Set-DatabaseAvailabilityGroup <DAGName> -DiscoverNetworks • DAG network configuration persisted in cluster database • HKLM\Cluster\Exchange\DAG Network • DAGs include built-in encryption and compression • Encryption: Kerberos SSP EncryptMessage/DecryptMessage APIs • Compression: Microsoft XPRESS, based on LZ77 algorithm

  12. DAG Networks • When using a single NIC • It is both the MAPI and the Replication network • ReplicationEnabled is $True • When using multiple NICs • One NIC is the MAPI network • ReplicationEnabled is $False • Other NIC(s) are Replication network(s) • Replication uses LRU (least recently used) to pick the Replication network to use • If Replication networks are unavailable, MAPI network is used
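The selection rule on this slide can be sketched in a few lines. This is an illustrative model only, not the Replication service's actual implementation: the `DagNetwork` class, the `last_used` counter, and the `tick` argument are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class DagNetwork:
    name: str
    replication_enabled: bool
    available: bool = True
    last_used: int = 0  # hypothetical logical clock value; 0 means never used


def pick_network(replication_nets, mapi_net, tick):
    """Pick the least recently used available Replication network.

    Falls back to the MAPI network when no Replication network is usable,
    as described on the slide.
    """
    candidates = [n for n in replication_nets
                  if n.available and n.replication_enabled]
    if not candidates:
        return mapi_net
    chosen = min(candidates, key=lambda n: n.last_used)
    chosen.last_used = tick  # record the use so another network is picked next
    return chosen
```

With two healthy Replication networks, successive calls with an increasing `tick` alternate between them; once both are marked unavailable, the MAPI network is returned.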

  13. DAG Networks • Use netsh, router ACLs or other means to block cross-network traffic [Diagram: MAPI interfaces (M) on Subnet 1 and Subnet 3, Replication interfaces (R) on Subnet 2 and Subnet 4; traffic paths labeled Allowed and Blocked]

  14. DAG Networks • If using iSCSI storage, configure DAG and cluster to ignore iSCSI networks • Set-DatabaseAvailabilityGroupNetwork -Identity <DAGNetworkName> -ReplicationEnabled:$false -IgnoreNetwork:$true

  15. DAG Networks • When a DAG spans multiple subnets you need an IP address on the MAPI network for each subnet • Use DHCP in site resilience configurations to assign IP addresses to Replication network • Enables delivery of the typically required static routes • If using static IP addresses, use netsh to configure static routes • Configure a DNS TTL on namespace records consistent with your SLA • For example, use a TTL of 5 minutes for a 60 minute RTO SLA
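The TTL example is simple arithmetic: a client that cached the old record just before the DNS change can keep using it for up to one full TTL. A rough sketch of the budget calculation, under that assumption (the function name is hypothetical):

```python
def recovery_budget_minutes(rto_minutes, dns_ttl_minutes):
    """Minutes left for all other recovery steps once DNS caching is accounted for.

    Assumes the worst case, where a client cached the old record immediately
    before the DNS record was updated.
    """
    if dns_ttl_minutes > rto_minutes:
        raise ValueError("DNS TTL alone exceeds the RTO")
    return rto_minutes - dns_ttl_minutes


# Slide example: 5 minute TTL against a 60 minute RTO
budget = recovery_budget_minutes(60, 5)
```

With the slide's numbers, 55 minutes remain for detection, the decision to activate, and the datacenter switchover itself.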

  16. Exchange Server 2010 High Availability Deep Dive: Active Manager

  17. Active Manager • What are the three Active Manager roles? • Standalone • PAM (Primary Active Manager) • SAM (Standby Active Manager) • Transition of role state logged into Microsoft-Exchange-HighAvailability/Operational event log (Crimson Channel)

  18. Active Manager Functionality • Mount and Dismount Databases • Provide Database Availability Information • Provide Interface for Administrative Tasks • Monitor for and React to Failures • Maintains Database and Server State Information

  19. Mount / Dismount Database Copy • Mount Database • An administrator action invoked through a task • The last part of a move operation • Dismount Database • An administrator action invoked through a task • The first part of a move operation

  20. Auto Dismount – DAG Member • Occurs when a DAG loses quorum • All DAG members are running (but may not be participating in the cluster) • Databases dismounted as quickly as possible to avoid split-brain • Information Store service is terminated

  21. Active Manager – Move Database • Move Database • An administrator action invoked by a task • Automatic operation initiated by the PAM (failover) • Begins with a Dismount operation and ends with a Mount operation

  22. Exchange Server 2010 High Availability Deep Dive: Best Copy Selection

  23. Best Copy Selection • Active Manager selects the “best” copy to become the new active copy when the existing active copy fails, or when an administrator performs a target-less switchover • BCS is the process of finding the best copy of an individual database to activate, given a list of potential copies for activation and their status • During BCS, any servers that are unreachable or activation blocked are ignored

  24. Best Copy Selection – RTM • Sorts copies by copy queue length to minimize data loss, using activation preference as a secondary sorting key if necessary • Selects from the sorted list based on which set of criteria each copy meets • Attempt Copy Last Logs (ACLL) runs and attempts to copy missing log files from previous active copy

  25. Best Copy Selection – SP1 • Sorts copies by activation preference when auto database mount dial is set to Lossless • Otherwise, sorts copies based on copy queue length, with activation preference used as a secondary sorting key if necessary • Selects from the sorted list based on which set of criteria each copy meets • Attempt Copy Last Logs (ACLL) runs and attempts to copy missing log files from previous active copy
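The difference between the RTM and SP1 sort order comes down to which sort key is used. The sketch below models only the ordering step described on the two slides above; the `Copy` class, the `sort_copies` function, and its parameters are hypothetical names, not Active Manager internals.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    server: str
    copy_queue_length: int
    activation_preference: int


def sort_copies(copies, sp1, mount_dial):
    """Order candidate copies the way the RTM and SP1 slides describe."""
    if sp1 and mount_dial == "Lossless":
        # SP1 with a Lossless dial: activation preference only
        return sorted(copies, key=lambda c: c.activation_preference)
    # RTM, or SP1 with any other dial setting:
    # copy queue length first, activation preference as the tie-breaker
    return sorted(copies,
                  key=lambda c: (c.copy_queue_length, c.activation_preference))
```

The same three copies can therefore come out in a different order depending on the release and the dial setting, which is what the worked examples later in the deck illustrate.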

  26. Best Copy Selection • Is database mountable? • Is copy queue length <= AutoDatabaseMountDial? • If Yes, database is marked as current active and mount request is issued • If not, next best database tried (if one is available)
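The comparison against AutoDatabaseMountDial can be written as a lookup followed by a single test. The thresholds below are the documented log-file limits for the three dial settings (Lossless 0, GoodAvailability 6, BestAvailability 12); the function and dictionary names are illustrative only.

```python
# Maximum number of missing log files tolerated by each dial setting
MOUNT_DIAL_LIMIT = {
    "Lossless": 0,
    "GoodAvailability": 6,
    "BestAvailability": 12,
}


def can_auto_mount(copy_queue_length, mount_dial):
    """True when the copy queue length is within the dial setting."""
    return copy_queue_length <= MOUNT_DIAL_LIMIT[mount_dial]
```

A copy that is four logs behind, for instance, mounts automatically under GoodAvailability or BestAvailability but not under Lossless.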

  27. Best Copy Selection

  28. Best Copy Selection – RTM • Four copies of DB1 • DB1 currently active on Server1 [Diagram: Server1, Server2, Server3 and Server4 each host a copy of DB1; an X marks the failure]

  29. Best Copy Selection – RTM • Sort list of available copies by Copy Queue Length (using Activation Preference as secondary sort key if necessary): • Server3\DB1 • Server2\DB1 • Server4\DB1

  30. Best Copy Selection – RTM • Only two copies meet first set of criteria for activation (CQL < 10; RQL < 50; CI = Healthy): • Server3\DB1 (lowest copy queue length, tried first) • Server2\DB1 • Server4\DB1
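The first criteria set is a filter over the sorted list, where CQL is copy queue length, RQL is replay queue length, and CI is the content index state. Here is a small sketch of that filter; the `CopyStatus` class and all queue values are made up for illustration, and the later, more relaxed criteria sets are not modeled because the slide does not list them.

```python
from dataclasses import dataclass


@dataclass
class CopyStatus:
    name: str
    copy_queue_length: int
    replay_queue_length: int
    content_index_state: str


def meets_first_criteria(c):
    """First criteria set from the slide: CQL < 10, RQL < 50, CI = Healthy."""
    return (c.copy_queue_length < 10
            and c.replay_queue_length < 50
            and c.content_index_state == "Healthy")


# Hypothetical status values, already sorted by copy queue length
sorted_copies = [
    CopyStatus("Server3\\DB1", 2, 10, "Healthy"),
    CopyStatus("Server2\\DB1", 4, 20, "Healthy"),
    CopyStatus("Server4\\DB1", 25, 5, "Healthy"),
]
eligible = [c.name for c in sorted_copies if meets_first_criteria(c)]
```

Because the list is filtered in sorted order, the first eligible entry is the copy that is tried first.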

  31. Best Copy Selection – SP1 • Four copies of DB1 • DB1 currently active on Server1 • Auto database mount dial set to Lossless [Diagram: Server1, Server2, Server3 and Server4 each host a copy of DB1; an X marks the failure]

  32. Best Copy Selection – SP1 • Sort list of available copies by Activation Preference: • Server2\DB1 • Server3\DB1 • Server4\DB1

  33. Best Copy Selection – SP1 • Sort list of available copies by Activation Preference: • Server2\DB1 (lowest preference value, tried first) • Server3\DB1 • Server4\DB1

  34. Best Copy Selection • After Active Manager determines the best copy to activate • The Replication service on the target server tries to copy missing log files from source (ACLL) • If successful, database will mount with zero data loss • If unsuccessful (lossy failure), database will mount based on the AutoDatabaseMountDial setting • If data loss is outside of dial setting, next copy will be tried
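The sequence on this slide is a loop over the sorted candidates: try ACLL, mount losslessly if it succeeds, otherwise mount only if the loss is within the dial setting, else move on. The sketch below follows those steps; `Candidate`, `activate_best_copy`, and the `attempt_copy_last_logs` callback are hypothetical stand-ins for work the Replication service does internally.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Candidate:
    server: str
    missing_logs: int  # log files not yet copied from the previous active copy


def activate_best_copy(
    candidates: List[Candidate],
    dial_limit: int,
    attempt_copy_last_logs: Callable[[Candidate], bool],
) -> Optional[Tuple[str, int]]:
    """Walk the already-sorted candidates; return (server, logs_lost) or None."""
    for c in candidates:
        if attempt_copy_last_logs(c):
            return c.server, 0  # ACLL succeeded: mount with zero data loss
        if c.missing_logs <= dial_limit:
            return c.server, c.missing_logs  # lossy, but within the dial setting
        # Loss would exceed the dial setting, so the next copy is tried
    return None
```

A return value of `None` corresponds to the case where no copy can be mounted automatically and the database stays dismounted until an administrator intervenes.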

  35. Best Copy Selection • If an activated database copy is mounted • It will generate new log files (using the same log generation sequence) • Transport Dumpster requests will be initiated for the mounted database to recover lost messages • When original server or database recovers, it will run through divergence detection and either perform an incremental resync or require a full reseed

  36. Exchange Server 2010 High Availability Deep Dive: Datacenter Activation Coordination Mode

  37. Datacenter Activation Coordination Mode • DAC mode is a property of a DAG • Acts as an application-level form of quorum • Controls whether or not a Mailbox server attempts to automatically mount its active databases on startup • Designed to prevent multiple copies of same database mounting on different members due to loss of network (split brain) • Also enables use of Site Resilience tasks • Stop-DatabaseAvailabilityGroup • Restore-DatabaseAvailabilityGroup • Start-DatabaseAvailabilityGroup

  38. Datacenter Activation Coordination Mode • RTM: DAC mode is for DAGs with three or more members that are extended to two Active Directory sites • Don’t enable it for two-member DAGs where each member is in a different AD site, or for DAGs where all members are in the same AD site • SP1: DAC mode can be enabled for all DAGs • If using Third Party Replication (TPR) mode, check with your vendor for guidance on DAC mode
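The RTM and SP1 guidance reduces to a small decision rule. The sketch below encodes it as stated on the slide, with one assumption called out: "extended to two Active Directory sites" is treated as "two or more". The function name and the release strings are hypothetical, and Third Party Replication mode is deliberately not modeled.

```python
def dac_mode_supported(release, member_count, ad_site_count):
    """Apply the slide's guidance on when DAC mode may be enabled."""
    if release == "SP1":
        return True  # SP1: DAC mode can be enabled for all DAGs
    if release == "RTM":
        # Assumption: "two Active Directory sites" is read as two or more
        return member_count >= 3 and ad_site_count >= 2
    raise ValueError(f"unknown release: {release}")
```

A two-member DAG split across two sites, for example, is excluded under RTM but allowed under SP1.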

  39. Datacenter Activation Coordination Mode • Uses Datacenter Activation Coordination Protocol (DACP) • A bit in memory (in MSExchangeRepl.exe) set to either: • 0 = can’t auto-mount at startup • 1 = can auto-mount at startup

  40. Datacenter Activation Coordination Mode • Active Manager startup sequence • DACP is set to 0 • DAG member communicates with other DAG members it can reach to determine the current value for their DACP bits • If the starting DAG member can communicate with all other members on the StartedMailboxServers list, DACP bit switches to 1 • If the starting DAG member can communicate with another member, and that other member’s DACP bit is set to 1, starting DAG member DACP bit switches to 1 • If the starting DAG member can communicate with another member, and that other member’s DACP bit is set to 0, starting DAG member DACP bit remains at 0
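The startup sequence above is a short decision procedure, sketched here as a pure function. This is a model of the rules as listed on the slide, not the code in MSExchangeRepl.exe; the function name, the argument shapes, and the server names in any example are hypothetical.

```python
def dacp_bit_after_startup(reachable_bits, other_started_servers):
    """Compute a starting member's DACP bit from what it can reach.

    reachable_bits: dict mapping each reachable member to its DACP bit.
    other_started_servers: the other members on the StartedMailboxServers list.
    Note: with no other started members the first check is vacuously true;
    the slide does not cover that case.
    """
    bit = 0  # the bit always starts at 0
    if set(other_started_servers) <= set(reachable_bits):
        bit = 1  # every other started member is reachable
    elif any(b == 1 for b in reachable_bits.values()):
        bit = 1  # at least one reachable member already has its bit set
    return bit
```

The second rule is what lets a member that restarts in a healthy datacenter mount its databases even when the other datacenter is unreachable, while a datacenter that restarts in isolation stays at 0.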

  41. Datacenter Activation Coordination Mode

  42. Datacenter Activation Coordination Mode

  43. Datacenter Activation Coordination Mode [Diagram: DAG members labeled with DACP bit values 0, 0, 1, 1]

  44. Related Content • UCC305 - Exchange Server 2010 High Availability Design

  45. Resources • Exchange Team Blog • http://aka.ms/EHLO • Exchange 2010 Documentation Library • http://aka.ms/Ex2010Docs

  46. Feedback Your feedback is very important! Please complete an evaluation form! Thank you!

  47. Questions? • UCC402 • Scott Schnoll • Principal Technical Writer • scott.schnoll@microsoft.com • http://blogs.technet.com/scottschnoll • Twitter: @schnoll • You can ask me questions at the “Ask the Expert” zone: • November 10, 2011 12:30 – 13:30
