
Cannot Failover CCR Cluster Because Copy Status is Initializing

You may find that your Exchange 2007 CCR cluster will not fail over to the other node because continuous replication is not in a healthy state.  Here’s what you see from the Exchange Management Shell (EMS):
[PS] C:\>Move-ClusteredMailboxServer exchange2 -TargetMachine CCR2 -MoveComment:"Failover Test"

Confirm
Are you sure you want to perform this action?
Moving clustered mailbox server "exchange2" to target node "CCR2" with move comment "Failover Test".
[Y] Yes [A] Yes to All [N] No [L] No to All [S] Suspend [?] Help (default is "Y"): y
Move-ClusteredMailboxServer : Continuous replication is in a failed, seeding, or suspended state on 'First Storage Group'. Move-ClusteredMailboxServer cannot be performed if one or more of the server's storage group copies are in failed, seeding or suspended states.
At line:1 char:28
+ Move-ClusteredMailboxServer <<<< exchange2 -TargetMachine CCR2 -MoveComment:"Failover Test"
[PS] C:\>
If you check the storage group in the Exchange Management Console (EMC), you will see that the Copy Status is Initializing.


This happens because the CCR copy status stays in an initializing state until at least one transaction log has been replicated to the target node. 

As you probably know, the transaction logs in Exchange 2007 are 1MB in size, as opposed to 5MB in previous versions of Exchange.  To get the copy status to Healthy, simply send at least 1MB of email to or from a user on that storage group, so that a full log file is generated and replicated to the target node.  Once the copy status shows Healthy, you can fail over the CCR cluster to the other node.
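If you would rather confirm the status from the shell than from the EMC, something like the following should work. This is a minimal sketch that reuses the server and storage group names from the example above. The property names in the Format-List call are the ones I would expect in the cmdlet's output, so adjust them if your output differs.

```powershell
# Check replication health for the storage group named in the error message.
# "exchange2" is the clustered mailbox server from the example above.
Get-StorageGroupCopyStatus -Identity "exchange2\First Storage Group" |
    Format-List Name, SummaryCopyStatus, CopyQueueLength, ReplayQueueLength

# Once SummaryCopyStatus reports Healthy, retry the move.
Move-ClusteredMailboxServer -Identity exchange2 -TargetMachine CCR2 -MoveComment "Failover Test"
```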

Configuring Domain Controller Usage in Exchange 2007 CCR Geo-Clusters

Exchange 2007 Cluster Continuous Replication (CCR) can be configured to span different geographic sites.  These are sometimes called “stretch” or “geographically dispersed” clusters.  With Windows Server 2003, special network configuration is needed to stretch a single subnet across the two geographically dispersed locations.  This is much easier with Windows Server 2008, since its clustering service can span different subnets, one in each location.

Even so, Exchange 2007 requires both nodes of the CCR cluster to reside in the same Active Directory site.  Best practice says that there should be redundant Global Catalog servers in each location, in case of an outage in either location.  The trouble is that if each node of the CCR cluster and all the Global Catalogs reside in the same AD site, Exchange servers may (probably will) bind to a GC that is not in the same geographic location as the server, which can lead to problems.

Consider the following example:


A CCR geo-cluster exists in an Active Directory site called E2K7.  NODE1 is in San Francisco and NODE2 is in Las Vegas.  There are two Global Catalog servers in each location: SFDC1 and SFDC2 in San Francisco, and LVDC1 and LVDC2 in Las Vegas.  Because all six servers reside in the same AD site, Exchange may bind to any one of the four GCs.  In this example, NODE1 is active and NODE2 happens to be using SFDC1 for Global Catalog and Configuration Domain Controller services.  During this time, NODE2 is reaching across the WAN for GC services, which is not very efficient.

If there is a location-specific outage in San Francisco (earthquake, power interruption, or some yahoo takes out a fiber trunk with a backhoe), the CCR cluster will fail over to Las Vegas, but the GC that NODE2 is using (SFDC1) is unavailable too.  Exchange services will not fail over correctly and an outage occurs, which is exactly what the CCR cluster is supposed to prevent.

The way to design around this problem is to configure the CCR node in each location to exclude the GCs in the remote location.  This is done using the following command from the Exchange Management Shell, as shown for NODE2:

Set-ExchangeServer -id NODE2 -StaticExcludedDomainControllers:sfdc1.domain.com,sfdc2.domain.com

Note that the Domain Controllers specified must be in FQDN form, separated by commas, with no spaces.   You would do the same for NODE1, specifying LVDC1 and LVDC2.
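For completeness, the matching command for NODE1 would look something like this. It is simply the NODE2 command mirrored, assuming the Las Vegas GCs follow the same placeholder naming (lvdc1.domain.com and lvdc2.domain.com) used in this example.

```powershell
# Mirror of the NODE2 command: NODE1 (San Francisco) excludes the Las Vegas GCs.
# domain.com is the placeholder domain used in this example; substitute your own FQDNs.
Set-ExchangeServer -Identity NODE1 -StaticExcludedDomainControllers:lvdc1.domain.com,lvdc2.domain.com
```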

The result is that each node will always use the local GCs for that node.  If both of those local GCs are unavailable for some reason, Exchange will temporarily bind to any GC in a remote site in the domain.  This binding will occur automatically within 15 minutes.  When the local GCs become available again, Exchange will re-bind to them within 15 minutes.  Perfect!



While researching this article, I came across something unexpected.  I set the StaticExcludedDomainControllers value using the Set-ExchangeServer cmdlet and it worked as expected.  But when I tried to view the configuration using the Get-ExchangeServer cmdlet, the value appeared empty.


It shows null because the StaticDomainControllers, StaticGlobalCatalogs, StaticConfigDomainController, and StaticExcludedDomainControllers values are stored in the Exchange server’s registry, not in Active Directory.  According to Microsoft, this is “by design” to prevent performance issues caused by the Remote Registry call needed to retrieve the values.  I’m not aware of any other cmdlet that has this behavior.

In any event, to view the configuration of these variables you must use the -Status switch, as shown:
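A command along these lines should display them. Treat it as a sketch: NODE2 is the server from the example above, and the Current* property names are the ones I would expect in the -Status output, so run Format-List * first if any of them come back blank.

```powershell
# Per the article, the Static* values live in the server's registry,
# so -Status is needed for Get-ExchangeServer to retrieve them.
Get-ExchangeServer -Identity NODE2 -Status |
    Format-List Name, StaticDomainControllers, StaticGlobalCatalogs,
        StaticConfigDomainController, StaticExcludedDomainControllers,
        CurrentDomainControllers, CurrentGlobalCatalogs, CurrentConfigDomainController
```

Listing the Current* values next to the Static* ones lets you check both what was configured and which domain controllers the node is actually bound to.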