AD Site Design and Auto Site Link Bridging, or Bridge All Site Links (BASL)

By Ace Fekay, MCT, MVP, MCSE 2012/Cloud, MCITP EA, MCTS Windows 2008/R2, Exchange 2007, 2010 & 2013, Exchange 2013, Exchange 2010 Enterprise Administrator, MCSE 2003/2000, MCSA Messaging 2003
  Microsoft Certified Trainer
  Microsoft MVP: Directory Services
  Active Directory, Exchange and Windows Infrastructure Engineer

Updated 12/12/2013

Preface

Ace here again with something I really would like to discuss, since this topic comes up from time to time.

To properly designed an AD multi-site infrastructure, there are a few things that need to be taken into account. I won’t bore you with all the background techno babble, rather I’m going to discuss a no-nonsense, get down to business on why you need to either keep Auto Site Link Bridging enabled, or why you need to disable it, both of which depends on your physical routed topology design.

AD Sites

First, a basic understanding of Active Directory Sites is important to understand before I go further.

Some of the biggest questions I hear about AD Sites are:

  • What are AD Sites?
  • What are AD Sites for?
  • Why can’t I create an AD Site without a domain controller in the Site?

These are all valid questions. A little research will usually result in an answer, but you may have to dig through piles of technical details to get to it. Let’s address each one:

What are AD Sites?

An AD Site defines a highly-connected, physical network locations in Active Directory. We define them by IP subnet or subnets. And yes, you can have multiple subnets that are highly-connected by routers within a location. In some cases, for example, if you have a very high-speed backbone, such as an OC-1 (51.84Mbps or higher), between locations, you can put all those subnets in one AD Site. However, in many cases, we probably don’t want to do that. Hang in there, I’ll be getting to that in a few minutes.

What are AD Sites for?

AD sites are basically used for two things:

  1. To facilitate service localization. In simple English, this means to control logon and authentication traffic to DCs in a specific location, or Site. After all, we don’t want a client in NYC to pick a DC in Seattle, Japan, or somewhere else, to send its logon or authentication request (such as when accessing a folder), do we? Nope. The client side DC Locator process will find a DC in its own Site by using the client’s IP address.
  2. To manage DC replication traffic. In simple English, this means to control DC replication traffic across WAN links between the bridgeheads (the DCs in a Site that communicate with DCs in other Sites). By default, replication between Sites are compressed down to 15% of total traffic. And with Sites, we can control frequency of replication, and when we’re allowing it to happen.’

Why can’t I create an AD Site without a DC in the Site?

Good question. If you look at what AD Sites are for, then it should be pretty obvious that you need a DC in it. After all, if there is no DC in it, and a client picks a Site based on its IP subnet, then looks for a DC, it won’t find any, will it? Nope, so it may wind up randomly picking a DC in another location, such as that DC in Seattle or Japan.

DC Locator Process

I don’t want to dwell on this, but I will briefly mention it because this is part of the reason why we want to create AD Sites anyway.

There is a process that a client uses to pick a DC. Here’s a quick view of how a client picks a DC. I should add a #9 to the list in a scenario when no DC exists in the Site, then it uses Automatic Site Coverage, and this ONLY if you created an IP Site link to another Site that you want the DCs to cover that site.

If you didn’t create an IP Site link for a Site that has no DCs, then it will pretty much become a random process, sort of, by using other factors, such as subnet netmask ordering and Round Robin. If you want to read up on this subject, here are two good TechNet Forum discussions on it:

Briefly, here are the DC Locator process steps, and these steps were directly quoted from How Domain Controllers Are Located in Windows XP

  1. Client does a DNS search for DC’s in _LDAP._TCP.dc._msdcs.domainname
  2. DNS server returns list of DC’s.
  3. Client sends an LDAP ping to a DC asking for the site it is in based on the clients IP address (IP address ONLY! The client’s subnet is NOT known to the DC).
  4. DC returns…
    1. The client’s site or the site that’s associated with the subnet that most matches the client’s IP (determined by comparing just the client’s IP to the subnet-to-site table Netlogon builds at startup).
    2. The site that the current domain controller is in.
    3. A flag (DSClosestFlag=0 or 1) that indicates if the current DC is in the site closest to the client.
  5. The client decides whether to use the current DC or to look for a closer option.
    1. Client uses the current DC if it’s in the client’s site or in the site closest to the client as indicated by DSClosestFlag reported by the DC.
    2. If DSClosestFlag indicates the current DC is not the closest, the client does a site specific DNS query to: _LDAP._TCP.sitename._sites.domainname (_LDAP or whatever service you happen to be looking for) and uses a returned domain controller.

Brief overview:

For a full-sized image, click on the images.

Let me point out again, that if there are no DCs in a Site, then Automatic Site Coverage will take over.

To me, it’s a process to “find” a DC that will authenticate a user in a Site without a DC. However, my take on it is I would rather associate the location’s subnet to a current Site so as to not make the client go through this process. Besides, there may be scenarios that not having a DC in a Site can directly affect directory enabled applications and services such as DFS site referrals, SCCM or Exchange with it’s high dependency on GCs and DSAccess.

Here’s the DC Locator process, directly quoted from the Technet article, “How DNS Support for Active Directory Works:”

  1. Build a list of target sites — sites that have no domain controllers for this domain (the domain of the current domain controller).
  2. Build a list of candidate sites — sites that have domain controllers for this domain.
  3. For every target site, follow these steps:
    1. Build a list of candidate sites of which this domain is a member. (If none, do nothing.)
    2. Of these, build a list of sites that have the lowest site link cost to the target site. (If none, do nothing.)
    3. If more than one, break ties (reduce this list to one candidate site) by choosing the site with the largest number of domain controllers.
    4. If more than one, break ties by choosing the site that is first alphabetically.
    5. Register target-site-specific SRV records for the domain controllers for this domain in the selected site.

If there are no DCs in a Site, you can use PowerShell to figure out which DC in which Site will be picked. If you like, you can further read up on the commands used to figure this out in Sean Ivey’s blog:

Sites Sites Everywhere…, By Sean Ivey, Microsoft DS PFE
http://blogs.technet.com/b/askds/archive/2011/04/29/sites-sites-everywhere.aspx

So wouldn’t you want your clients to pick a DC in its own Site?

Moving forward, do we really want a client to pick a DC in some other site or go through the Automatic Site Coverage process? Would you want that? I ‘m sure you already know the answer to that.

Therefore, if you have a location that have no DCs, then simply create an IP subnet object, and associate the subnet object to an existing AD Site that you want those users to use. In this case, you may base your own pick on a site linked by the fastest WAN link, or the only WAN link.

And if there are any subnets that are not associated with an AD site, then any DC is game to authenticate a client, as seen in the process above. To check for clients which subnets are not configured to AD Sites & Services, among other things, enable Netlogon logging, and check the system32\config\netlogon.log file. Here’s more info:

Enabling debug logging for the Net Logon service, Last Review: May 3, 2011 – Revision: 11.0, Applies to: all operating systems.
 http://support.microsoft.com/kb/109626

Auto Site Link Bridging

This now brings us to bridging, what it is, etc.

Within an AD Site, the KCC (Knowledge Consistency Checker) will automatically assume that all DCs can directly reach each other, and create Intrasite replication partnerships between the DCs in the Site. The one point that I want to be clear about that no matter how many DCs are in a Site, and there can be hundreds of DCs in a Site, the KCC will make sure that the  partnerships created are done so that all DCs in a Site will have an updated replication set for any changes by any of the DCs in the site, within 15 minutes. If you add a new DC to the Site, the KCC jumps in and evaluates the new guy and adds it so it gets updated data from other DCs under 15 minutes. How does it do that? It follows a set algorithm, but that is beyond this discussion.

When there are multiple Sites, and more specifically three or more Sites, and keeping in mind that by default AD automatically assumes that all the Sites have direct physically connectivity and communications between each other. This means you can literally ping a DC from in any Site to any other Site.

Here’s where the ISTG (Intersite Topology Generator) kicks in. The ISTG is a component of the KCC. It evaluates the overall topology, and builds connection objects between servers in each of the sites to enable Intersite replication— DC replication between sites.

Here’s a fully routed infrastructure, For the full-sized image, click here.

 

If remote sites cannot directly communicate with each other and only to the hub site

However, if your physical network topology is designed where each site does not have direct communications with each other, and you leave all the default “Auto Site Link Bridge” setting enabled as is, then lots of things will go wrong, such as replication problems, duplicate AD integrated zones, and more … keep reading. But I won’t address duplicate zones. You can click the link in the previous sentence for more on that.

If the network topology was a hub and spoke and BASL wasn’t disabled and individual sites links between the hub and each site weren’t created until recently, then there may be replication problems. This is a whole different subject. What I can say, besides checking to see if there are duplicate zones, as I mentioned in the previous paragraph, I would also run the Active Directory Replication Status Tool to check replication status. It will provide a report, and anything amiss will show up in Red. Pretty cool tool. Download it here:

Download The Active Directory Replication Status Tool (ADREPLSTATUS):
   http://www.microsoft.com/en-us/download/details.aspx?id=30005
     Note: This tool requires .Net Framework 4. If it’s not installed, download and install it:
       Microsoft .NET Framework 4 (Web Installer)
       http://www.microsoft.com/en-us/download/details.aspx?id=17851

Remember, by default, the KCC assumes all sites can directly communicate, therefore it will create partnerships between Bridgeheads in all sites. And any DC in a site can automatically become a bridgehead.

So if corporate headquarters is in NYC, and you have three remote locations, Miami, Chicago and Seattle, and direct communications does not exist, meaning that each remote location can only communicate with headquarters, and IP routing has not been configured between the remote locations, and the KCC creates a connection object (partnership) between a DC in Miami and a DC in Seattle, what will happen?

Since they can’t directly communicate, then replication fails. And if the Seattle partnership is the only connection object Miami may have, but Seattle happens to have one to NYC, and keeping in mind, replication is a PULL request, then Seattle will receive replication from NYC, but Miami can’t pull anything from Seattle, because there is no direct or indirect communications. So Miami winds up being in a secluded island.

In Miami’s DC’s view, it thinks no one wants to talk to it, so it will complain (you will see multiple event log errors) that others having replicated with it. And according to the DCs in the other sites, they will all think the same thing about Miami.

So who’s right? Of course, they all are. If the lack of replication goes beyond the AD Tombstone, then Miami would need to be demoted. Then again, you can’t even do that because it doesn’t have direct communications with its partner. Then if it does pick a DC in headquarters to demote, you will see an error stating that the headquarters DC already thinks the DC no longer exists. In the case of trying to demote it, or even forcedemoting it beyond the Tombstone, then your only option is to unplug it, run a metadata cleanup and re-promote it. But wait, then the same thing will occur if you don’t disable BASL.

So is it right that we do the same thing over and over and expect different results? Nope. Let’s configure AD to make sure it will not happen again, by disabling BASL.

Here’s a non-fully routed infrastructure. For the full-sized image, click here.

Disable BASL

Simply put, what we need to do is disable BASL (Bridge All Site Links) in a non-fully routed infrastructure to tell the KCC to only partner DCs across a specific site link.

Yes, that means you also have to create specific IP site links between headquarters in NYC to each site, as the image above shows.

And even if you have 20 sites all fully routed EXCEPT for one of them, then the same thing goes. You must disable it all because of that one site, otherwise the KCC will partner with a DC that it may not have direct communications with.

How to disable BASL. For the full-sized image, click here.

Summary

If you want to make sure your AD infrastructure is properly purring along and doing its job, then by all means let’s design it properly, make the necessary modifications, and other changes, to get it going in the right direction.

Oh, and you can’t forget to bone up on your DNS knowledge and how it supports AD. All Sites get registered in DNS by the netlogon service. Read more:

How DNS Support for Active Directory Works
http://technet.microsoft.com/en-us/library/cc759550(WS.10).aspx

And to understand the DNS SRV records registered by a DC’s Netlogon service, read Sean Dubey’s blog, with DNS SRV records examples, once again, I refer you to Sean Ivey’s blog:

Sites Sites Everywhere…, By Sean Ivey, Microsoft DS PFE
http://blogs.technet.com/b/askds/archive/2011/04/29/sites-sites-everywhere.aspx

References

Designing the Site Topology
http://technet.microsoft.com/en-us/library/cc787284(WS.10).aspx

Detailed branch office deployment guide (downloadable doc)
http://www.microsoft.com/downloads/details.aspx?FamilyId=9353A4F6-A8A8-40BB-9FA7-3A95C9540112&displaylang=en

Best Practice Active Directory Design for Managing Windows Networks
http://technet.microsoft.com/en-us/library/bb727085.aspx

You may want to take a look at the design IPD guide (Infrastructure Planning and Design) for AD – Download Details: IPD guide for Active Directory Domain Services – version 1.0
http://www.microsoft.com/en-us/download/details.aspx?id=732

Download the complete Infrastructure Planning and Design (IPD) Guide Series v2.0 including links for AD IPD, SCCM IPD, and more.
http://technet.microsoft.com/en-us/library/cc196387.aspx

Comments & Corrections are welcomed.

Ace Fekay

Nslookup suffixing behavior

By Ace Fekay, MCT, MVP, MCSE 2012/Cloud, MCITP EA, MCTS Windows 2008/R2, Exchange 2007 & 2010, Exchange 2010 Enterprise Administrator, MCSE 2003/2000, MCSA Messaging 2003
Microsoft Certified Trainer
Microsoft MVP: Directory Services
Active Directory, Exchange and Windows Infrastructure Engineer

Original compilation: 2/17/2013

 

Prologue

Many IT folks who are not familiar with nslookup’s suffixing behavior in some cases may believe it’s a DNS issue. Nope, it’s not a DNS issue, rather a combination of nslookup’s suffixing behavior, which DNS server nslookup is using, Forwarders if configured, and the operating system’s Search Suffixes.

 

NSLOOKUP and requiring a trailing dot

Keep in mind, nslookup’s resolver service has its own built-in resolver service and is totally *independent* of the operating system’s client side resolver algorithm, (although it will use the machine’s suffixes to devolve names), and will behave differently than if you were to say ping a host by single name.

When using nslookup, you need to fully qualify the name (querying an FQDN), instead of a single name, then you must supply a trailing dot with the query.

If not, it will append the current context, that is the suffix(es) configured on the machine, which it will suffix each one in the order they are configured.

If you want to use a better tool for nameserver queries, I suggest to use DIG. DIG is downloadable as part of ISC’s BIND DNS server. You can  download BIND for free from https://www.isc.org/wordpress/. Expand the files into a folder, and the tools will be available for use. No, this doesn’t mean you have to install the BIND DNS server service, I’m just suggesting to download and use the utilities in the folder. Matter of fact, BIND also has its own version of nslookup that some say works better than Microsoft’s nslookup, but I haven’t found that true. I’ve found DIG very beneficial when trying to troubleshoot DNS issues.

Additional nslookup information

Here are some links explaining nslookup’s behavior. The first one is a doc that explains more of this in greater detail. This doc actually was compiled from KB200525, the second link, which is also mentioned in the Microsoft Official Curriculum Course# 688, “Using TCP/IP,” Courseware.

Using NSlookup (File Format: Microsoft Word) – “Nslookup will always devolve the name from the current context. If you fail to fully qualify a name query (that is, use trailing dot), the query will be …; “
http://mcse.villanova.edu/Courses/688/documents/Using%20NSlookup.doc

Using NSlookup.exe
http://support.microsoft.com/?id=200525

Using NSlookup – (Microsoft Word Doc)
”Nslookup will always devolve the name from the current context. If you fail to fully qualify a name query (that is, use trailing dot), the query will be … “
http://mcse.villanova.edu/Courses/688/documents/Using%20NSlookup.doc

Nslookup, Sep 28, 2007 … This applies when the set and the lookup request contain at least one period, but do not end with a trailing period. Nslookup /set srchlist …
http://technet.microsoft.com/en-us/library/cc725991(WS.10).aspx

As the last link suggests, you can run nslookup with the /set srchlist, such as nslookup /set srchlist to set your own search lists that changes the default search suffix nslookup uses that it grabs from the operating system’s Search Suffixes. You can also set it in interactive mode by the following and leaving it blank to remove any search suffixes it’s pulling from the machine:

nslookup
> set srchlist

 

Will removing the Primary DNS Suffix affect AD functionality?

Yes and No. Yes if you remove the Primary DNS Suffix, which the default search list comes from and the machine uses in such cases as DirectSMB connectivity, among other things. And no, nslookup’s requirement of using a dot doesn’t affect or indicate any issues with AD, it’s just an nslookup thing.

 

In summary:

No, it’s not something that’s saying there is a DNS problem. To determine if you have a DNS problem, I suggest to use nslookup querying FQDNs with a trailing dot, or better, download and use DIG.

Further, you will need to use the trailing dot (a period) unless you remove the search suffix. You can also remove the suffix from the machine, and it will work without a trailing dot. But the search suffix is derived from the Primary DNS Suffix, which is set by the domain it’s joined to. You can remove it in the registry and not touch the Primary DNS Suffix.

You can also uncheck the computer’s client side resolver behavior, as shown in this screenshot (https://utgkjq.sn2.livefilestore.com/y1ppjK9K5o-JVAQJqWMjf9NSpoI9kTGnkjX_q5PGS3whQEFD-TPNXHMC0PU8rKjKt3AKPD5kuN0k9MyqK2I2sXd0mD2DSiTFiF0/DNS%20-%20Stop%20Suffix%20from%20Appending.jpg?psid=1).

 

Additional links to read on this subject:

Thread: “Weird NSLOOKUP results” 6/10/2010
http://social.technet.microsoft.com/Forums/sk/winserverNIS/thread/8f29df1a-46dc-4b3b-946c-528b10f7223e

Windows Appending Domain Suffix To All Lookups
http://serverfault.com/questions/74067/windows-appending-domain-suffix-to-all-lookups

Thread: “DNS server strange behavior” 2/9/2013
http://social.technet.microsoft.com/Forums/en-US/winserverNIS/thread/e3c9bc21-5037-4974-9329-fb86cf670494/

It’s just something to keep in mind when using nslookup.

I hope you find this info helpful.

Ace Fekay

Comments, corrections and suggestions are welcomed.