
Exchange Server 2007 Hub Transport (HT) and Client Access Service (CAS) on the Same NLB Cluster – Updated Jan 9, 2008

In order to keep the number of servers down in a high availability environment, administrators have been looking at using Network Load Balancing (NLB) for CAS and then co-locating the HT role on each node of the NLB cluster to also provide high availability for the HT role.


This configuration can work, and it really is not too difficult to configure. It is extremely important to note that using NLB to load balance the default SMTP receive connectors (using port 25) is not supported, and it is completely unnecessary because Exchange already load balances all intra-Exchange traffic, such as HT to HT communications. However, using NLB to provide redundancy and load balancing for connections to HTs that are hosting Client SMTP receive connectors (using port 587) is fully supported and may be desirable if you have a large number of external SMTP/POP and SMTP/IMAP clients that need to connect to this receive connector.


The steps are:




  1. Set up two servers running Windows Server 2003 with two NICs in each server


  2. Install the Exchange Server 2007 Hub Transport and Client Access Service (CAS) roles on each server


  3. Configure one NIC for the Network Load Balancing cluster and set up the other NIC in a separate network so the server can be managed through that IP address


  4. Configure NLB with Unicast and even load balancing


  5. Set up the port rules:



    • Port 25 to 25 for both TCP and UDP and select the radio button to disable this port range (this stops the virtual IP address of the NLB cluster from answering on port 25, but still allows the individual server IPs to listen on port 25)


    • Port 465 to 465 for both TCP and UDP and select the radio button to disable this port range


    • Port 80 to 80 for both TCP and UDP and set affinity to none (I recommend “none” so you can easily test and verify that it works)


    • Port 587 to 587 for both TCP and UDP, affinity none (this is for the client SMTP receive connector)


    • Port 443 to 443 for both TCP and UDP, affinity none


    • Port 110 to 110 for both TCP and UDP, affinity none


    • Port 993 to 993 for both TCP and UDP, affinity none


    • Port 143 to 143 for both TCP and UDP, affinity none


    • Port 995 to 995 for both TCP and UDP, affinity none


  6. With affinity set to none, you can more readily test the CAS (after updating the web pages to show which server is actually responding) and verify that the load is being shared. You can also test to make sure the NLB cluster does not respond to SMTP on port 25, which it shouldn’t if you set it right, and verify that each server does respond to SMTP when addressed by its individual server name.


  7. You can configure protocol logging for the other protocols and telnet to the ports using the NLB IP address to see if they are load balancing like they should. You can also test through the NLB IP by sending and receiving messages and checking the message tracking logs to see that the traffic is being balanced. In my testing, it all worked.
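The telnet checks in steps 6 and 7 can also be scripted. What follows is only a minimal sketch in Python, not part of the original procedure: the IP addresses are made up for illustration, and it assumes every balanced service (POP3, IMAP4, and so on) is actually started on the nodes. It only confirms that a TCP connection is accepted; it does not show which node answered.

```python
import socket

# Hypothetical addresses for illustration only; substitute your own.
NLB_VIP = "192.168.2.11"
NODE_IPS = ["192.168.2.21", "192.168.2.22"]

# Ports taken from the port rules in step 5.
DISABLED_ON_VIP = [25, 465]
BALANCED_ON_VIP = [80, 110, 143, 443, 587, 993, 995]


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port is accepted."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_cluster(vip, nodes, is_open=port_open):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    # The virtual IP must not answer on the disabled port ranges.
    for port in DISABLED_ON_VIP:
        if is_open(vip, port):
            problems.append(f"VIP {vip} answers on disabled port {port}")
    # The virtual IP should answer on every load balanced port.
    for port in BALANCED_ON_VIP:
        if not is_open(vip, port):
            problems.append(f"VIP {vip} does not answer on port {port}")
    # Each node must still accept SMTP on port 25 at its own address.
    for node in nodes:
        if not is_open(node, 25):
            problems.append(f"Node {node} does not answer on port 25")
    return problems
```

Calling check_cluster(NLB_VIP, NODE_IPS) and printing each returned line gives a quick pass/fail list. The is_open parameter exists only so the logic can be exercised without a live cluster.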

NOTE: You may want to change affinity to either single (especially if it is being used internally) or Class C (especially if it is accessible from the Internet) once your testing is done.


Good luck, and have lots of fun!


Network Load Balancing and MAC Addresses

I learned something new yesterday. It kind of flipped me out, but now it almost makes sense.

 

You can try this to confirm.


  1. From a client, ping the IP address of your NLB cluster.
  2. From the same client, run arp -a from the command prompt.

You should see something like this (I will assume 192.168.2.11 for the NLB cluster IP address):

    Internet Address         Physical Address      Type

    192.168.2.11            02-bf-c0-a8-02-0b     Dynamic

 

It will list other addresses and their MACs as well, but we are only interested in the NLB address. 02-bf-c0-a8-02-0b breaks down into nice little components like so:


  • The first octet is the type of NLB configuration: 02=Unicast, 03=Multicast (IGMP multicast uses a different address format that begins with 01-00-5e-7f)
  • The second octet, (bf), is of unknown origin, but it is the same for unicast and multicast configurations
  • The next four octets are the IP address in hex, i.e. c0=192, a8=168, 02=2, 0b=11 and thus the IP of 192.168.2.11.
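The breakdown above is easy to reproduce. Here is a small Python sketch (the function names are my own, and it only covers the unicast and multicast prefixes, not IGMP) that builds the cluster MAC from the cluster IP and reads the IP back out of the MAC:

```python
import ipaddress

# First octet per NLB mode; only unicast and multicast are covered here.
MODE_PREFIX = {"unicast": 0x02, "multicast": 0x03}


def nlb_cluster_mac(cluster_ip, mode="unicast"):
    """Build the cluster MAC: mode octet, bf, then the four IP octets in hex."""
    octets = ipaddress.IPv4Address(cluster_ip).packed
    return "-".join(f"{b:02x}" for b in (MODE_PREFIX[mode], 0xBF, *octets))


def ip_from_nlb_mac(mac):
    """Recover the cluster IP from the last four octets of the MAC."""
    parts = [int(p, 16) for p in mac.split("-")]
    return ".".join(str(p) for p in parts[2:])


print(nlb_cluster_mac("192.168.2.11"))       # 02-bf-c0-a8-02-0b
print(ip_from_nlb_mac("02-bf-c0-a8-02-0b"))  # 192.168.2.11
```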

OK, I already knew all of this. It is the following that was new to me.

 

It is the second octet, bf, that is interesting to me. I can’t find anything that tells me why bf is used, but it is always what comes back when a client ARPs for the NLB IP address. Why I find it interesting is that it is not used at all when the NLB nodes send GARPs (gratuitous ARPs) or when they return traffic. When an NLB node sends traffic, it spoofs the source MAC as above, except that it replaces bf with its priority number. For example, if the NLB cluster node were configured with the number three as its priority (unique host ID), then it would identify itself to the switch as MAC address 02-03-c0-a8-02-0b. This allows the switch to happily enter the MAC address in its table and have a one-to-one mapping of MAC addresses to ports.
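To make the substitution concrete, here is a short Python sketch (the function name is mine) that derives the MAC a unicast node would present to the switch, given the cluster IP and the node's priority. It assumes priorities run from 1 to 32, which is the range NLB allows for host priority.

```python
import ipaddress


def nlb_source_mac(cluster_ip, priority):
    """MAC a unicast NLB node sends from: 02, its priority, then the IP octets."""
    if not 1 <= priority <= 32:
        raise ValueError("NLB host priority must be between 1 and 32")
    octets = ipaddress.IPv4Address(cluster_ip).packed
    return "-".join(f"{b:02x}" for b in (0x02, priority, *octets))


# Priority 3 on cluster IP 192.168.2.11, as in the example above.
print(nlb_source_mac("192.168.2.11", 3))  # 02-03-c0-a8-02-0b
```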

 

So, when an NLB client tries to connect to the IP address of the NLB cluster and does an ARP on the IP to identify the MAC address, the switch fabric flips out because it can’t find any port that maps to that MAC address, and thus it floods the fabric. The use of the priority number keeps the switch fabric from ever learning the actual MAC address of the NLB cluster and provides a bit of sanity/reality for the switch so that it is happy.

 

So, to summarize, each client connecting to the NLB cluster will use the bf MAC address as the destination which causes the switches to flood all ports with the traffic. Each NLB node sends data using the priority number instead of bf to stop the switch from learning the bf MAC address and trying to map it to a single port. 

 

Of course, all of this leads us to the question about switch flooding and how to limit it. For this information see my blog entry on Unicast vs. Multicast.     

 

Network Load Balancing (NLB) and Network Interface Card (NIC) Teaming

The quick summary of this post is, “Don’t use NLB on teamed NICs.”


Microsoft clearly says that NIC teaming “may” cause problems with NLB in KB 278431.


This is where things get confusing, because the issue is just that; it may be a problem. The reasoning is really fairly simple. Teaming software, in many cases, overwrites the MAC address of the individual NICs in the team. Well, NLB, in Unicast, also overwrites the MAC address. So, the problem is:



  • Will the teaming software allow the overwrite behavior of Unicast?

  • Will the teaming software handle the failure of a NIC in the team and the overwrite process of NLB in the event of a NIC failure?

The answers are, it might not allow the overwrite in Unicast, and it might not behave properly in the event of a NIC failure and passing of the MAC to the other NIC in the team. Thus, the “may” statement earlier.


The way it needs to work is that the NIC teaming software needs to support the overwrite of MAC addresses. Many vendors now provide this support. A workaround exists that allows the team MAC address to be set directly through the management tool. Compaq/HP, for example, defaults to the MAC address of the primary adapter. After NLB sets the MAC on the virtual adapter (the NIC team), the Compaq/HP software does not propagate the MAC address to the physical adapters. To make it work, you have to copy the NLB MAC and paste it into the team MAC in the management software. Workarounds and High Availability environments cannot be used in the same sentence; thus, this is not a best practice.


My contention is simple: Since we can’t guarantee that a failure within the team will be transparent, or that the team will allow NLB to overwrite the MAC (this is a hardware driver issue, and Microsoft cannot guarantee that it will behave properly), it should be considered a best practice to not use teaming for NLB NICs.


By the way, this behavior does not change in Longhorn.

Unicast vs. Multicast – Originally Posted Feb 21, 2005

As usual, confusion motivates me to blog some more. In this case, I have blogged this because I was confused, and I am pretty sure that I have it straight now. Comments may prove me wrong.


When designing, planning, testing, and implementing Network Load Balancing (NLB) Clustering, a choice has to be made regarding unicast vs. multicast. There are a few differences, but the main difference is in the way MAC addresses are implemented.


Unicast – Each NLB cluster node replaces its real (hard coded) MAC address with a new one (generated by the NLB software) and each node in the NLB cluster uses the same (virtual) MAC. Because of this virtual MAC being used by multiple computers, a switch is not able to learn the port for the virtual NLB cluster MAC and is forced to send the packets destined for the NLB MAC to all ports of a switch to make sure packets get to the right destination.


So, basically, the way NLB traffic is handled is kind of like this:


1. An inbound packet for IP address w.x.y.z (NLB Virtual IP) arrives
2. The ARP request is generated and is sent across all ports of the switch since there is no mapping at this point
3. All of the NLB cluster nodes respond with the same MAC
4. The switch sends the traffic to all ports because it is not able to tell which is the proper port and this leads to switch flooding
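The flooding in step 4 falls out of how a learning switch works: it only learns a MAC when it sees that MAC as a source address. Below is a toy Python model of that behavior, a sketch rather than how any real switch is implemented. The port numbers and the client MAC are invented, and it assumes the nodes send from masked source MACs (02, then the node priority) as described in the MAC address post above.

```python
class LearningSwitch:
    """Toy Layer 2 switch: learns source MACs, floods unknown destinations."""

    def __init__(self, num_ports):
        self.ports = list(range(1, num_ports + 1))
        self.mac_table = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the sender's port, then return the ports the frame goes out of."""
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is not None:
            return [out_port] if out_port != in_port else []
        # Unknown destination: flood every port except the one it came in on.
        return [p for p in self.ports if p != in_port]


CLUSTER_MAC = "02-bf-c0-a8-02-0b"  # never used as a source, so never learned
CLIENT_MAC = "00-11-22-33-44-55"   # made-up client address

switch = LearningSwitch(num_ports=8)
# NLB nodes on ports 1 and 2 send traffic from their masked source MACs.
switch.receive(1, "02-01-c0-a8-02-0b", CLIENT_MAC)
switch.receive(2, "02-02-c0-a8-02-0b", CLIENT_MAC)
# A client on port 5 sends a frame to the cluster MAC.
flooded = switch.receive(5, CLIENT_MAC, CLUSTER_MAC)
# A node replies to the client, whose port the switch has now learned.
reply = switch.receive(1, "02-01-c0-a8-02-0b", CLIENT_MAC)

print(flooded)  # [1, 2, 3, 4, 6, 7, 8]
print(reply)    # [5]
```

The frame to the cluster MAC goes out every other port, while the reply to the client goes to exactly one port.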


If an NLB cluster is using unicast, the nodes cannot be told apart by MAC address, as they all have the same one. Since each NLB cluster node has the same MAC, communication between NLB cluster nodes is not possible unless each NLB cluster node has an additional NIC with a unique MAC.


Multicast – NLB adds a Layer 2 multicast MAC address to the NIC of each node. Each NLB cluster node basically has two MAC addresses: its real one and its NLB-generated address. With multicast, you can create static entries in the switch so that it sends the packets only to members of the NLB cluster. Mapping the address to the ports being used by the NLB cluster stops all ports from being flooded. Only the mapped ports will receive the packets for the NLB cluster instead of all ports in the switch. If you don’t create the static entries, it will cause switch flooding just like in unicast.


Flooding Solutions:



  1. Hook all NLB devices to a hub and then connect it to a port on the switch. Since all NLB nodes with the same MAC come through the same port, there is no switch port flooding.

  2. Configure a VLAN for all NLB cluster nodes to contain all NLB cluster traffic to just the VLAN and not run it over the entire switch.

  3. Use multicast and configure static mapping for the NLB cluster nodes in the switch so it only floods the mapped ports instead of the entire switch.

  4. Use port mirroring so that all ports involved in the NLB cluster mirror each other.
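The effect of solution 3 can be shown with a few lines of Python. This is a simplified model of a switch's forwarding decision, not any vendor's actual behavior. The function name, port numbers, and table layout are all my own.

```python
def egress_ports(dst_mac, in_port, all_ports, learned, static):
    """Pick the output ports for a frame: static entries win, then learned, else flood."""
    if dst_mac in static:
        # Static mapping: deliver only to the configured ports.
        return sorted(p for p in static[dst_mac] if p != in_port)
    if dst_mac in learned:
        return [learned[dst_mac]] if learned[dst_mac] != in_port else []
    # Unknown destination: flood every port except the incoming one.
    return [p for p in all_ports if p != in_port]


ALL_PORTS = list(range(1, 9))
NLB_MULTICAST_MAC = "03-bf-c0-a8-02-0b"

# Without a static entry, a frame from port 5 floods the whole switch.
print(egress_ports(NLB_MULTICAST_MAC, 5, ALL_PORTS, {}, {}))
# [1, 2, 3, 4, 6, 7, 8]

# With the NLB MAC statically mapped to the node ports, only they receive it.
static_entries = {NLB_MULTICAST_MAC: {1, 2}}
print(egress_ports(NLB_MULTICAST_MAC, 5, ALL_PORTS, {}, static_entries))
# [1, 2]
```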
