WINS is a Friend of Mine – Original Posted Feb 22, 2005

Yes, I like WINS, and WINS likes me. I recently went through another string of issues with a few clients and the resolution seemed to just scream WINS when I took my first glance at the problems. A click here, a click there, and *poof* you have a WINS server. I recommend at least two that replicate for redundancy.


I don’t know who started the whole “Windows 2000 will make it so that WINS can go away” mantra, but it just isn’t true. WINS has a real purpose in life. The purpose is not so that Windows can connect to the Internet. I actually heard a trainer (a very inexperienced trainer who was obviously pulling the explanation out of his @ss) state that the only time you need WINS is if you are connecting to the Internet. Somehow, I managed to not go into the classroom and biatch slap him right there in front of his students. WINS is used for NetBIOS name resolution. That sounds pretty vague to many people, but what it comes down to is that when an application requires NetBIOS name resolution, it will either use a NetBIOS name server (i.e. WINS) or it will broadcast on its network segment to try to resolve the name. Since we all know that broadcast traffic is evil, we should all be implementing WINS in our environments if we use any of the following:



  • Exchange 2000 setup program

  • Exchange 2000 System Manager

  • Exchange Server 2003 setup program

  • Exchange Server 2003 System Manager

  • Exmerge

  • Network Neighborhood

  • Outlook (except for Outlook 2003)

  • OWA – in order to change a password

  • Windows 2000 Server Clustering

  • Windows 9x Clients (please tell me that you don’t have anything that old around)

  • Windows NT 4.0 Workstation Clients (ditto)

  • Windows Server 2003 Server Clustering

http://support.microsoft.com/?kbid=837391 discusses most of the above applications and their use of NetBIOS naming. There are many other times you should implement WINS. For example, you might have a mixed Exchange environment with 5.5 still loose and wreaking terror on your life. I just don’t have the memory to remember all of the situations.


All of the above use NetBIOS naming and NetBIOS name resolution. It doesn’t mean that they have to use WINS, but it sure makes sense to use WINS in these situations since broadcasting will only resolve names on the same network segment (in most cases). Before I start hearing the screams of those that want to rid themselves of WINS, I fully acknowledge that many applications that use NetBIOS naming can be configured to work without WINS. But why go through all of that pain when WINS is so easy to install and maintain? I know, there are some people who really get off on pain. Let’s remember to keep our distance. <G>
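
For what it is worth, pointing a client or member server at your WINS servers doesn’t even require a trip to the GUI. Here is a rough sketch using netsh on Windows XP/Server 2003; the connection name and the WINS server addresses are just examples, so substitute your own:

    rem Point this machine at a primary and a secondary WINS server
    netsh interface ip set wins name="Local Area Connection" source=static addr=192.168.1.11
    netsh interface ip add wins name="Local Area Connection" addr=192.168.1.12 index=2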


By the way, I bet many of you didn’t know that there is an awesome tool for troubleshooting NetBIOS resolution. Take a look at nblookup when you get a chance. It is fantastic. http://support.microsoft.com/Default.aspx?kbid=830578
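
While you are at it, plain old nbtstat (built into Windows) covers a lot of the day-to-day NetBIOS troubleshooting as well. nblookup is the better choice when you want to query a specific WINS server directly, but these are the nbtstat switches I reach for first (the machine name and IP address below are placeholders):

    rem Show the local NetBIOS name cache
    nbtstat -c

    rem Query a remote machine's NetBIOS name table, by name or by IP
    nbtstat -a SERVER01
    nbtstat -A 192.168.1.25

    rem Purge the cache, then release and re-register this machine's names with WINS
    nbtstat -R
    nbtstat -RR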

NLB Unicast vs. Multicast – Original Posted Feb 21, 2005

As usual, confusion motivates me to blog some more. In this case, I have blogged this because I was confused, and I am pretty sure that I have it straight now. Comments may prove me wrong.


When designing, planning, testing, and implementing Network Load Balancing (NLB) Clustering, a choice has to be made regarding unicast vs. multicast. There are a few differences, but the main difference is in the way MAC addresses are implemented.


Unicast – Each NLB cluster node replaces its real (hard-coded) MAC address with a new one (generated by the NLB software), and each node in the NLB cluster uses the same (virtual) MAC. Because this virtual MAC is used by multiple computers, a switch is not able to learn the port for the virtual NLB cluster MAC and is forced to send packets destined for the NLB MAC to all ports of the switch to make sure they get to the right destination.


So, basically, the way NLB traffic is handled is kind of like this:


1. An inbound packet for IP address w.x.y.z (the NLB virtual IP) arrives
2. An ARP request for w.x.y.z is generated and, since ARP requests are broadcasts, it goes out all ports of the switch
3. The NLB cluster nodes respond with the same (virtual) cluster MAC
4. The switch sends the traffic to all ports because it is not able to tell which port the cluster MAC lives on, and this leads to switch flooding (NLB in unicast mode also masks the source MAC on outbound frames, so the switch never gets the chance to learn a port for the cluster MAC)


When a cluster is running in unicast mode, the nodes cannot be told apart on the wire because they all share the same MAC. Since each NLB cluster node has the same MAC, ordinary communication between NLB cluster nodes is not possible unless each NLB cluster node has an additional NIC with a unique MAC.


Multicast – NLB adds a second, multicast layer 2 MAC address to the NIC of each node. Each NLB cluster node basically has two MAC addresses: its real one and its NLB-generated multicast address. With multicast, you can create static entries in the switch so that it sends the packets only to members of the NLB cluster. Mapping the cluster MAC to the ports being used by the NLB cluster stops all of the other ports from being flooded; only the mapped ports receive the packets for the NLB cluster. If you don’t create the static entries, you get switch flooding just like in unicast.


Flooding Solutions:
1. Hook all NLB cluster nodes to a hub and then connect the hub to a single port on the switch. Since all of the nodes sharing the cluster MAC sit behind that one port, the rest of the switch is not flooded.
2. Configure a VLAN for the NLB cluster nodes so that the flooded traffic is contained to that VLAN and does not run over the entire switch.
3. Use multicast and configure static mappings for the NLB cluster nodes in the switch so it only floods the mapped ports instead of the entire switch.
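
If you are not sure what an existing cluster is doing, the wlbs.exe command line (nlb.exe does the same thing on Windows Server 2003) will show you the MAC addresses NLB derives from the cluster IP as well as the current configuration and convergence state. The cluster IP below is just an example:

    rem Show the unicast, multicast, and IGMP multicast MACs derived from the virtual IP
    wlbs ip2mac 192.168.1.100

    rem Show the local NLB configuration (including unicast vs. multicast) and cluster state
    wlbs display
    wlbs query

The unicast cluster MAC starts with 02-BF and the multicast cluster MAC starts with 03-BF, so it is usually easy to tell at a glance which flavor you are looking at on the wire.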

My High Availability Definition – Original Posted Feb 11, 2005

No, really, I am not trying to piss off all of my Microsoft friends. I just don’t like the Microsoft definition of HA.


I gave a lot of thought before I decided to blog on this one because I know I am going against the established grain with my method of explaining HA. However, I am used to being spanked in public, so I can take some more spankings if needed. Actually, I think I like being spanked.


High Availability is the combination of well defined, planned, tested, and implemented processes, software, and fault tolerant hardware focused on supplying and maintaining application availability.


As a high-level example, consider messaging in an organization.



BAD – A poor implementation of Exchange is usually slapped together by purchasing a server that the administrator feels is about the right size and installing Exchange Server 2003 on it. Messaging clients are installed on network connected desktops and profiles are created. The Exchange server might even be successfully configured to connect to the Internet. I have seen Exchange environments installed in organizations over a short business week, and even overnight in some cases. It is easy to do it fast and get it done, but lots of important details are missed.


GOOD – In an HA environment, the deployment is well designed. Administrators research organizational messaging requirements. Users are brought into discussions along with admins and managers. Messaging is considered as a possible solution to many company ills. Research may go on for an extended period as consultants are brought in to help build a design and review the design of others. Vendors are brought in to discuss how their products (antivirus and content management, for example) are going to help keep the messaging environment available and not waste messaging resources processing spam and spreading viruses (or is that virii?). Potential 3rd party software is tested and approved after a large investment of administrator and end user time.

Hardware is sized and evaluated based on performance requirements and expected loads. Hardware is also sized and tested for disaster recovery and to meet service level agreements for both performance and time to recovery in the case of a disaster. Hardware selected will often contain fault tolerant components such as redundant memory, drives, network connections, cooling fans, power supplies, and so on.

An HA environment will incorporate lots of design, planning, and testing. It will often, but not always, include additional features such as server clustering, which decreases downtime by allowing for rolling upgrades and a preplanned response to failures. A top-notch HA messaging environment will also consider the messaging client and the configurations that lead to increased availability for users. For example, Outlook 2003 offers a cache mode configuration allowing users to create new messages, respond to existing mail in their inbox, and manage their calendars (amongst many other tasks) without having to maintain a constant connection to the Exchange server. Cache mode allows users to continue working even though the Exchange server might be down. It also allows for more efficient use of bandwidth.


The Goal – Now this is where many people disagree. I consider the goal of all HA environments to really be continuous availability (CA) of applications and resources for employees. Doesn’t everyone want email to always be available, processing messaging traffic and helping the people in the organization collaborate? Of course that is what we want. We want applications and their entire environment to continue running forever.


In my opinion, we strive for CA and we settle for HA.



“In information technology, high availability refers to a system or component that is continuously operational for a desirably long length of time. Availability can be measured relative to “100% operational” or “never failing.” A widely-held but difficult-to-achieve standard of availability for a system or product is known as “five 9s” (99.999 percent) availability.”

Source:
http://searchcio.techtarget.com/sDefinition/0,,sid19_gci761219,00.html
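
To put five 9s in perspective, the math is pretty sobering. A year is roughly 525,960 minutes, so:

    99.9%   allows about 526 minutes (roughly 8.8 hours) of downtime per year
    99.99%  allows about 53 minutes of downtime per year
    99.999% allows about 5.3 minutes of downtime per year

Five minutes a year doesn’t even cover a single reboot for patching, which is exactly why the processes around the hardware and software matter so much.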


Obviously, “continuously operational” just isn’t possible over extremely long periods of time. Hardware will always fail; it is just a matter of when. Software becomes obsolete over time, too. We all need to understand that HA includes not just the hardware and software solution, but also the backup/restore solution and failover processing. Most HA experts will also add that a true HA environment includes a well documented development, test, and production migration process for any and all changes made to production environments. There is much to achieving HA, but it simply comes down to application availability through well designed, planned, tested, and implemented processes, software, and hardware.

Another example: using NLB to provide availability for a web-based application to your users over the Internet. NLB helps keep the application available. The same can be said for server clustering; however, you need to take into account the non-availability during the actual failover of your application in the event of hardware or software failures. Sometimes failover is a matter of seconds; in other cases it can be several minutes. In all cases, a clustering solution will significantly drive down non-availability and increase the uptime of your application. Many experts state that, for any application or system to be highly available, the parts need to be designed around availability and the individual parts need to be tested before being put into production. As an example, if you are using 3rd party products with your Exchange environment that have not been properly tested, you may find that they are a weak link that results in loss of availability. Implementing a cluster will not necessarily result in HA if there are problems with the software.


I could and maybe should ramble on some more, but I need to focus on some other things right now. To summarize this entire discussion:


HA is so much more than just slapping a couple of servers together in a cluster. Please keep in mind all of the details behind a top-notch HA environment.

Heartbeat Network – Original Posted Feb 10, 2005

This is a pet peeve of mine. I have run into several instances where clusters have been installed and the heartbeat network is not properly configured. The heartbeat network is a private network shared just by the nodes of a cluster and is not accessible to other systems. It is not supposed to be routable at all. When it is built, you should select the option for “Internal cluster communications only (private network)” for the private heartbeat network. Selecting anything else can be a problem since this network is not routable and can’t connect to other networks. Make sure that you use IP addresses for your heartbeat network that do not exist anywhere else in your network or on the Internet.


You basically have a few choices when it comes to building the private heartbeat network:



  1. Use a cross over cable between the nodes (only valid for two node clusters)

  2. Use a VLAN on a switch

  3. Use a dedicated switch

  4. Use a dedicated hub

I highly recommend using a dedicated hub over a switch for a couple of very good reasons.



  • Power problems can cause the loss of the switch configuration. I have seen this with static shocks to switches as well as power spikes and brown outs.

  • Network admins often seem to have problems remembering to save the configurations to NVRAM. Any time the power is cycled, the configuration is lost.

  • Hubs don’t have configurations that can be lost or corrupted. They either work or they are broken. The only problem that I have ever seen with a hub is the power supply failing, and that same risk exists with a switch.

  • Hubs are incredibly cheap and available at your favorite street side vendor along with a hot dog with all the fixings. You can also find spare power supplies for your hub in your junk drawer in many cases.

Some other steps that you should take with the heartbeat network are to configure the network connection in the operating system to remove “Client for Microsoft Networks” and “File and Printer Sharing for Microsoft Networks” on the General tab of the connection properties. You should also go into the “Internet Protocol (TCP/IP)” properties and, in the advanced properties, on the DNS tab clear “Register this connection’s addresses in DNS” and on the WINS tab select the radio button for “Disable NetBIOS over TCP/IP,” since the private network does not need them.
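
If you like to script your node builds, the IP portion of that can be done from the command line. A rough sketch with netsh is below; the connection name and addresses are examples, and note that disabling NetBIOS over TCP/IP still has to be done through the GUI or WMI rather than netsh:

    rem Static IP on the heartbeat NIC, no default gateway
    netsh interface ip set address name="Heartbeat" source=static addr=10.10.10.1 mask=255.255.255.0

    rem No DNS servers and no DNS registration on this connection
    netsh interface ip set dns name="Heartbeat" source=static addr=none register=none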


 


Even after configuring the private heartbeat network, don’t forget to also configure the public network connections so that they are set for “All communications (mixed network)” so that the public network connections can be used to run the heartbeat if the private network fails for some reason.

Multi-homed Domain Controllers and Dell Servers – Original Posted Feb 9, 2005

I ran into an issue a few weeks ago that I thought I would share. One of our servers kept running into authentication problems, and after digging around, we found that one of the domain controllers (the only one for the site) was registering three IP addresses in DNS. One IP address was for its production network NIC, one IP address was for the private backup network, and the third was some strange PPP connection. Netmon traces showed that the server was sending authentication traffic to the private backup network NIC.


Odd, I thought. I double-checked, and the private backup network NIC was configured to not register itself in DNS and NetBIOS was disabled. Quick research revealed that this is expected behavior. Domain controllers do not honor the “Register this connection’s addresses in DNS” checkbox. DCs register all of their addresses whether we want them to or not.


Some research revealed that there are three ways to resolve this issue:



  1. Turn off dynamic updates and configure all AD records manually (ummmm, NO!)

  2. Install a hotfix re: Article 832478

  3. Disable or remove the offending NIC

I went with option 3 as I could just reconfigure the backups to run across the production network. In reality, I could have skipped the backups completely as it is just a DC/GC, and I have more of those.
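
If you go the same route, keep in mind that the stale records do not always age out of DNS right away. Checking for them and cleaning them up by hand is quick; the server, zone, and record below are examples only:

    rem See which addresses the DC is currently resolving to
    nslookup dc01.contoso.com

    rem Delete the stale A record for the backup network address from the zone
    dnscmd dns01 /RecordDelete contoso.com dc01 A 10.20.30.5 /f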


The real problem was this odd PPP connection that showed up whenever I ran IPConfig. It would show an IP address of 192.168.234.235. At first, I thought somebody must have configured RRAS. I checked that and found it was not true. Research showed that lots of my new servers had this same connection configured.  Using a little logic, I figured out it was all of the newer Dell servers. I pinged a couple of friends, and Rick Taylor had the answer. He said, “Russ, you fool! It is the DRAC!” OK, he didn’t really say it that way, but that is the way I felt. So after Rick pointed me (held my hand) in the right direction, I found the solution. Actually, there are a couple of solutions.



  1. Turn off dynamic updates and configure all AD records manually (ummmm, NO!)

  2. Install a hotfix re: Article 832478 and then open the racdun.pbk file and uncheck the “register this connection in DNS” checkbox

  3. Use the racadm utility from Dell and remove the IP address completely

  4. Disable or remove the offending DRAC by opening Computer Management, expanding the Modems section, right-clicking the offending DRAC interface, and disabling it

Of course, if you disable the DRAC, then you can’t use it. I thought I better point that out to the common-sense challenged so I don’t get nasty emails. <G>

MSDTC Issue Part 2 – Original Posted Feb 1, 2005

I got brave and decided to fix the problem. The issue, for those who missed the first show, was that I needed to reconfigure the MSDTC on a SQL cluster. The MSDTC needed to be moved to a different drive because we are retiring one of the EMC frames.


The more I read on this the more complex it seemed. I read through Q 301600 and Q 294209, and reviewed several other sources. These articles made it sound like I was going to have to rip out DTC on each node and then rebuild it on each node and restart SQL on the cluster. I just refused to believe it was so complex.


The MSDTC resource was configured as part of the initial cluster group with the cluster name and Q: as dependencies. Previously, the quorum was moved from Q: to I:, but the old Q: could not be removed until the MSDTC was reconfigured.


The more I thought about it, the more sense a simpler approach made, so I:



  1. Stopped the MSDTC resource

  2. Copied the MSDTC folder from Q: to I:

  3. Stopped the Q: resource

  4. Deleted the MSDTC resource

  5. Created a new MSDTC resource with the clustername and the new I: as dependencies

  6. Brought the MSDTC resource online

I don’t see any problems with it so far.
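
For the command-line inclined, the same sequence can be run with cluster.exe instead of clicking through Cluster Administrator. This is only a sketch; the group and resource names are what they happened to be in my case, so adjust them to match your cluster:

    cluster res "MSDTC" /offline
    rem (copy the MSDTC folder from Q: to I: here with Explorer, xcopy, or robocopy)
    cluster res "Disk Q:" /offline
    cluster res "MSDTC" /delete
    cluster res "MSDTC" /create /group:"Cluster Group" /type:"Distributed Transaction Coordinator"
    cluster res "MSDTC" /AddDependency:"Cluster Name"
    cluster res "MSDTC" /AddDependency:"Disk I:"
    cluster res "MSDTC" /online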

IIS 6.0 Security – Original Posted Jan 28, 2005

Wow, there is a great deal of confusion on this subject. I asked a few people what this topic means to them. I heard several differing views regarding what it means to secure IIS 6.0. So, what is it? Is it securing the server? Is it securing the service? Is it securing the application or site?


I tend to lean towards the definition including securing the application or site more than anything else. The goal is to make sure the website and any applications available through the website are available to users. Now, that goal does include securing the server and securing the service, but if you include the website content/applications then you are adding another level to the issue.


So, we secure the server doing such fun tasks as turning off unused services and basically locking down the operating system. We put the server in a well-protected DMZ. We can also perform such tasks as enabling IP filtering and configuring filters on the firewall(s) to help protect the server from unauthorized port access. We can turn off ICMP ping responses to make the server and its IP address a black hole to script kiddies. We should install antivirus software and anti-spyware software. There are so many things we can do and should do.
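
As a small illustration, if you are relying on the built-in Windows Firewall on Windows Server 2003 SP1 (many DMZ designs will lean on the hardware firewall instead), killing ping responses and opening only the web ports looks roughly like this:

    rem Stop answering inbound echo requests (ICMP type 8)
    netsh firewall set icmpsetting type=8 mode=DISABLE

    rem Allow only HTTP and HTTPS in
    netsh firewall set portopening protocol=TCP port=80 name="HTTP" mode=ENABLE
    netsh firewall set portopening protocol=TCP port=443 name="HTTPS" mode=ENABLE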


Some tasks that I am not hearing when it comes to securing IIS 6.0 include using tools to republish the site on a regular basis and moving the actual content to servers inside the LAN. If your site is defaced by some incredibly industrious hacker, you can write right back over it with your approved content using several different applications or home grown scripts. The hacker gets the joy of defacing your site for a few minutes and *poof* it is right back to the way it should be in a matter of moments. They can’t even brag to their friends that they did it because it is back to normal so quickly. One of my favorite methods of securing content and applications is to have the actual content and the application data inside the LAN. The server can sit in the DMZ, but we can use the features of IIS to redirect requests for content and data back through the inside firewall to internal servers. Even if the IIS server is somehow compromised, the attacker still doesn’t have access to the data in many cases.
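
The republishing piece does not have to be fancy. A scheduled robocopy (a Resource Kit tool) that mirrors the approved master copy over the live content every few minutes will do the job; the paths, server names, and interval below are just examples:

    rem republish.cmd - mirror the approved content over the live site
    robocopy \\CONTENTSRV\wwwroot-master D:\inetpub\wwwroot /MIR /R:1 /W:5 /LOG:C:\logs\republish.log

    rem Run it every 15 minutes
    schtasks /create /tn "Republish Web Content" /tr "C:\scripts\republish.cmd" /sc minute /mo 15 /ru SYSTEM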


Security really isn’t that difficult to implement. I think the key is to keep the basic security concepts in mind when designing your IIS 6.0 solution. Don’t allow more access than is required to view the content or run the applications. Don’t allow developers any access to the production box. After all, they are supposed to develop in a development environment, test in a test environment and then turn it over to the systems engineer to deploy the final solution in a production environment. Keep in mind the many different levels of security available to you. Watch the site constantly (or monitor it using good products) and be prepared to repair as necessary. Work closely with the others involved such as the network team and the end users to make sure we do everything we can to keep the solution secure.


By the way, I didn’t even talk about SSL yet.


Stay tuned, there is more to follow on this subject as I flesh it out. I need to do this soon as I am supposed to present a session on IIS 6.0 Security at TechMentor in Orlando this April.

MSDTC Issue – Original Posted Jan 26, 2005

More like, it is semi-broken. Ok, it isn’t broken at all, but it isn’t configured the way I want it configured so I have a desire to fix it.


It all started last Friday night. We are migrating from one EMC frame to another one. The storage guys (they really do a great job here) added new LUNs to the SQL cluster. We then added the drives as resources in cluster admin and then put the drive resources into the proper SQL cluster groups. We shut down the SQL services, did a quick file copy to the new drives, changed the drive letters to match the old drives and restarted the SQL services. We also pointed the quorum to its new drive. Everything works.


Problem. The quorum is now the I: and not the Q:. Yes, I know it is a stupid issue, but the company wants all quorums to be Q:’s. So, we try to make the change. Nope, the old Q: won’t let us change it. Why? Because the MSDTC is using that resource. Time to go to bed.


Today, we revisit the issue of the Q:. We have to do something because this one drive is holding up the retirement of the old EMC frame. The MSDTC is in the cluster group along with the quorum and the cluster IP and cluster name. So, there are two physical disks, one being the Q: (old quorum) and the other being the I: (new quorum) in the cluster group. We try stopping the MSDTC, copying all the msdtc folder info from Q: to I:, adding the I: as a dependency, and removing the Q: as a dependency. As you can guess, this just doesn’t work. Yes, I know the solution is to uninstall MSDTC and reinstall it. No, I don’t want to do that. I want a better way.


Back to thinking… I will probably just do it the right way, but I have a nagging feeling I am missing something really easy.

Moving a Cluster to a New SAN – Original Posted Jul 22, 2005

A fairly common scenario for a cluster administrator is to move a cluster from one SAN to another as SAN equipment is replaced with newer/faster SANs or the old SAN’s lease is up and a new one is being brought in.

 

The easiest way that I have found to do this is to use these steps (this is from memory, let me know if I missed one or two):

 

Super High Level Steps:


  1. Put the new array in the same fabric as the existing array
  2. Create new LUNs on the new array and make sure they are visible to the nodes
  3. Map the new LUNs to the old drive letters
  4. Copy data from the old drive to the new drive
  5. Move quorum and MSDTC

Slightly More Detailed Steps:


  1. Carve the new LUNs on the new array
  2. Add the new array and its LUNs to the same switch as the existing array
  3. Configure the LUN masking on the switch to expose the new LUNs to NodeA and NodeB
  4. Use the disk management tools in Windows to rescan the drives
  5. Use the active node to partition and format the disks
  6. Use Cluster Administrator to create the new physical disk resources and put them into their proper cluster groups
  7. Move the Quorum using the GUI to a temp location

    1. In Cluster Administrator, right click the cluster name
    2. Select Properties
    3. Select the Quorum tab
    4. Use the drop down box to select a temp location for the quorum

  8. Move the MSDTC folder and remove the existing MSDTC resource (if any)

    1. Stop the MSDTC resource
    2. Copy the MSDTC folder from Q: to the final quorum disk target location
    3. Stop the Q: resource (remember, the quorum isn’t there anymore)
    4. Delete the MSDTC resource

  9. Move the quorum to its final location

    1. Go into disk management and change the Q: drive to another letter
    2. Use disk management to assign Q: to the final quorum drive
    3. Repeat steps 7.1-7.4 to move the quorum to its final destination

  10. Recreate the MSDTC resource

    1. Create a new MSDTC resource with the clustername network name resource and the new Q: as dependencies
    2. Bring the MSDTC resource online

  11. Stop the cluster service and the application cluster groups (you can just stop the application resources if you want to move app data an app at a time)
  12. Move the data from the old disks to the new ones
  13. Re-letter the old disks to something outside the current range, but do not remove them yet – you might need to use them in your back out plan
  14. Re-letter the new disks to the same drive letter as the old ones (no, you do not have to worry about disk signatures as applications don’t understand disk signatures and don’t care about anything other than drive letters)
  15. Verify that all dependent resources are pointing to the proper physical disk resource.
  16. Restart the cluster service
  17. Make sure the new drive letters and disk resources are showing up properly in cluster administrator
  18. Bring everything back online

Again, these are basic steps. Some of the individual steps will require lots of work. I have done this now several times and am very happy with the results.
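
A few cluster.exe commands are handy for the verification steps (14 through 17) once everything has been re-lettered. The resource name below is just an example:

    rem Confirm every group and resource is online and owned by the node you expect
    cluster group
    cluster res

    rem Confirm a resource still depends on the proper physical disk resource
    cluster res "SQL Server" /ListDependencies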

Common Exchange 2003 Cluster Questions (Top 11 List) – Original Posted May 1, 2005

Whenever I teach Exchange Server 2003 classes, I get to the module that discusses clustering and I want to scream. There just isn’t enough material to discuss Exchange clustering properly. Anyways, I started talking more and more about clustering as there seems to be a great deal of interest in clustering Exchange in many organizations. So, here are some of the more common questions I get when discussing Exchange Server Clustering.


 


Q1. If I have two nodes in the cluster, do the mailboxes exist on both nodes?


A1. Microsoft Server Clustering uses a shared nothing architecture. In this architecture, resources are created for a virtual server (they include any needed Physical Disk resources, Network Name, IP Address, and services). In the case of Exchange, the cluster virtual server is built and all the resources run on the active node. If the virtual server fails over or is moved to the passive node, the second node in the cluster then takes control of all of those resources. So, short answer: The mailboxes exist in the storage group associated with the physical disk resource and this disk resource is passed back and forth between the nodes. Only one copy of each mailbox exists.


 


Q2. If I build a two node cluster, do the computers have to be exactly the same?


A2. No, they don’t need to be exactly the same, but they need to be very close in order to be supported. See KB 814607 and read the section on Server Cluster Qualification for more information.


 


Q3. I read the book and I also heard you say that you will often need additional single machine Exchange servers when using Exchange Server Clusters. Why do I need to have Exchange servers that are not part of a cluster?


A3. Several different services are not properly supported in a cluster and others just simply do not work. These services include:



  • Active Directory Connector
  • Intelligent Message Filter
  • Site Replication Service
  • Internet Mail Wizard
  • /DisasterRecovery setup switch
  • Lotus Notes Connector
  • Novell GroupWise Connector
  • Exchange Events

To top it off, because of the SRS and ADC issues, an Exchange Server 2003 cluster can’t be the first Exchange Server 2003 server in an Exchange 5.5 site. Thanks to David Elfassy for helping me with this list. http://spaces.msn.com/members/elfassy/Blog/cns!1pvwhiXzZoTl_cUJCU1PSHfw!185.entry


 


Q4. MSDTC is required as part of the cluster install and there are conflicting articles on the Microsoft site about whether it needs its own cluster group with its own IP resource, network name resource, and physical disk resource. What is the right answer?


A4. MSDTC does not require its own physical disk resource and it can be included in the default cluster group. You can get more info on my blog under the Microsoft Clustering category.


 


Q5. What is wrong with using Active/Active for Exchange clustering vs. Active/Passive?


A5. Auggghhh. Read my blog here for the answer(Summary – Don’t use Active/Active): http://spaces.msn.com/members/russkaufmann/Blog/cns!1pwuGkyvTDx37q1_Y3JQ_E6g!137.entry


 


Q6. How do I add the IMAP4 and POP3 services to my Exchange cluster after it is installed?


A6. It is covered here: http://www.microsoft.com/technet/prodtechnol/exchange/guides/E2k3AdminGuide/47c09fa5-09cc-4fe6-a748-d45f0d3b5ded.mspx but to boil it down to the basics, the steps (shown for IMAP4 only) are:



  1. Right click the Cluster Group for the Exchange Virtual Server (EVS), select New, Resource, and then enter a name (e.g. EVS1 IMAP4)
  2. Select Microsoft Exchange IMAP4 Server Instance from the Resource Type list
  3. Add all nodes as possible owners
  4. Add the System Attendant as a dependency (a cluster.exe sketch of the same steps follows)
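
If you would rather script it, the same thing can be roughed out with cluster.exe. The group name, node names, and the System Attendant resource name below are placeholders, so use whatever your EVS actually calls them:

    cluster res "EVS1 IMAP4" /create /group:"EVS1" /type:"Microsoft Exchange IMAP4 Server Instance"
    cluster res "EVS1 IMAP4" /AddOwner:NodeA
    cluster res "EVS1 IMAP4" /AddOwner:NodeB
    cluster res "EVS1 IMAP4" /AddDependency:"Exchange System Attendant - (EVS1)"
    cluster res "EVS1 IMAP4" /online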

 


Q7. Why do I need MSDTC to be installed in order to build an Exchange cluster?


A7. Because. It is really only needed during the installation of the cluster because the Exchange install application needs the cdowfevt.dll that is part of the COM+ installation. MSDTC is used for workflow applications in Exchange, but other than that, it isn’t used at all after the install. Oops, I take it back, it is used for upgrading as well.


 


Q8. How many physical disk resources should I plan for an Exchange cluster?


A8. At a minimum, you should have 4 physical disk resources per EVS, or 5 if the MTA is heavily used, since a heavily used MTA should have its own physical disk resource.



  • One for the quorum and MSDTC (yes, you can put the MSDTC on the same disk without any trouble)
  • One for the store (one for each storage group at a minimum, I prefer one for each store)
  • One for the transaction logs (one for each storage group)
  • One for SMTP
  • One for the MTA (possibly… it depends how much the MTA will be used in your environment)

Keep in mind that each of these disks should be a LUN on a SAN. If you are carving them up yourself, I highly recommend using RAID 1 sets for the transaction logs, SMTP, and MTA (if you use it heavily) and RAID 5 for the mailbox stores. Do not create physical disk resources that are partitions on the same physical drives. When it comes to disk sizing, I highly recommend reading Nicole Allen’s blog entry at http://blogs.technet.com/exchange/archive/2004/10/11/240868.aspx. She does a fantastic job of explaining how to size disks for Exchange. You can also see similar information on storage optimization at http://www.microsoft.com/technet/prodtechnol/exchange/2003/library/optimizestorage.mspx.


 


Q9. Why do you recommend MSCS for the mailbox servers but not for the OWA servers?


A9. The OWA (also known as the Front End or FE) servers do not have a requirement for shared disk storage. You can achieve server redundancy and horizontal scaling using NLB or hardware load balancers with multiple FE servers since there is no requirement for a database or information stores on an FE.


 


Q10. Windows Server 2003, Enterprise Edition, supports eight nodes in a cluster. Can I have eight virtual Exchange servers?


A10. While you can have up to eight nodes in a cluster, you can’t have that many Exchange Virtual Servers in a single cluster. Once you go to three or more nodes, Exchange forces you to have at least one passive node. So, for eight nodes, it is possible to have only up to seven active nodes and one passive node. There are a couple of concerns that you need to be aware of when creating larger than two node Exchange clusters.



  • If you have three or more nodes, each node can only host a single EVS. If, for example, you have three nodes with two active nodes and one passive node (Active/Active/Passive) and one of the active nodes fails, its EVS will fail over to the one passive node. If the other active node then failed as well, its EVS would not fail over to another node; it would just fail. While you can potentially have two EVSs on the same node in an Active/Active two node cluster, you can’t have two EVSs on the same node in larger clusters.
  • In a large cluster, it makes sense to have two or more passive nodes so that you can support more than one failure at a time.
  • My personal recommendation is to never go beyond 4 nodes (Active/Active/Active/Passive) as you will be fighting disk letter issues (think about 5 or more physical disk resources per EVS and then do the math; a quick example follows this list), and it would become very complicated to monitor and manage. With three EVSs, the number of disk drive letters gets to be pretty high and will make it difficult to add new physical disk resources and do things like disk migrations in the future. Yes, you can use mount points, but using disk letters makes it easier to manage.
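
To put rough numbers on the drive letter problem (illustrative only, using my own minimum of five disks per EVS):

    3 EVSs x 5 physical disk resources = 15 drive letters
    + 1 quorum disk (Q:)               = 16 drive letters
    + C: (and often a D:) on every node for the OS and binaries

With A: and B: effectively off limits, that does not leave many of the 26 letters free for growth or for the temporary letters you need during a disk migration.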

 


Q11. How many IP addresses do I need for an Exchange cluster?


A11. You need IP addresses for:



  • NodeA Public Interface
  • NodeA Private Interface (for heartbeat)
  • NodeB Public Interface
  • NodeB Private Interface (for heartbeat)
  • Default Cluster Group (IP is needed as dependency for network name resource)
  • Exchange Virtual Server (IP is needed as dependency for network name resource for your EVS). You will need one IP for each EVS in your cluster.
  • MSDTC cluster group (if you break it out into its own cluster group, it will need an IP resource, but this is not needed)

Remember a couple of important things regarding your heartbeat networks: set them to “Internal cluster communications only (private network)” and use non-routable IP addresses that do not exist anywhere else in your network or on the Internet. See the Heartbeat Network entry above for the full treatment, including turning off DNS registration and NetBIOS over TCP/IP on those connections.



Note: I will return and update this entry as I think of the more common questions that I get in my Exchange classes.