Maximum size of an ISA Server NLB Cluster


This is a long and technical post about why Microsoft recommends that there be no more than 8 nodes in an ISA NLB cluster. There are some nice pictures of graphs and a summary at the bottom if you want to skim through it :)

8, 12 or 32 nodes?
There is some confusion as to how many nodes can be in an ISA NLB cluster. Digging around on the internet doesn't really solve the problem:

"... However, it requires some CPU processing overhead (about 10 to 15 percent for common ISA Server scenarios), and has a limit to the number of members in the cluster. (About 12 machines is the recommended maximum.) ..."
(http://www.microsoft.com/technet/archive/security/prodtech/isa/isaprfbp.mspx?mfr=true)

"... However, it requires CPU processing overhead (approximately 10 to 15 percent for common ISA Server scenarios), and has a limit to the number of members in the cluster (approximately 8 computers as the recommended maximum). ..."
(http://www.microsoft.com/technet/isa/2006/perf_bp.mspx)

"... Cluster size, defined as the number of cluster hosts participating in the cluster (up to 32), is based on the number of computers required to meet the anticipated client load for a given application. ..."
(http://technet2.microsoft.com/windowsserver/en/library/750d3a40-af67-411d-828b-fc1f718a06fb1033.mspx?mfr=true)

"... Network Load Balancing clusters provide scalability and high availability for TCP- and UDP-based services and applications by combining up to 32 servers running Windows Server 2003 ..."
(http://technet2.microsoft.com/windowsserver/en/library/358b9815-3cd3-4912-a75a-cae85ea8d5ab1033.mspx?mfr=true)

It is a common belief (myth?) that ISA performance will degrade drastically when more than 8 nodes are in a single cluster.

But why is this important?

Our university has about 7000 employees who will all be using Microsoft Exchange within a year (...). Using the ISA capacity planner, we calculated that we would need close to 8 ISA servers in a cluster to accommodate this number of users. And that's where the Microsoft recommendation comes into play.

Because 8 ISA servers in an NLB cluster is supposedly bad for performance, 2 solutions were offered. First, we could place a hardware load balancer like an F5 or Alteon, which (again supposedly) costs a fortune. Second, we could increase the complexity of our setup and pull some traffic away from the ISA cluster by allowing direct Outlook Anywhere connections from clients that are inside the domain, using a combination of finger crossing and something we have affectionately called "Silly DNS".

Of course, I prefer neither solution, because I believe an ISA NLB cluster can work perfectly well with more than 8 nodes. I have spent the better part of today trying to dig up information on this circus and will now attempt to "prove" that ISA can do much better than is generally believed. Of course, the best test would be a setup with 32 physical ISA servers in a single cluster. But I don't own that much hardware, not to mention the cost of the electric bill...

Alright, let's first look at how NLB works and why I doubt it would have a performance impact on ISA.

How NLB works
NLB is short for Network Load Balancing. It has 2 variants: unicast and multicast. I will only consider unicast, because it is the simpler of the two.

When N servers are in a unicast NLB cluster, they all listen for the same IP address and MAC address on the same network. So when a router transmits something to that IP address, all N cluster nodes receive it. Now instead of all of the nodes handling the incoming packet in parallel, only 1 should do so. A single node is "selected" to handle the packet and all other nodes drop the received packet.

So that's basically how it works: N nodes all receive the same traffic, but only 1 of them actually handles a specific incoming packet. Instead of a central instance telling a node that it should handle the packet, each node decides for itself using a fully distributed hashing algorithm. The way I understand it, each node has a unique ID (from 0 to N-1). When a packet comes in, all nodes perform the same hashing calculation (something like a modulo N) on a combination of source IP and source port (depending on the affinity setting). If the result equals a node's ID, that node handles the packet.
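To make that concrete, here is a minimal sketch of the packet-ownership idea in Python. To be clear, this is my mental model and not Microsoft's actual hash: the modulo-N scheme, the use of CRC32 and the function name are all illustrative assumptions.

```python
import zlib

def owns_packet(my_id, n_nodes, src_ip, src_port, single_affinity=True):
    """Return True if this node should handle the incoming packet.

    With 'single' affinity only the source IP is hashed, so all
    connections from one client stick to one node; without it, the
    source port is mixed in as well.
    """
    key = src_ip if single_affinity else f"{src_ip}:{src_port}"
    # CRC32 is deterministic, so every node independently computes
    # the same value and exactly one of them claims the packet.
    return zlib.crc32(key.encode()) % n_nodes == my_id

# All 4 nodes run the same check on the same packet; exactly one
# prints True, the other 3 silently drop it.
for node_id in range(4):
    print(node_id, owns_packet(node_id, 4, "10.0.0.42", 51234))
```

The nice part of such a design is that no messages need to be exchanged per packet: as long as every node agrees on N and on its own ID, the decision is purely local.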

Such an algorithm would work great if the number of nodes were fixed and none of them ever failed. But this is planet Earth and failure is always a possibility.

When a node fails, is removed from, or is added to the cluster, a process called NLB cluster convergence is started. Conceptually, this means that all participating nodes yell "I'm alive" on the network, everyone counts the number of active nodes, and each node "gets assigned" a unique ID number. When that's done, the cluster is said to have converged and can go on handling traffic.
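Purely conceptually, convergence could look like the sketch below: every node derives its ID from its position in an agreed-upon ordering of the live hosts, so the IDs stay unique without any central coordinator. Ordering by IP address is my own assumption for the illustration.

```python
def converge(alive_hosts):
    """Conceptual convergence: every node sees the same membership
    list, sorts it the same way, and takes its position as its ID."""
    return {host: idx for idx, host in enumerate(sorted(alive_hosts))}

# After a node dies or joins, the survivors re-converge on IDs 0..N-1:
print(converge(["10.0.1.3", "10.0.1.1", "10.0.1.2"]))
# -> {'10.0.1.1': 0, '10.0.1.2': 1, '10.0.1.3': 2}
```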

To detect that a node has failed, "alive" messages are sent over the network at a fixed interval. If a node fails to send an alive message for a certain amount of time, a convergence process is started.

There is some information available about the NLB registry settings, two of which are AliveMsgPeriod and AliveMsgTolerance.

The default settings are 1000 (1 second) for AliveMsgPeriod and 5 for AliveMsgTolerance. This means that all nodes broadcast alive messages every second, and a node is considered dead after having failed to transmit 5 such messages (so 5 seconds).
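A tiny sketch of that detection logic, using the default values from above (the bookkeeping with a dictionary of timestamps is just my illustration of the idea):

```python
# Defaults of the registry settings mentioned above.
ALIVE_MSG_PERIOD_MS = 1000   # AliveMsgPeriod: broadcast every second
ALIVE_MSG_TOLERANCE = 5      # AliveMsgTolerance: 5 missed messages

# A node counts as dead after this long without a heartbeat:
TIMEOUT_S = ALIVE_MSG_PERIOD_MS * ALIVE_MSG_TOLERANCE / 1000  # 5.0 s

def dead_peers(last_seen, now):
    """Peers whose last alive message is older than the tolerance
    window. last_seen maps node -> timestamp of its last heartbeat;
    any hit here would trigger a new convergence."""
    return [n for n, t in last_seen.items() if now - t > TIMEOUT_S]

print(dead_peers({"node1": 100.0, "node2": 96.0}, now=102.0))  # ['node2']
```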

More information needed? See How Network Load Balancing Technology Works.

Why NLB shouldn't affect ISA performance

Let's analyze this algorithm and see if it can affect ISA performance in any significant way.

I'm assuming a setup where each ISA server has a separate network card for sending alive messages (hereafter called the heartbeat network). This is exactly the setup we use: 1 External, 1 Internal and 1 HB network interface.

Furthermore, I assume that all interfaces are gigabit and connected to separate switches (1 for External, 1 for Internal and 1 for HB).

Right, let's first go for default behaviour.

If there are 10 NLB nodes, all 10 will receive the same data. If that saturates the link attached to the External NIC, then the total throughput to the ISA cluster is 1 gigabit, independent of the number of NLB nodes in the cluster (assuming the switch has a good enough switching fabric).

Similarly, on the Internal side, all links can be saturated at once. The bottleneck will then be either the switch or the backend, but not the ISA cluster.

Now for the special behaviour: the alive messages.

Every second, every node sends a broadcast message to indicate that it is still alive.
For some odd reason, these messages are all 1510 bytes big, no matter what the content is. Sending them to the HB switch can happen all at once, but the switch then needs to serialize all those messages to broadcast them: every second, the HB switch has N messages of 1510 bytes each to send over every link.
In case of 32 NLB nodes, that is about 48 KB/s. Clearly this is well within the capability of a gigabit link. In fact, to fully saturate a gigabit link with alive messages, there would need to be more than 80,000 nodes in the NLB cluster (1 Gbit/s divided by 1510-byte messages works out to roughly 82,000 messages per second).
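The arithmetic behind those two numbers, as a quick sanity check:

```python
MSG_BYTES = 1510                 # observed alive-message size
LINK_BPS = 1_000_000_000         # gigabit heartbeat link

# Each node broadcasts once per second and the switch repeats every
# broadcast onto every link, so each link carries N messages/second.
print(32 * MSG_BYTES)                # 48320 bytes/s, about 48 KB/s
print(LINK_BPS // (MSG_BYTES * 8))   # 82781 nodes to saturate a link
```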

So, which part of this entire construction is supposed to have an impact on ISA performance? Whether there is 1 node or 500 nodes in the cluster, conceptually it would make no real difference.

I forgot to mention 2 things: NLB clusters are limited to 32 hosts for some reason, and using NLB carries a 15% performance penalty on a node, even if it is the only node in the cluster. I'm stunned :) I really don't understand how this can happen.

But then, I don't know if this is exactly how NLB operates. It's just the way I would have designed it.

What does the capacity planner say?

Microsoft released some performance best practices that were turned into a capacity planner by Thomas Shinder. The bad news is that the capacity planner there is a Flash program. The good news is that there is also a spreadsheet version of the same program.

This Flash version of the capacity planner was used to determine that we will need close to 8 servers if all traffic goes through the ISA cluster. I don't doubt that.

What I was curious about is how this tool calculates its result. The page with the performance best practices is a good and interesting read. The most relevant part of the data can be found at the bottom of that page:



You'll notice that the capacity planner can't handle more than 8 servers, simply because Microsoft did not specify data for more than 8 NLB nodes in this document. This leads me to believe that it is a myth that performance degrades drastically beyond 8 nodes: the recommendation seems to mirror where the published data stops, not a measured performance cliff.

Where did Microsoft come up with these numbers? I don't know. Let's have a look:


The good observer will notice that these numbers bend slightly downwards, like a logarithmic curve. Using a nonlinear regression tool, I calculated that the scalefactors can be approximated as:

scalefactor = n^0.1927


where n is the number of nodes.
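If you want to play with the curve yourself, the sketch below reproduces the scalefactors and extrapolates them to 32 nodes. Two caveats: extending the exponent beyond 8 nodes is exactly the extrapolation I am arguing for (the published data stops at 8), and my reading of the scalefactor as a divisor on each node's effective CPU speed is an interpretation of the spreadsheet, not something stated explicitly.

```python
def scalefactor(n):
    # Fitted on the published 1..8 node data; extrapolated beyond 8.
    return n ** 0.1927

for n in (1, 2, 4, 8, 16, 32):
    sf = scalefactor(n)
    per_node = 1 / sf   # relative effective speed of each node
    cluster = n / sf    # relative capacity of the whole cluster
    print(f"{n:2d} nodes: scalefactor {sf:.2f}, "
          f"per node {per_node:.2f}, cluster {cluster:.2f}")
```

Note that the per-node speed falls smoothly over the whole range; nothing special happens at 8 nodes.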

This results in the following data:


I then adapted the capacity planner spreadsheet and added those scalefactors. You can download it here:

Download http://data.singularity.be/Trashcan/ISA_Array_Sizing_Spreadsheet_on_steroids.xls


The spreadsheet shows that the effective speed of each CPU of each node keeps decreasing as the cluster grows (which was to be expected) and that there is no special "butter zone" around 8 nodes per cluster.

Summary
Microsoft recommends that there be no more than 8 nodes per ISA NLB cluster, while an NLB cluster can technically contain up to 32 nodes. I showed that the NLB algorithm itself does not limit the number of nodes in the cluster. Furthermore, I believe that the numbers from the performance best practices can be extrapolated up to 32 nodes, and I updated the capacity planner spreadsheet accordingly.

Maybe there is a good reason for keeping the size of an NLB cluster limited to 8, but I have not found it. If Microsoft's recommendation of a maximum of 8 nodes is all that stands between us and saving several tens of thousands of euros, or a simpler setup, then I will gladly dismiss it.
If there is a good reason, however, I want to hear it. I'll see if I can get hold of someone with a good answer.