Connection balancing across NLB using IIS and MaxKeepAliveRequests
I have been doing a lot of work lately with Network Load Balancing (NLB), which is Microsoft's clustering solution, and Microsoft Internet Information Services (IIS).
We have written a video transcoding application which sits under a RESTful front end provided by IIS. The transcoding application is CPU bound; that is, the CPU is the first resource to bottleneck and prevent the machine from doing more work. The heavy CPU load is caused by the transcoding itself, which involves reading a unit of video from a video server, converting it to another format and squirting it out to a client. Transcoding video is a pipeline process, which means there are huge performance advantages in processing a series of consecutive video units in a read-ahead fashion.
A normal web server could handle 2 or 3 orders of magnitude more requests than ours. As a result we found that it was more difficult to load balance across an NLB cluster because the number of new incoming connections was relatively small.
The application suite has been designed to be stateless so that it fits a cluster architecture. We want to be able to scale outward easily: to support more clients, we just add more boxes.
Our experiments have shown that 1 PC can support about 10 simultaneous clients before the system’s performance degrades to unusable levels. For each new PC we add to the cluster, we can get another 8-10 clients.
We would like to keep each client talking to the same cluster node for a short period so that we get the benefit of pipelining requests, while at the same time making sure that clients can move between cluster nodes to keep the load evenly balanced across the cluster.
There are several configuration options across NLB and IIS, plus some custom code, that need to be combined in order to build a suitable solution.
Under IIS, HTTP KeepAlive allows a client to connect once, then make as many requests down the connection pipe as it likes before the client closes the connection. The server will hang on to each client until they go away. If KeepAlive is switched off, the connection is closed at the end of each request, which can add significant overhead when dealing with clients that are geographically distant. HTTP KeepAlive operates at the application layer (layer 7) of the OSI model.
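As an aside, switching keep-alive on or off in IIS7 is a single web.config setting; this httpProtocol fragment is standard IIS7 configuration, shown here only for reference:
<configuration>
  <system.webServer>
    <!-- allowKeepAlive defaults to true; false closes the connection after every response -->
    <httpProtocol allowKeepAlive="true"/>
  </system.webServer>
</configuration>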
NLB has a similar option called Affinity. The Affinity can be either sticky or non-sticky (there are other states but for the purposes of this article they can all be condensed into these two). Stickiness ensures that the same client is always directed to the same cluster node. NLB works on layer 4 of the OSI model.
The simplest solution is to switch NLB Affinity to non-sticky and set HTTP KeepAlive to false. Each incoming connection that arrives at the cluster is directed to whichever node NLB chooses; the client makes its request, gets the data, then everything is torn down and the process starts again for the next request. With this setup we cannot take any advantage of the pipelining efficiency that could be had, and as a result the platform will support fewer clients overall.
Each of these technologies has advantages and disadvantages. The advantage of using stickiness with NLB is that all requests from a client are directed to the same place for the lifetime of the client or of that cluster node. That is good for pipelining but bad for load balancing. The advantages and disadvantages of HTTP KeepAlive are similar, except that here you are at the mercy of whatever the client decides to do.
In experiments we have shown that if one of the nodes in the cluster goes down, NLB will notice and rebalance, diverting incoming traffic to the remaining nodes. The HTTP KeepAlive clients simply reconnect to whichever node they are next allocated and stay there for the rest of their lives. When the downed node comes back up, NLB rebalances again so that the distribution of new connections is correct, but it will not sever existing connections, so all the existing clients stay where they are. Only new incoming connections are allocated to the newly returned node. So what we find is that after a cluster node failure the remaining nodes take up the slack and end up working extra hard, but when the failed node re-enters the cluster it sits there doing nothing.
If you were dealing with thousands of small requests it would be a different story; it probably wouldn’t matter so much because new clients are coming and going all the time.
What we need is a combination of KeepAlive and not KeepAlive on a non-sticky platform. Apache has a configuration option called MaxKeepAliveRequests, which severs the connection to the client after that many requests (the default is 100). With this option we can have 100 consecutive requests over the same connection, enjoying the benefits of pipelining, while still giving the platform a chance to rebalance itself on a regular basis.
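For comparison, the relevant Apache directives in httpd.conf look like this (the values shown are the Apache defaults):
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 5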
IIS has no concept of limiting the number of requests a connection can service, which probably goes some way to explaining why IIS only has 15.73% of the web server market. I posted a question on ServerFault but didn't get a satisfactory response. The one reply I did get was from someone saying that if my application was truly stateless I should switch off KeepAlive altogether and take the penalty of reconnection. While the application is stateless, there are advantages to be had from batching requests together. An answer of "it can't be done" or "it is not supported" is, in my opinion, not an answer. What they actually mean is that it is not supported yet. In I.T. almost everything *is* possible as long as you know what to do.
IIS7 has a new pipeline module architecture that allows you to inject code into the processing of a request at any one of about a dozen different stages. The request passes through each registered module at each stage the module subscribes to, giving it the chance to modify the request or its response.
When the module is loaded, it reads the MaxKeepAliveRequests number from the web.config. For each request that comes in, the module remembers the remote host, the remote port and how many requests have been serviced by that combination. When the request is in its final pipeline stage, we check whether the number of serviced requests has reached MaxKeepAliveRequests. If it has, we inject a Connection: close header into the response. This makes its way through IIS, safely closing the connection on its way out.
Surprisingly, there was a great deal of confusion in the MSDN documentation, blogs and forums about how to force a close after a request. I found that HttpResponse.Close() can chop the end off the reply, and HttpApplication.CompleteRequest() didn't work because the request was already inside the EndRequest stage of the pipeline. So I went back to the specification, and RFC 2616 (Section 8 - Connections) talks about injecting Connection: close into the response header so that, after the response is sent, the server closes the connection. The closure forces the client to reconnect. I tried this using a telnet client (rather than a web browser) and can confirm that it is the server that closes the connection, not the client deciding to.
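To make that concrete, here is roughly what the telnet exchange looks like once the limit is reached (the host, path and sizes are invented for illustration):
GET /video/unit/42 HTTP/1.1
Host: transcoder.example.com

HTTP/1.1 200 OK
Content-Type: video/mp4
Content-Length: 524288
Connection: close

The body follows, and then the server, not the client, drops the TCP connection.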
I had thought about using the Session to store the request count, but I didn't think it would help. If a proxy server is talking to your cluster, it may be interleaving requests from several sources, each with a different session identifier, down the same connection. We are interested in the transport layer, not the session layer, so we must use values from the transport layer to differentiate the clients in order to spread the load.
Simply compile this C# and add it to your IIS integrated pipeline.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Diagnostics;
using System.Collections.Specialized;

namespace WebApplication1
{
    public class MaxKeepAliveRequestsModule : IHttpModule
    {
        // IIS pools several HttpApplication instances, each with its own copy of
        // this module, so the per-connection counts must be shared and locked.
        static readonly Dictionary<string, KeepAliveClient> record =
            new Dictionary<string, KeepAliveClient>();
        static readonly object recordLock = new object();

        int maxRequests = 0;

        public MaxKeepAliveRequestsModule()
        {
            Debug.WriteLine("Debug : MaxKeepAliveRequestsModule.con");
        }

        public int MaxRequests
        {
            get { return maxRequests; }
            set { maxRequests = value; }
        }

        public void Init(HttpApplication context)
        {
            Debug.WriteLine("Debug : creating MaxKeepAliveRequestsModule");
            string mrStr = System.Web.Configuration.WebConfigurationManager.AppSettings["MaxKeepAliveRequests"];
            maxRequests = validateMaxKeepAliveRequestsValue(mrStr);
            context.EndRequest += new EventHandler(OnEndRequest);
        }

        private int validateMaxKeepAliveRequestsValue(string val)
        {
            if (String.IsNullOrEmpty(val))
                throw new ArgumentException("appSettings.MaxKeepAliveRequests is empty");
            int mr = Convert.ToInt32(val);
            if (mr < 1)
                throw new ArgumentException("appSettings.MaxKeepAliveRequests must be greater than zero: " + mr);
            return mr;
        }

        public void Dispose()
        {
            Debug.WriteLine("Debug : MaxKeepAliveRequestsModule.Dispose");
        }

        public void OnEndRequest(Object source, EventArgs e)
        {
            HttpApplication app = (HttpApplication)source;
            HttpRequest request = app.Context.Request;
            HttpResponse response = app.Context.Response;

            // Tried to use the socket as the key, but we don't have access to it from
            // here, so identify the TCP connection by client address and ephemeral port.
            // (REMOTE_HOST falls back to the IP address when reverse DNS is disabled.)
            NameValueCollection serverVariables = request.ServerVariables;
            string k = serverVariables["REMOTE_HOST"] + ":" + serverVariables["REMOTE_PORT"];

            lock (recordLock)
            {
                KeepAliveClient c;
                if (record.TryGetValue(k, out c))
                {
                    c.Touch(); // this is request number c.Hits on this connection
                }
                else
                {
                    cleanOldKeepAliveRecords();
                    c = new KeepAliveClient(k);
                    record.Add(k, c);
                }

                if (c.Hits >= maxRequests)
                {
                    Debug.WriteLine("Debug : max requests reached for " + k + " (" + c.Hits + "), force close connection to client");
                    // response.Close();        // works, but may chop the end off the response
                    // app.CompleteRequest();   // doesn't work: we are already in EndRequest
                    // Instead, ask the client to reconnect, per RFC 2616 section 8:
                    // http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html
                    // (Writing to response.Headers requires the IIS7 integrated pipeline.)
                    response.Headers["Connection"] = "close";
                    record.Remove(k);
                }
            }
        }

        // Drop records for connections not seen for a while, so the dictionary
        // does not grow without bound. Callers must hold recordLock.
        private void cleanOldKeepAliveRecords()
        {
            foreach (KeepAliveClient cc in record.Values.ToList())
            {
                if (cc.IsExpired())
                {
                    Debug.WriteLine("Debug : cleanOldKeepAliveRecords: key=" + cc.Key);
                    record.Remove(cc.Key);
                }
            }
        }
    }

    class KeepAliveClient
    {
        private static readonly TimeSpan TIMEOUT = new TimeSpan(1, 0, 0); // one hour

        private DateTime lastSeen;
        private int hits;
        private readonly string key;

        public KeepAliveClient(string key)
        {
            this.key = key;
            lastSeen = DateTime.Now;
            hits = 1;
        }

        public int Hits
        {
            get { return hits; }
        }

        public string Key
        {
            get { return key; }
        }

        public void Touch()
        {
            hits++;
            lastSeen = DateTime.Now;
        }

        public bool IsExpired()
        {
            return lastSeen + TIMEOUT < DateTime.Now;
        }
    }
}
You'll need to add the configuration option to the web.config:
<configuration>
  <appSettings>
    <add key="MaxKeepAliveRequests" value="100"/>
  </appSettings>
</configuration>
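You'll also need to register the module with the IIS7 integrated pipeline. Assuming the class lives in the WebApplication1 namespace as in the listing above, the registration looks something like this:
<configuration>
  <system.webServer>
    <modules>
      <add name="MaxKeepAliveRequestsModule"
           type="WebApplication1.MaxKeepAliveRequestsModule"/>
    </modules>
  </system.webServer>
</configuration>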