Optimal Management of Keepalives


Using my new optimized template, I see that there are 14 HTTP requests, spread across 4 different servers (2 of these requests go to Google for AdSense). What I've done to this point is a mild attempt at ordering the requests so that they occur linearly, in a sequence that optimizes page display characteristics. Because AdSense is integrated into the template, a full data display will not occur until after AdSense is loaded, which inherently means that anything nested in the head, save pre-fetch requests, will be fetched prior to full data display.

Ignoring TCP handshakes, HTTP overhead, and rendering time, I see the following results: total page download time, 14.28 seconds. Displaying some data: .82 seconds. Displaying all data on the page: 4.39 seconds. I also see that having an animated favicon, which is really only visible to Firefox users, adds close to 8K to the overall page view, and that file itself must be referenced in the HEAD, meaning it affects the time for All Data display. Removing that reference yields a 12.5 second total page load time and a 2.64 second All Data display time. Removing AdSense or placing it in an iframe would decrease the All Data display time to the Any Data level of .83 seconds, so having AdSense in the page incurs an implicit performance degradation for dialup users. Of course, if they have the show_ads.js file cached, and they probably do, the timing reduces to .83 seconds. It's not all Google: I have an approximately 5K image placed prior to the AdSense code that must be downloaded first. Taking that out of the equation, the cost of AdSense is really an additional .51 seconds in practical terms. Of course, this assumes that Google can deliver their files optimally, which is not necessarily always the case.

When I stop ignoring TCP handshakes and HTTP overhead, however, I find something a little bit unnerving. Assuming an average one-way latency of 75 milliseconds for a geographically distant visitor, about 3 seconds of overhead is added to the overall download time with keepalives turned off. Turning keepalives on could hurt the server's ability to handle high traffic loads, so in order to configure keepalives optimally, it's time to look at the number of requests going through each web server.
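The post does not spell out how the 3 second figure is derived. One back-of-the-envelope reading, offered here only as an assumption, is that each of the 14 requests pays roughly three extra one-way trips (for example SYN, SYN-ACK, and the final ACK) when every request opens a fresh connection, and that the requests happen strictly one after another:

```python
# Rough estimate of connection overhead with keepalives turned off.
# ASSUMPTION: three extra one-way trips per new connection and fully serial
# requests. The post itself only gives the 75 ms latency and the ~3 s result.

ONE_WAY_LATENCY_MS = 75          # from the post
REQUESTS = 14                    # from the post
EXTRA_TRIPS_PER_CONNECTION = 3   # hypothetical value chosen for illustration


def handshake_overhead_ms(requests, one_way_ms, trips_per_connection):
    """Total added latency if every request opens its own connection."""
    return requests * trips_per_connection * one_way_ms


overhead_ms = handshake_overhead_ms(
    REQUESTS, ONE_WAY_LATENCY_MS, EXTRA_TRIPS_PER_CONNECTION
)
print(f"Estimated overhead: {overhead_ms / 1000:.2f} s")  # Estimated overhead: 3.15 s
```

Parallel connections would hide part of this cost, so the number is best read as a ceiling rather than a measurement.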

The www server gets only a single request, so keepalives should be turned off there. The st2 server receives 9 requests for 18459 bytes of data. Considering linearization, I would set the keepalive timeout to the maximum expected page download time for 33.6K users, and the keepalive max to 9. Considering that browsers typically open two parallel connections per host, however, I would expect two concurrent connections serving 4 and 5 files each, so the simple optimization would be to set keepalive max to 5. There is further optimization to be made, though, in considering when and where files will be downloaded in the process of downloading the page. There are three images directly referenced in the main page, so I would expect two simultaneous connections for images, with a third request coming immediately after the first completed download. Further requests won't occur until the 29K javascript file is downloaded. From there, 1 additional request is made for a 2795 byte CSS file. Once the CSS file is parsed, 4 more requests are made, using two simultaneous connections. The downloads happen according to the following table:

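www        | google      | st1        | st2 (connection 1) | st2 (connection 2)
-----------+-------------+------------+--------------------+-------------------
index.html |             |            |                    |
           | show_ads.js | javascript | feature photo      | rss.gif
           | ads?...     |            |                    | search.gif
           |             |            |                    | kallestad-logo.gif
           |             |            | CSS File           |
           |             |            | frost_hz.jpg       | sdk-header.jpg
           |             |            | frost.jpg          |
           |             |            | border.jpg         |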

In a linear scenario, requests are prioritized as follows (file name, then size in bytes):

index-test.html     |  3727
stock-photo.jpg     |  5822
show_ads.js         |  2314
rss.gif             |   481
kallestad-logo.gif  |   507
search.gif          |   529
sdksite-rc-12.js    | 29252
ads?                |  2603
sdk-combined.css    |  2795
sdk-header.jpg      |  7061
frost_hz.jpg        |   321
frost-1-0-1         |   384
border-1-0-0        |   559
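
The sizes in this list can be cross-checked against the 18459 byte figure quoted for st2. The sketch below sums the nine st2 files and estimates an ideal transfer time at a nominal 33.6K line rate. The grouping follows the download table, matching "feature photo" to stock-photo.jpg is an assumption, and the transfer time ignores latency, headers, and modem compression:

```python
# File sizes in bytes, as listed in the post, for the files served from st2.
# ASSUMPTION: "feature photo" in the download table is stock-photo.jpg.
ST2_FILES = {
    "stock-photo.jpg": 5822,
    "rss.gif": 481,
    "kallestad-logo.gif": 507,
    "search.gif": 529,
    "sdk-combined.css": 2795,
    "sdk-header.jpg": 7061,
    "frost_hz.jpg": 321,
    "frost-1-0-1": 384,
    "border-1-0-0": 559,
}

MODEM_BITS_PER_SECOND = 33_600  # nominal 33.6K line rate (assumption)


def ideal_transfer_seconds(total_bytes, bits_per_second):
    """Payload transfer time with no latency or protocol overhead modeled."""
    return total_bytes * 8 / bits_per_second


st2_bytes = sum(ST2_FILES.values())
st2_seconds = ideal_transfer_seconds(st2_bytes, MODEM_BITS_PER_SECOND)

print(len(ST2_FILES), "files,", st2_bytes, "bytes")  # 9 files, 18459 bytes
print(f"Ideal transfer time: {st2_seconds:.1f} s")
```

A figure like this is only a lower bound for the "maximum expected page download time" mentioned earlier, since real dialup links rarely sustain their nominal rate.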

Looking at these two tables, a further server split would only be effective with keepalives turned completely off, and even then the benefit is minimal because the potential server-side savings amount to less than 1K in traffic. The discussion forum could benefit much more because of the large number of icon files dispersed throughout the application.

In determining the ideal keepalive time, you also have to consider the cost of overhead. An average 450 byte request header, multiplied by 14 requests, adds 6300 bytes to the mix, which potentially adds up to 2 seconds to the overall transaction in the dialup world. You could guesstimate the overall potential page size at a large number, like 85K, and push the keepalive timeout to 30 seconds, but that causes potential traffic issues during high load times. A better scenario is to take the timeout down to 5 seconds so that dialup users with good connections can take advantage of keepalive handshake minimization, while at the same time your server doesn't maintain large numbers of unused open TCP connections during peak load times. I know this conflicts a little bit with my earlier statements, but with a 30 second keepalive and the potential for 1000 users arriving in short succession at peak load, you would be maintaining literally thousands of never-to-be-used open connections for long periods of time.
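
The header arithmetic above is easy to reproduce. The sketch below uses the post's 450 byte average and 14 requests, and then converts the byte count to time at two nominal dialup rates. The line rates are assumptions picked for illustration, and response headers and latency are not modeled:

```python
AVG_REQUEST_HEADER_BYTES = 450  # from the post
REQUESTS = 14                   # from the post

# ASSUMPTION: nominal line rates in bits per second, not measured throughput.
LINE_RATES = {"33.6K": 33_600, "28.8K": 28_800}


def header_overhead_bytes(header_bytes, requests):
    """Total bytes spent on request headers for one page view."""
    return header_bytes * requests


def seconds_on_line(total_bytes, bits_per_second):
    """Time to push the given number of bytes over the line."""
    return total_bytes * 8 / bits_per_second


total = header_overhead_bytes(AVG_REQUEST_HEADER_BYTES, REQUESTS)
print(total, "bytes of request headers")  # 6300 bytes of request headers
for label, rate in LINE_RATES.items():
    print(f"{label}: {seconds_on_line(total, rate):.2f} s")
# 33.6K: 1.50 s
# 28.8K: 1.75 s
```

Both results sit under the "up to 2 seconds" estimate, with slower or noisier lines closing the gap.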

Actually, now that I think about it and look at the tables above, the bursts really happen in groups of three, and each burst of three should download within 1 second on even a pretty bad dialup connection. The keepalive setting should be a 1 second timeout with a maximum of 3 requests. The extra TCP handshake for larger files would be a negligible addition in traffic, dialup users in geographically distant locations benefit, and the server load does not increase dramatically during peak usage periods.
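
To get a feel for why the timeout matters so much at peak load, here is a rough upper-bound sketch comparing the 30, 5, and 1 second timeouts discussed in this post. The arrival rate and the two connections per visitor are hypothetical values, and the model assumes every connection sits idle for the full timeout after its last request:

```python
# ASSUMPTIONS (not from the post): arrival rate and connections per visitor.
VISITORS_PER_SECOND = 50       # hypothetical peak arrival rate
CONNECTIONS_PER_VISITOR = 2    # hypothetical parallel connections per host


def idle_connections(visitors_per_second, connections_per_visitor, timeout_s):
    """Steady-state upper bound on connections held open awaiting timeout."""
    return visitors_per_second * connections_per_visitor * timeout_s


for timeout_s in (30, 5, 1):
    held = idle_connections(
        VISITORS_PER_SECOND, CONNECTIONS_PER_VISITOR, timeout_s
    )
    print(f"timeout {timeout_s:>2} s -> up to {held} idle connections")
# timeout 30 s -> up to 3000 idle connections
# timeout  5 s -> up to 500 idle connections
# timeout  1 s -> up to 100 idle connections
```

The count scales linearly with the timeout, which is the intuition behind dropping it to the shortest value that still covers a burst.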

The two-second increase from HTTP headers is bothersome, but the only way to alleviate it is to reduce the number of HTTP requests. With proper cache control, consecutive page views should yield only a single request to my own servers, coupled with one or two requests to Google, and those page loads should complete within 1 second. I have been seeing IE make at least a conditional request (answered with a 304) for javascript files, but that is just a simple exchange that should complete within 150 milliseconds at the outside.

Sorry for the back and forth about keepalives... this is a blog, after all :)




03-01-2007, 03:05 PM  
Bjorn
 
 
What is right?

Keepalives and persistence are a confusing subject where performance is concerned. Standard advice is to turn keepalives off for high traffic sites. But keepalives are supposed to make things faster. I don't really understand it at all. There is a lot of conflicting information out there.

03-01-2007, 05:39 PM  
Dean
 
 
Re: Discussion: Optimal Management of Keepalives

The simple answer is to turn keepalives off. With keepalives on, slow clients can save themselves a few roundtrips to your server. If the timeouts are set too long, you may run into problems because of too many open connections.

