Measuring and Tuning Apache

ABSTRACT

Ever wondered what your web server is up to? If you've ever had to mind a large e-commerce site, or even a smaller site where real money rides on your administration skills, then you've wondered how well or poorly your servers are performing. You may have sat in the days before Christmas, as the load from marauding shoppers sweeps over your site, wondering how the load on the server compares to that of the day before. Will you have enough capacity, or will things go wrong? Is there any tuning you can do to stay ahead of the load?


1. Introduction

Apache is one of the success stories of the modern internet. At the time of writing, the monthly Netcraft survey of the Web suggests that some 60-70% of all web sites are hosted on Apache web servers, most of those on open source platforms of one sort or another.

Most of these web sites are of course not high volume, but some are. It is for the occasions when you find yourself administering such a site that this paper is written.

This paper divides into a few basic sections. The first is a discussion of how to tune Apache for speed. Like most modern software, Apache has lots of options that allow it to be deployed in many different ways. But it's worth noting that not all of the available options make sense in all situations. Some are merely poor practice whilst others are downright bad practice.

The next section touches on the things we do in our web content that may or may not be wise. This includes static content, poorly written dynamic content and indeed options that may affect the delivery speed of either.

Finally we'll talk about measuring Apache performance. This is an area that frankly isn't talked about a great deal but should be. Notably, most of the tools for measuring Apache performance are offline tools, with little to nothing available to provide real-time information. In an attempt to address this lack, I also present a simple Java applet that offers some insight into what your web server is up to.

Before we move on it may be worth briefly offering some advice on how not to read this paper, or perhaps how not to act on it. In particular, my advice is to read it all first, then perhaps install the Apache HitMeter applet, and then, before doing anything else, establish a baseline for how fast or slow your web server runs as it is. This will probably also involve some log post-processing to determine what the web server's typical workload looks like, and will if nothing else allow you to judge whether you're getting any benefit from the tuning you're attempting. It may also allow you to answer other questions, such as how much headroom there is in your server's capacity at its current load, and when you need to think about upgrades.

2. Tuning Apache

As we said up front, Apache, like much modern software, has lots of knobs you can twiddle. Some can be beneficial in all environments, some in most, and for some, whether or not they offer an advantage depends on what you are doing. Let's look at some of the things you can consider.

2.1 Apache Modules

Apache is a modular web server these days. At its base, it provides a framework for web serving functionality, and various extra features are then attached to that base in a modular manner. Whilst a lot of software uses metaphors like this (it is good software engineering practice after all), in the case of modern versions of Apache it's also a physical reality of the way the software is structured. The extra functions are physically separate code modules which can be completely de-configured from the server if their function is not required.

This then is the first place to look for performance gains. There are lots of Apache modules available. Certainly you will not have a use for all of them, even if we're just talking about the base set of modules that come in the base server distribution. We can say this with certainty as some of them have overlapping function, and it's a certainty that you don't need two, three or four ways of doing the same thing (with the inevitable opportunity for misconfiguration to result in behaviour you don't expect in some circumstances).

As it happens, each module you add to Apache will also cost you a slight performance penalty. So it makes good sense to evaluate those modules with a much more jaundiced eye and weigh the benefits of possibly occasional use against the very real costs of their presence even when they're not used. The other obvious penalty for having modules in Apache is that they cost you memory. The more modules you add, the bigger the server binary image is when it's sitting in memory and the more main memory it will consume. This is true of both the code of the server (which is likely shared across all the server instances) and the dynamically allocated data the server uses. Now consider that a busy server can easily have hundreds of Apache processes and the problem is exacerbated.

It's also worth noting in passing that it's good security practice to eliminate unused modules from your server, as that way you will avoid being compromised by any exploits which are built around functions of those modules. (i.e. if it's not in the server, an attacker can't exploit bugs in it...)

In older Apache installations, the module mix of the server was determined by the builder of the server software at compile time and was hard to change. Typically, some form of requirements analysis drove the compile time decision making process to include or exclude various modules. In more recent times, the dynamic linking/loading facilities of Apache have allowed modules to be dynamically loaded or excluded at run time from the httpd.conf file. This can take much of the perspiration out of the task of choosing what is and isn't needed in the server, allowing new modules to be added in as a requirement for them is identified and perhaps more significantly allowing modules for which the requirement has passed to be de-configured.
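As a hedged sketch of what this looks like in a 1.3-era httpd.conf using the dynamic shared object support (the module names and library paths are illustrative and depend entirely on your build and platform):

    # load only the modules this server actually needs
    LoadModule status_module  libexec/mod_status.so
    LoadModule rewrite_module libexec/mod_rewrite.so

    # reset the built-in module list and activate just what is wanted;
    # a real configuration must AddModule every active module, including
    # the statically compiled ones
    ClearModuleList
    AddModule mod_so.c
    AddModule mod_status.c
    AddModule mod_rewrite.c

Commenting out a LoadModule/AddModule pair and restarting is then all it takes to de-configure a module whose requirement has passed.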

2.2 Customized Web Servers

One of the outcomes of this process of course is that some combinations of modules actively undermine each other's performance profiles. Think here of the difference between a server which has been optimized for speed and, say, a server carrying the mod_perl module to support dynamic content. One is lean and mean, the other carries a single additional module with a 5MB to 10MB memory footprint.

The solution may be to run two or more quite different instances of Apache with each tuned to some combination of requirements and resource use. Good examples of this might be:

  • a web server with no CGI support and only static files for fast image serving
  • a web server with only servlet support
  • a web server with SSL support
  • a web server with mod_perl or mod_php for dynamic pages

Indeed the top bullet point there is a favourite of sites that serve a lot of graphics-heavy web pages. (Yes, they often have a lot of flesh tones in their imagery...) In some instances, people run completely different web server software in some roles, specifically because of a perception that it is better suited to those roles. To my mind, with proper tuning and on capable hardware, Apache can achieve maximal performance for you, and there is often no gain and considerable disadvantage in complicating your life with two different products (and two sets of bugs, and two sets of testing, etc...). For anecdotal evidence, entry level Sun UltraSPARC II based systems can easily saturate 100Mb/s Ethernet with Apache and unencrypted content. (More on this later...)

It also bears noting that multiple web server instances can be configured on a single system so long as different IP addresses or ports can be used. Or they may be spread across different host systems to isolate them. The latter allows the overall system loads of the different tasks to be more easily determined and tracked.
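A minimal sketch of the single-host approach might consist of two stripped-down configuration files, each started as its own instance (all paths, addresses and file names here are hypothetical):

    # httpd-images.conf -- a lean instance for static image serving
    Listen       192.168.1.10:80
    PidFile      /var/run/httpd-images.pid
    DocumentRoot /www/images

    # httpd-dynamic.conf -- a second instance carrying mod_perl on another port
    Listen       192.168.1.10:8080
    PidFile      /var/run/httpd-dynamic.pid
    DocumentRoot /www/app

    # each instance is started against its own configuration file:
    #   httpd -f /usr/local/apache/conf/httpd-images.conf
    #   httpd -f /usr/local/apache/conf/httpd-dynamic.conf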

2.3 Host Name Lookups

One of the earliest performance problems people identified in Apache was in the way it logs accesses. Typically, each access to the web server, that is, each HTTP GET/POST/HEAD request, is recorded as a single line of one or more log files. Each line identifies who the request came from as well as other data of interest, such as when the access was made, what was accessed, the success or failure of the access, how many bytes were transferred and possibly more...

The identity of the requester was written as the name of the system which made the request. To do this the server had to do a reverse name lookup of the IP address of the network connection the request arrived on. This reverse lookup takes time and the server process/thread stalls while it completes.

To alleviate this problem, a directive, HostnameLookups, was added to the httpd.conf file allowing this behaviour to be turned off and on. Typically, on many intranets, where reverse lookups can be reasonably quick and loads are lighter, it is left enabled, and it is disabled for servers attached to the wider internet where this may not be true. When lookups are disabled, the same log information is still written, except that the IP address appears in place of the symbolic name it would have been resolved to. This can result in a significant improvement in the response times of Apache servers and the expected increase in total throughput this leads to.
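The change is a single line in httpd.conf:

    # don't stall each request on a reverse DNS lookup; log the raw IP address
    HostnameLookups Off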

To enable log files to be examined by humans, a program called logresolve is distributed as part of the Apache tools, which can post-process log files to replace the IP addresses with the more human friendly names that come out of the reverse DNS lookup. As this process is performed offline from the process of serving pages it does not affect the server's performance. Failing this, most log analysis packages will now also attempt to perform the IP address to name lookups if required.
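logresolve works as a simple filter over the standard log formats, so an offline run might look like this (the log file paths are illustrative):

    logresolve < /var/log/httpd/access_log > /var/log/httpd/access_log.resolved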

Finally on the subject of name resolution, we should briefly note that some configuration directives can also accept names rather than IP addresses as part of their syntax. Most of these are resolved once at start up and have little impact on the run time performance of the web server as a whole. Some, though, can cause greater penalties than might be immediately obvious. Of note here is the "Allow from xxx" and "Deny from xxx" syntax which can be used in Directory and Location contexts amongst others. Given a line like Allow from metva.com for instance, every access to the resources thus protected will require a reverse DNS lookup to see whether the client is in the allowed domain. In fact, the documentation tells us that a forward lookup of the resolved name is also performed as a double check. The use of IP addresses or ranges will not incur this performance penalty and may thus be preferred.
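For example, an address-based rule such as the following (the directory and network shown are hypothetical) avoids the per-request DNS traffic entirely:

    <Directory "/www/intranet">
        Order deny,allow
        Deny from all
        # a network/netmask range needs no DNS lookups at request time
        Allow from 10.1.0.0/255.255.0.0
    </Directory>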

2.4 Server Pool Management

(It should be noted that this paper deals for the most part with the 1.3.x stream of Apache releases. One of the notable differences in the newer 2.x server stream is the use of lightweight threads, which will likely invalidate much of the following discussion.)

Conventional server systems, such as those managed by the inetd process, run by waiting on an open socket and, when a client connects, forking a new child to handle the request. By its nature this means that the request must wait while a new process is created, and then for the new process to run, handle the request and answer. Often, the child process then exits, returning its resources to the system.

The problem with this approach is precisely the delays associated with starting a new process and the effect this has on the throughput the client process can achieve. The common solution is to pre-fork the server child process(es) before the requests arrive, and then hand off each request to an idle child process. That is, a pool of servers is maintained with a master to coordinate the processing of requests by the children.

Since its earliest versions, Apache has had the ability to manage such a pool of server processes. Rather than manage a fixed size pool of servers, Apache can maintain a variably sized pool where more servers are started as cumulative load increases and excess idle servers are shut down as this load decreases.

In fact, Apache performs this dynamic sizing of its server pool fairly well without a great need for tuning. There are some controls which can be tuned for sites/servers which are either exceptionally busy or exceptionally idle, but other than these degenerate cases not much need be done.

The most important controls are the MaxClients configuration and the depth of the TCP listen queue. These affect how much load an Apache instance can deal with. MaxClients limits how large Apache can grow the pre-forked server pool, and together with the TCP listen queue depth this controls how many incoming connections the web server can accommodate before failing completely. Typically, settings are chosen to reflect the maximal capacity of the server host rather than sizing things according to expected load or other such, likely poor, guesses.
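A hedged sketch of the relevant httpd.conf settings follows; the figures are purely illustrative and should be sized to the host (ListenBacklog sets the TCP listen queue depth Apache requests, subject to operating system limits):

    # size the pool to the host's capacity, not to guesses about offered load
    StartServers       10
    MinSpareServers     5
    MaxSpareServers    20
    MaxClients        256
    ListenBacklog     511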

2.5 Content Negotiation

One of the abilities Apache has picked up is an ability to negotiate content types with a client browser. Most typically, this is used to select between equivalent sets of related content and most often languages are the offerings that are selected between. (i.e. view a website in English or German or Japanese or whatever).

The problem of course is that negotiation costs time and this translates to an overall performance hit. Not only does negotiation cost time, but the server itself has to read the filesystem, often scan directories, on each access merely to determine which content types are available to offer.

To achieve higher performance, then, the solution sadly is to forgo the content negotiation abilities of the server and revert to offering single content websites. If multiple content types are catered to (and clearly this is a good thing), then it may be better to check once in a server side script and use redirects to send the user to a single content type site, or indeed to allow the user to choose explicitly between content types themselves. This latter must usually remain an option anyway, as few users set their browsers up to choose between multiple content types correctly, and users may choose to sample other content types when they feel a need (e.g. to try and glean meaning when faced with a poor translation...).

2.6 Page Access

It's worth briefly noting that Apache has some options which, although very useful, result in extra processing for each page hit and may need to be disabled for high performance servers.

Firstly, the .htaccess file allows the content provider(s) to override the access privileges of the content under each directory tree. This behaviour is enabled by the AllowOverride directive. The difficulty though is that to process a request the server must now check for a .htaccess file (or whatever name has been configured) in each directory on the path to the page to be served and potentially process the contents of the file. As this processing is repeated essentially for each request, the overheads are quite high.

A similar issue exists with symbolic links. Apache may be configured either to follow symbolic links or not to. The latter is of course common advice for best security practice. The problem though is that in order to preclude symlinks the server must once again check each element of the path it is traversing to ensure that all are regular directories or a file (for the final element). Once again, a high per request overhead. As ever, we must choose between more security and more performance.
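A hedged sketch of the high-performance (and less cautious) end of both of these trade-offs, for a hypothetical document root:

    <Directory "/www/htdocs">
        # don't search every directory on the request path for .htaccess files
        AllowOverride None
        # follow symlinks without checking each path component (faster, less secure)
        Options FollowSymLinks
    </Directory>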

2.7 Caching Pages

Newer releases of Apache offer an ability to cache specified content pages or components. To do this the MMapFile directive may be used to specify page elements which the server is to cache in memory. This content is then available for immediate use rather than needing to be fetched from the filesystem first.

To make use of this facility the mod_mmap_static Apache module must be loaded, and then the MMapFile directive is available in the configuration file. Typical use is MMapFile <path> with a complete path to the file to be cached. Note though that if the file is updated, the server must be restarted to refresh the in-memory copy.
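Configuration might look something like this (mod_mmap_static is flagged as experimental in the 1.3 stream, and the module path and file names here are illustrative):

    LoadModule mmap_static_module libexec/mod_mmap_static.so
    AddModule  mod_mmap_static.c

    # map frequently requested, rarely changing elements straight into memory
    MMapFile /www/htdocs/images/logo.gif
    MMapFile /www/htdocs/common/navbar.html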

To use this optimization well, the administrator must have some knowledge of the content that is being served. Typically most log analysis packages can give you a reasonable idea of which pages and page elements are accessed the most, which are those for which the most benefit can be obtained through caching. A good first approximation is to look for page elements that are common to many areas of your content such as organizational logos or navigation bars or parts thereof.

2.8 More Hardware?

A final and perhaps most obvious way of speeding up your web server is of course to buy a more capable server host system or to upgrade one or more portions of the host your server is currently using. Common solutions here are faster machines (e.g. higher clock rates), bigger machines (e.g. more CPUs), more machines (e.g. clusters) and off-board extras such as load balancers, crypto accelerators and reverse proxies.

The biggest caveat here is that as the number of systems increases the administrative load also rises. In fact, there can be real issues around the high workload involved in keeping multiple systems identically configured and the content on them synchronized. Once the number of systems is more than a couple, the implementation of automation to assist with this is highly desirable.

Not all of this needs to be bad news though. Clustered servers with redundant load balancing hardware can offer good fault resilience and may form the core of a high availability server farm.

2.9 Newer Software

As noted earlier, this paper deals mostly with the 1.3.x release stream of Apache. There is good reason to expect that considerable performance gains will be realized by the change to the lightweight thread model that the Apache 2.x stream offers. In fact, there may well be other performance benefits which result from the re-engineering efforts being undertaken by the server development group.

The Apache Group now advises, I believe, that the 2.x stream is ready for use in production environments. The sticking point, as is so often the case, may well be 'other' software you are relying on, which must work in conjunction with Apache and which may not yet be aware of the 2.x stream servers. Examples here would be special purpose Apache modules, databases, servlet environments or similar.

3. Tuning our content

The next most obvious place to look for the performance of web based content is the content itself. Note that most often this is a case of not doing dumb things which adversely affect performance.

3.1 Static vs. Dynamic Pages

The first thing to consider is the difference between static and dynamic pages. Clearly many web sites use dynamically generated pages to provide interactivity of some sort. Static pages can of course be served very quickly; rates of hundreds of pages/sec are routine, and with appropriate networks 1000 pages/sec may be achieved by a small server. Certainly, these rates are special cases for a LAN, and for a real web site the external link bandwidth may well be the rate limiting factor.

Dynamic pages present a very different profile though. Often many factors interact to limit the generation of dynamic pages such as time to load an interpreter (e.g. perl), time to search a database, execution speed of a servlet and speed of middleware. All these often combine to drop dynamic page rates into the single figure page/sec regime for the same hardware.

There are a number of strategies which may yield improvement here. The simplest is to pre-generate common dynamic pages, effectively making the content static once again. If some dynamism is required then it may be sufficient to generate the page at regular intervals (say tracking a stock price at half hour intervals for instance). The biggest effect here, by the way, is that a single dynamic page generator runs periodically rather than one per server child process. To see the effect of this you need only consider the run time costs of 500 (say) copies of perl competing against each other.

The other cost here of course is that while Apache server processes are pre-forked, command interpreters like perl are still being forked at the time of each request. The solution is to use a dynamic page generator which does not incur the process startup costs per page impression. Most typically, we would choose to use mod_perl or PHP, both of which exist as modules inside Apache and hence once again benefit from the pre-forking of servers ahead of their need.
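By way of a hedged illustration, a minimal mod_perl 1.x arrangement for running existing CGI-style perl scripts inside the persistent interpreter might look like the following (the module path, alias and directory are assumptions):

    LoadModule perl_module libexec/libperl.so
    AddModule  mod_perl.c

    # run CGI-style perl scripts via the embedded interpreter, no fork per hit
    Alias /perl/ /www/perl/
    <Location /perl>
        SetHandler  perl-script
        PerlHandler Apache::Registry
        Options     +ExecCGI
    </Location>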

There is of course some small irony here, in that after recommending the removal of extra modules to trim the workload and in-memory footprint of the Apache binary, we now add back in some of the biggest modules out there. The moral of this story is to stay with static pages wherever you can.

3.2 SSL

SSL is one of those areas that concerns all e-commerce sites. Clearly, it's not possible to credibly run an e-commerce site without the use of strong encryption. But at the same time strong encryption is by design a computationally intense task (to help resist brute force attacks). This leads to the situation where, on a busy e-commerce server, SSL can be the biggest consumer of the host server's computational resources. The SSL hit rates most servers can offer are generally at least an order of magnitude lower than the rate at which unencrypted pages can be served.

This then leads to a natural thought for encrypted content, which is to encrypt only those portions of a page that need to be protected. The plan here is to leave things like navigation bars and logos unencrypted because there is no benefit to delivering them in an encrypted manner. Unfortunately, most modern browsers, in an effort to protect their users against malicious content or content from incompetent developers, will 'warn' them through the agency of a dialog box when pages consisting of mixed encrypted and plain content are delivered. As the user is typically in the process of doing something where he or she is being asked to place trust in the web site owner, the effect is to unsettle the casual web surfer, typically at the exact moment when this is least desirable for the web site owner. Thus, this solution is not a practical one: while it increases your system performance, it typically also results in business being lost when some portion of the surfing community elects to abandon the transaction they had planned due to the now unsettled state they find themselves in.

It is however possible to deliver only those pages which require strong encryption in an encrypted manner, leaving, say, the bulk of an e-commerce site unencrypted and only payment pages or pages exposing users' confidential information protected (depending on how strong the privacy laws of your jurisdiction are). Once again note that most modern browsers will alert their users when a transition to or from encrypted content occurs. Note also that some sites will find it desirable to deliver extra content encrypted. Most typically we see this in places like the page which renders the credit card entry HTML FORM for the user. Nothing about this page itself requires encryption; it is the ACTION URL which the form data is sent to that needs to be an SSL page, so that the form data is encrypted in transit from the browser to the server. But web surfers are taught to check the padlock icon or similar status indicator which indicates the use of encryption prior to offering sensitive data such as their credit card number online. So whilst the page itself does not require SSL, it typically must be offered over SSL so that the user can be reassured that adequate encryption is being used to protect their private data. Similarly, the transition back to unencrypted pages must be planned so that it clearly takes place after the transfer of this data is complete. (Once again, if this were immediately after the user's form data were received, then the user experience would be to submit their form with their credit card data only to see the browser dialog box warning of the transition back to plaintext content.)

Having worked out where and when to use encrypted content, it's worth making sure you don't make your life any more difficult than necessary. This means that things like the SSL session cache must be sized suitably for the amount of traffic you expect to deal with. Larger sites should employ larger session caches than smaller ones. Sadly, there are no tools or statistics available to guide you in how much you need or how fully the currently configured cache size is being utilized.
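With mod_ssl on the 1.3 stream, the cache is configured along these lines (the cache type, path and sizes shown are assumptions and depend on how your mod_ssl was built):

    # a shared memory session cache, sized generously, with a 5 minute timeout
    SSLSessionCache        shm:/var/run/ssl_scache(1024000)
    SSLSessionCacheTimeout 300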

In general it is best to err on the side of caution and specify more cache than you may feel is necessary simply because the optimization the cache offers is so great. An aside on SSL will perhaps explain this.

A generic SSL transfer, broadly, consists of two portions: a public key protected exchange of session keys, and the bulk data that these session keys protect, which typically uses a more efficient symmetric key cipher. The public key portion of the SSL processing is in fact the slowest part of the conversation, with the symmetric cipher operating somewhat faster. This of course is the rationale for using the two ciphers to begin with. To further improve the performance of SSL though, a second and subsequent connection from the same browser can specify a session key which was already used in a previous SSL transfer (with appropriate timeouts of course). Thus, the browser and server can avoid the costs of unnecessary public key cipher operations when they do not need to incur them. This is the purpose of the SSL session key cache, and this of course is why a busy server should have a cache large enough to guarantee that the user who spends a few minutes hunting for his wallet or her purse can still benefit from the cached session key that was stored when their initial SSL transfers were done.

This is also why those of you running load balanced clusters of servers should ensure that you avail yourselves of 'sticky' connections where they are available, but only for SSL content. 'Stickiness' is the notion that rather than distributing load randomly across a server pool, subsequent connections from the same client on a network (i.e. the user's browser or an ISP's proxy) should be made back to the same server host in a load balanced cluster. This once again ensures that the same session keys can be re-used. Note that absent this, your load balancer could direct each access to a different host, incurring the longer, computationally more intense public key exchange of session keys each time. With even two servers 'hidden' behind a load balancer, the browser could find itself discarding its session key on each access (because the servers would each report that the key being offered by the browser on this access was unknown in that server's cache) and falling back to generate and exchange a new session key, with the performance hit this entails.

It should probably be noted that for non-encrypted content, stickiness is almost never a desirable feature, as it tends to defeat the ideal of distributing the load across your server pool evenly. Stickiness can result in single servers being overloaded whilst relatively more idle peers sit nearby, in the name of better SSL performance. Thus, if there's no SSL, even this justification is absent and stickiness should not be used.

Another potential gotcha in terms of SSL performance comes in the use of so-called 'client side certificates'. This is an option in most modern web servers and certainly in mod_ssl, the Apache SSL implementation. These certificates allow a much stronger form of authentication between the web server and the web client system, and allow the encryption of the inbound and outbound data streams with different session keys.

The cost though is that more of the computationally expensive public key work involved in establishing and protecting the session keys is now performed by the server as well as by the client system. In the case of the clients this is no big deal, as each need only do the work once for its own data and, as it happens, each client is a computer of its own to do it on. The server of course benefits from no such distributed computing scaling and in fact suffers greatly from the aggregate load.

The server may also incur additional overhead attempting the verification task of checking certificate signing authority signatures for the client side certificates it is offered. This is good security practice of course but also incurs an overhead. Once again, stickiness in your load balancers can at least ameliorate some of this processing by (hopefully) making it necessary only once per client system.

3.3 Separate Virtual Hosts for Different Uses

This is the content side of the discussion we had earlier about optimizing instances of Apache for different tasks. By splitting your site into different pieces, such as graphics, SSL and the like, and fetching those pieces of content from servers which have been specialized for those tasks, you may gain performance benefits. At the very least you gain the ability to measure each of these operations separately, rather than simply seeing an aggregate single figure of server load.

Note though, that to provide tuned Apache instances we typically cannot use the Virtual Host facilities of the server, but must rather build completely separate instances of the server (which may still be on the same physical server host).

In contrast, the measuring process is facilitated merely by having separate virtual hosts, either as separate Apache instances or simply as virtual hosts. It may in fact be a useful strategy to build your content with such a breakdown in mind, using simple Apache Virtual Hosts while your site is small, and then as it grows deploy separate Apache instances on perhaps separate hosts to spread the load. Having prepared things by breaking the content down this way, and having the ability to break out the loads into separate numbers, will both allow you to measure the load more effectively and facilitate the migration of that load to a new host instance when you decide to do this.

3.4 Impact of Other Infrastructure

It's worth noting that your ability to service web loads is almost certainly impacted by things other than just the capacity of your web servers. Thus it's worth taking a more holistic approach to the task of maximizing your performance for any given level of resourcing.

At the very least you should examine questions like network congestion, both in your LANs, your DMZ and on your link to the outside world. Many people aren't even monitoring this sort of information. In most environments, the size of your Internet connection is the single biggest rate limiting element, and for any successful e-commerce site you may be losing trade by not being able to service customers.

Similarly, you will find it valuable to monitor load in critical pieces of infrastructure such as routers (do they need more memory or bigger CPUs?) and firewalls (likewise...).

Beyond the web servers themselves, you likely have backend databases and middleware servers which also participate in the running of the site and indeed in offering your service and transacting business. It's briefly worth noting here that the 'transaction' overhead of serving a web page is much lower than that of the typical database. This means that a small web server may, as has been noted, easily serve hundreds of pages a second whilst few but the very largest database servers can process hundreds of database updates a second. This means it pays to review the use of your database regularly, as naively written web applications can easily offer more database load than can be comfortably served.

Another form of ever more common middleware server is the Java servlet engine. Java has captured a significant portion of the web e-commerce market as it is a high performance mechanism for delivering dynamically generated pages. The servlets themselves may be hosted on the same host as the web server or indeed on a different host. The loads offered by these servlet environments should also be tracked with care. Often the servlet host will have more middleware hosts of its own to communicate with (often referred to as application servers in the industry nomenclature) and indeed may also be dealing with one or more databases or other data sources or data stores. Thus, communication issues should also be watched once again (network load, etc., but also use of sockets/file descriptors, TCP tuning and the like).

3.5 Dumb Content

As the administrator of a web server or servers you may or may not have any control over the content on your web servers. You almost certainly don't have the time to vet that content yourself. Typically content is updated seemingly randomly and seldom with any advance notice.

Sadly, it is possible for badly structured content to impact the performance of a web site significantly. In the simpler cases, there are relatively benign issues such as graphics being served from somewhere other than your lean and mean server optimized for graphics, SSL content being served from an Apache not tuned for it, dynamic pages where little need for dynamism is evident, or even five copies of the same element when only one is marked to be cached by the server for efficient access.

In extreme cases, poorly written or ill-conceived dynamic content can bring a system to its knees. A common example here is database access from a highly trafficked dynamic page where no attempt is made to use a persistent connection to the database. Those of you experienced with databases will be aware of the significant overhead associated with connecting to a database and disconnecting from it. Doing this per page hit can cost a lot of unnecessary resources when a persistent connection would both perform more efficiently and offer better user response times.
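For mod_perl based content, to take one hedged example, persistent database handles can be had with a single configuration line; the scripts continue to call DBI->connect() as before and simply receive a cached handle back:

    # must be loaded before any DBI-using scripts so their connects are intercepted
    PerlModule Apache::DBI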

4. Measuring Apache

Having looked at our content and organized our web servers, let's take a look at how we can characterize the performance of an Apache web server and indeed tune it.

4.1 Log Analysis

The first thing to do is to periodically perform some analysis of the log files that Apache keeps for you. Analyzing the logs will typically show you a number of features about the load your web server is processing, and can be done easily either with one of the myriad open source log analysis packages like webalizer or analog, or indeed with commercial packages like WebTrends.

For web sites with any sort of geographic locality, the first thing that is noticed is that, like a bricks & mortar shop, your website sees most of its traffic during the business day. It seems that it's a fact of life that currently most people do their web surfing from their places of work. If you offer a service which is tied to business hours in some manner, this effect is even more pronounced (e.g. stock information or the like).

The load graph for such sites is quite stark, with essentially no load in the early morning, a high load arriving at the start of business hours and running more or less constantly till midday, when an upwards spike marks the highest load point for a typical day, and then a relatively smooth decay down to midnight. There is usually a small downward spike around the end of the business day as people commute back home and/or have their evening meals.

Sites without such locality will typically see much smoother levels of constant load as web surfing communities all around the world come and go from their site (in overlapping versions of the pattern above as it happens).

Analysis of your logs can at least tell you where your maximum periods of activity are, and when your idlest times are. This can be useful for running backups or doing system maintenance, although it must be said that even for sites with locality, there is seldom a time when no activity is taking place.

Statistics which may also be of use are hit rates per hour (not all hours are equal as we noted), rates of page failure per hour (where spikes perhaps reflect overloaded middleware or other infrastructure), dwell times per page for your users (are your pages too complicated?) and page rates vs. hit rates.

It's worth noting that a wealth of data can be extracted from your log files, and it may even be worth writing some specialist log reduction programs in perl or a similar language suited to reportage to look at specific issues which may be of interest to you.

4.2 ApacheBench

One of the simplest questions raised once you start looking at your logs is: how much capacity does your web server have? This is where this paper started, in fact. Your Apache has some ability to service load that arrives from the Internet, and when you're examining your log file the most obvious question is how much of that ability you are consuming and how close to running out you are.

Most people take a fairly rudimentary approach here of monitoring the hosts their web servers run on. This is of course a reflection of the fact that there are lots of great host monitoring tools out there and they may as well use them. But as we noted earlier, it's not all about the host; there remain innate questions of how much work your web server can do, how much network load it can generate and so on.

A simple first step is to characterize the performance of your web server. Fire a bunch of web requests at it and see how quickly it can respond. Apache is even distributed with a tool called ApacheBench (although the binary has the unassuming name of 'ab' and it's not well publicized) that allows you to generate loads and fire them at an instance of Apache.

It's fairly straightforward, and indeed can be very useful in getting some idea of the raw abilities of a server, and it's easy to use with a straightforward command line interface. Its drawbacks are that it only exercises a single URL and it can't do SSL.
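A typical invocation looks something like this (the hostname and page are placeholders); ab then reports requests per second, transfer rates and a breakdown of response times:

    # 1000 requests in total, 10 concurrent, all against a single URL
    ab -n 1000 -c 10 http://www.example.com/index.html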

4.3 httperf

There is also a tool out there called httperf which does much the same sort of thing as ApacheBench, except it's more fully featured. It has lots of options and is also much more rigorous about timing in its operation. This is of course a good thing in this sort of work.

The biggest disadvantage in the use of httperf is a somewhat cryptic command line interface. It's well worth the effort to come to grips with though. Like ApacheBench, httperf also cannot assist with benchmarking SSL pages, nor does it allow you to specify more than a single URL to test.
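For a flavour of that command line, a run roughly equivalent in spirit to the ab example above might look like the following (again, the server and URI are placeholders):

    # open 2000 connections at a fixed rate of 100 per second, one request each,
    # and report connection, request and reply time statistics
    httperf --server www.example.com --port 80 --uri /index.html \
            --num-conns 2000 --rate 100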

Of the two programs, it's certainly worth using ApacheBench if for no other reason than that you already have it if you have Apache. The extra facilities of httperf and its extra rigour in timing make it well worth the effort of finding and installing though.

4.4 http://server/server-status

Having benchmarked your server and analyzed your log files, you now have some idea of what your site is capable of and indeed what it gets up to from day to day. What is still lacking though is some idea of what the load looks like at any given point in time. By its nature, log analysis tends to take place offline and often well after the fact. Small sites may in fact only be doing it at month's end. So how can you tell what your web site is up to right now?

The simplest facility is of course to ask the web server(s) for some statistics about their operation. There is an Apache module called mod_status which allows the server to keep some basic stats and offer them on a web page. By default, this page is http://yoursite/server-status, although its name may be changed in the Apache configuration file.
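Enabling it is a matter of loading the module and mapping the handler onto a location; a hedged sketch follows (the module path and the permitted monitoring network are assumptions). ExtendedStatus keeps the additional per-request accounting from which the totals are derived:

    LoadModule status_module libexec/mod_status.so
    AddModule  mod_status.c

    ExtendedStatus On

    <Location /server-status>
        SetHandler server-status
        Order deny,allow
        Deny from all
        # restrict the statistics page to your monitoring hosts
        Allow from 10.1.2.0/255.255.255.0
    </Location>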

The information presented is both a summary (various totals for hits, bytes transferred and the like) and an overview of the activity of the Apache server pool.

The biggest shortcoming of this data is that it's accumulated since the last server start or restart. So when the status page offers a server hit rate, it is an average rate since the last server restart. As we noted though, unless your site is one blessed with a relatively constant load, this is unlikely to be of great use to you.

4.5 HitMeter

A solution then is to use the status page to access the raw totals and display these in some more useful manner. My contribution was to write a small Java applet which repeatedly accesses the status page and displays the difference in total hits between successive readings as a hits per second rate. The display was in the form of a speedometer-like dial in its original form, although the applet can now also offer bar graph and stripchart style displays.

This then gives us a real-time or at least near real-time look at how hard the Apache web server it is sampling is working. And it does it without any modifications being required to Apache itself.

The applet can be viewed at http://metva.com/hitmeter and I have made it open source under a BSD style license for those of you who'd like to avail yourselves of it.

In fact, the applet can also display other data, so long as it can download it from a page that 'looks like' the Apache server status page. Work is ongoing to make the data fetcher thread of the applet more flexible to allow it to read other non-HTTP data sources directly and to allow it to read data sources that look different from the Apache status page. It can be configured to display absolute data rather than differences (where data is already being presented as a rate) and can also in its current form display the Apache kb/s rate rather than the hit/s rate.

4.6 SSL Benchmarking

It's worth briefly touching on the state of the art in benchmarking the SSL performance of web servers. In the open source arena, SSL benchmarking tools are still fairly thin on the ground. A number are said to be under development, and indeed some may have struggled out into the light of day by the time you read this, but by and large at the time of this writing, most SSL benchmarking still tends to be a little Mickey Mouse. (Often relying on scripts and tools like curl rather than something with the rigour of, say, httperf.)

The reasons for this are fairly straightforward. SSL is hard. It's hard to do right, it's complex and likely it's only low reward. The organizations who need it may fall back on commercial benchmarking packages which have addressed the need to offer something, albeit at an often hefty price.

Having said that, there is at least one OpenSSL based solution said to be under development. OpenSSL is a good choice as it removes a lot of the need to re-invent the (cryptographic) wheel, although once again, a correct solution is still far from trivial.

One of the other things to watch in any benchmarking tool is the need to balance the use of short and long SSL handshakes. That is, the SSL connection that establishes a new session key (the long handshake) and the SSL connection which re-uses a previously established session key (the short handshake). As we noted, real SSL traffic is a mix of the two types, with, we would hope, a bias toward the latter, but we need to be able to adjust this bias in our benchmarking tool as it is difficult to know precisely what that bias is for our users. If you have to pick only one, use the long handshake, as it offers you a (very) worst case performance figure.

This is one of those areas worth pursuing if you do much SSL on your website precisely because it can be so rate determining for its performance. And as we all know, sluggish websites are not pleasant experiences and often people will avoid them wherever possible. If your livelihood depends on the SSL performance of your website (as it well may), then this alone could make or break you.

4.7 Other Benchmarking Products

As I noted above, there are a number of commercial offerings in the arena of benchmarking web applications/sites. The highest profile of these is probably LoadRunner. This is a great product although the cost of entry can be quite high for some. It works on a record/replay style model where a 'typical' workload is recorded from a live browser session and a script for this typical workload is developed (with accesses, dwell times and other such features of real users). This script is then replayed in parallel to simulate the actions of multiple users accessing a website.

LoadRunner sees a fair bit of use in the commercial world. Clearly its focus is users rather than hits or pages (and we can make an argument that this is a much more sensible figure of merit to be concerned with), but this does mean that it's hard to relate LoadRunner derived statistics to those a running web server offers, and that correlations between users and hits are not all that clear.

Other benefits of packages like this are the ability to regression test web applications and perform other quality assurance type operations (like running test cases, or indeed the load testing that gives LoadRunner its name) which more conventional server-only benchmarking tools, with their focus on only one or at best a few URLs, cannot offer.

5. Summary

In any large enterprise, capacity planning and benchmarking are vital tools of the trade. Oddly, in the world of the web there is only poor focus on the capacity of a web server and almost no focus on knowing how fully it is utilized.

Some of this is born of the fact that unlike traditional datacentre models, web based service models have no real control over how much load is presented to them. In some instances, attractive, compelling web sites get no traffic whilst simultaneously ugly, expensive and poorly implemented sites are being overwhelmed. The vagaries of search engines, online advertising, link exchanges and other means of getting the word out mean that at times offered load bears little relationship to the attractiveness of your site's offerings as such.

Over and above this, successful advertising can bring waves of sudden load for which you may not be prepared (marketing people will on occasion run campaigns without advising the service delivery portions of an organization). And finally, as alluded to in the abstract of this paper, there are natural seasonal variations in load which may see your comfortable performance headroom consumed and put your site at risk of complete failure, merely because of its own popularity.

All of these things suggest that in order to properly manage a site, you must have some idea of both its capabilities and the extent to which those capabilities are being used. Where possible, you can tune your environments to maximize their performance (in all but the most profligate environments you will likely be doing this anyway) and then monitor their performance in an ongoing manner to ensure that you have adequate performance headroom to meet any new load that may arrive.

One final word of caution then about predicting the nature of the load you may face: it's very hard to predict web loads. As I noted, it's not uncommon for sites to face sudden, unexpected and uncatered-for loads arriving from the Internet. Anecdotal evidence abounds of large, well financed web sites which have manifestly failed in the past to scale to deal with the load offered them, and no doubt more organizations will face the same problems in the future. All of these incidents represent lost revenue, and if they were properly managed sites, their current revenue should be sufficient to allow them to build infrastructure that can accept tomorrow's loads. (More simply put, if you're just squeaking by, you're not doing it right...)

But crystal ball gazing is hard. It's hard to know how much load is coming. Is Christmas twice as busy as August? What if it's suddenly three times as busy? You need to plan. The first step is knowing what your site is capable of today and what it's being asked to deliver today. Then planning for tomorrow becomes easier...





