A typical HTTP request

To get a better picture of the possible delay incurred when using a web application firewall, it helps to understand the anatomy of a typical HTTP request, and what processing time a typical web page download will incur. This will help us compare any added ModSecurity processing time to the overall time for the entire request.

When a user visits a web page, the browser first connects to the server and downloads the main resource requested by the user (for example, an .html file). It then parses the downloaded file to discover any additional files, such as images or scripts, that it must download to be able to render the page. Therefore, from the point of view of a web browser, the following sequence of events happens for each file:

Connect to web server.
Request required file.
Wait for server to start serving file.
Download file.

Each of these steps adds latency, or delay, to the request. A typical download time for a web page is on the order of hundreds of milliseconds per file for a home cable/DSL user. This can be slower or faster, depending on the speed of the connection and the geographical distance between the client and server.
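The steps above can be timed separately. The following is a minimal sketch, not part of the original test setup. It assumes three cumulative timings in seconds, in the order curl reports them through its -w variables time_connect, time_starttransfer and time_total, and splits them into connect, wait and download phases (sending the request is folded into the wait phase). The function name phase_breakdown and the sample numbers are hypothetical.

```shell
# Sketch: split one request into connect, wait and download phases.
# Input: "time_connect time_starttransfer time_total" in seconds on stdin.
phase_breakdown() {
    awk '{
        connect  = $1 * 1000          # TCP connection established
        wait     = ($2 - $1) * 1000   # request sent, waiting for first byte
        download = ($3 - $2) * 1000   # first byte to last byte
        printf "connect=%.0fms wait=%.0fms download=%.0fms\n", connect, wait, download
    }'
}

# Illustrative sample values, not measurements from this article
echo "0.102 0.205 0.408" | phase_breakdown
```

With the sample input this prints connect=102ms wait=103ms download=203ms. Against a real server over plain HTTP, the input line could come from curl -o /dev/null -s -w '%{time_connect} %{time_starttransfer} %{time_total}\n' followed by the URL.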

If ModSecurity adds any delay to the page request, it will be to the server processing time, or in other words the time from when the client has connected to the server to when the last byte of content has been sent out to the client.

Another aspect that needs to be kept in mind is that ModSecurity will increase the memory usage of Apache. In what is probably the most common Apache configuration, known as "prefork", Apache starts one new child process for each active connection to the server. This means that the number of Apache instances increases and decreases depending on the number of client connections to the server. As the total memory usage of Apache depends on the number of child processes running and the memory usage of each child process, we should look at the way ModSecurity affects the memory usage of Apache.

A real-world performance test

In this section we will run a performance test on a real web server running Apache 2.2.8 on a Fedora Linux server (kernel 2.6.25). The server has an Intel Xeon 2.33 GHz dual-core processor and 2 GB of RAM.

We will start out benchmarking the server when it is running just Apache without having ModSecurity enabled. We will then run our tests with ModSecurity enabled but without any rules loaded. Finally, we will test ModSecurity with a ruleset loaded so that we can draw conclusions about how the performance is affected. The rules we will be using come supplied with ModSecurity and are called the "core ruleset".

The core ruleset

The ModSecurity core ruleset contains over 120 rules and is shipped with the default ModSecurity source distribution (it's contained in the rules sub-directory). This ruleset is designed to provide "out of the box" protection against some of the most common web attacks used today. Here are some of the things that the core ruleset protects against:

Suspicious HTTP requests (for example, missing User-Agent or Accept headers)
SQL injection
Cross-Site Scripting (XSS)
Remote code injection
File disclosure

We will examine these methods of attack, but for now, let's use the core ruleset and examine how enabling it impacts the performance of your web service.

Installing the core ruleset

To install the core ruleset, create a new sub-directory named modsec under your Apache conf directory (the location will vary depending on your distribution). Then copy all the .conf files from the rules sub-directory of the source distribution to the new modsec directory:

mkdir /etc/httpd/conf/modsec
cp /home/download/modsecurity-apache/rules/modsecurity_crs_*.conf /etc/httpd/conf/modsec

Finally, enter the following line in your httpd.conf file and restart Apache to make it read the new rule files:


# Enable ModSecurity core ruleset
Include conf/modsec/*.conf

Putting the core rules in a separate directory makes it easy to disable them—all you have to do is comment out the above Include line in httpd.conf, restart Apache, and the rules will be disabled.

Making sure it works

The core ruleset contains a file named modsecurity_crs_10_config.conf. This file contains some of the basic configuration directives needed to turn on the rule engine and configure request and response body access. Since we have already configured these directives, we do not want this file to conflict with our existing configuration, so we need to disable it. To do this, we simply rename the file so that it has a different extension, since Apache only loads *.conf files with the Include directive we used above:

$ mv modsecurity_crs_10_config.conf modsecurity_crs_10_config.conf.disabled
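To see why the rename is enough, here is a small sketch that uses a shell glob as a stand-in for the wildcard in the Include directive. It runs in a throwaway directory with empty placeholder files. The second file name is hypothetical, and the shell glob is only an analogy for how Apache matches the pattern.

```shell
# Sketch: a renamed rule file no longer matches the *.conf pattern.
dir=$(mktemp -d)
touch "$dir/modsecurity_crs_10_config.conf" "$dir/modsecurity_crs_99_example.conf"

# Disable the config file by giving it a different extension
mv "$dir/modsecurity_crs_10_config.conf" "$dir/modsecurity_crs_10_config.conf.disabled"

# List what a *.conf wildcard still picks up
matched=$(for f in "$dir"/*.conf; do basename "$f"; done)
echo "$matched"
rm -rf "$dir"
```

Only modsecurity_crs_99_example.conf is printed, because the renamed file now ends in .disabled.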

Once we have restarted Apache, we can test that the core ruleset is loaded by attempting to access a URL that it should block. For example, try surfing to http://yourserver/ftp.exe and you should get the error message Method Not Implemented, confirming that the core rules are loaded.

Performance testing basics

So what effect does loading the core ruleset have on web application response time and how do we measure this? We could measure the response time for a single request with and without the core ruleset loaded, but this wouldn't have any statistical significance—it could happen that just as one of the requests was being processed, the server started to execute a processor-intensive scheduled task, causing a delayed response time.

The best way to compare the response times is to issue a large number of requests and look at the average time it takes for the server to respond.
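As a rough illustration of why the average over many requests is more trustworthy than a single measurement, the sketch below summarizes a list of response times given as one value in milliseconds per line. The helper name summarize_ms and the sample values are made up for this example.

```shell
# Sketch: summarize many response times (one value in ms per line).
summarize_ms() {
    awk '
        NR == 1 || $1 + 0 > max { max = $1 + 0 }   # track the slowest request
        { sum += $1 }
        END { if (NR > 0) printf "n=%d mean=%.1f max=%.1f\n", NR, sum / NR, max }
    '
}

# Made-up samples: nine ordinary requests and one that hit a busy moment
printf '%s\n' 300 310 290 300 305 295 300 310 290 900 | summarize_ms
```

This prints n=10 mean=360.0 max=900.0. Had the single slow request been our only measurement, we would have concluded that the server is three times slower than it typically is. The more requests we average over, the less such an outlier matters.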

An excellent tool—and the one we are going to use to benchmark the server in the following tests—is called httperf. Written by David Mosberger of Hewlett Packard Research Labs, httperf allows you to simulate high workloads against a web server and obtain statistical data on the performance of the server. You can obtain the program at http://www.hpl.hp.com/research/linux/httperf/ where you'll also find a useful manual page in the PDF file format and a link to the research paper published together with the first version of the tool.

Using httperf

We'll run httperf with the options --hog (use as many TCP ports as needed), --uri /index.html (request the static web page index.html), and --num-conn 1000 (initiate a total of 1000 connections). We will vary the number of requests per second to see how the server responds under different workloads. This is specified using --rate, which sets the number of connections per second. Since we make one request per connection, the two rates are the same.
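Since only --rate changes between runs, the sweep is easy to script. The following sketch prints one httperf command per rate rather than executing it, so it can be inspected first. The helper name httperf_cmd and the list of rates are assumptions made for the example.

```shell
# Sketch: print (not run) one httperf command per request rate.
httperf_cmd() {
    # $1 = server name, $2 = connections (and thus requests) per second
    echo "httperf --hog --server=$1 --uri /index.html --num-conn 1000 --rate $2"
}

for rate in 25 50 75 100; do
    httperf_cmd bytelayer.com "$rate"
done
```

Piping the printed commands to sh would execute the runs one after another.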

This is what the typical output from httperf looks like when run with the above options:

$ ./httperf --hog --server=bytelayer.com --uri /index.html --num-conn 1000 --rate 50
Total: connections 1000 requests 1000 replies 1000 test-duration 20.386 s
Connection rate: 49.1 conn/s (20.4 ms/conn, <=30 concurrent connections)
Connection time [ms]: min 404.1 avg 408.2 max 591.3 median 404.5 stddev 16.9
Connection time [ms]: connect 102.3
Connection length [replies/conn]: 1.000
Request rate: 49.1 req/s (20.4 ms/req)
Request size [B]: 95.0
Reply rate [replies/s]: min 46.0 avg 49.0 max 50.0 stddev 2.0 (4 samples)
Reply time [ms]: response 103.1 transfer 202.9
Reply size [B]: header 244.0 content 19531.0 footer 0.0 (total 19775.0)
Reply status: 1xx=0 2xx=1000 3xx=0 4xx=0 5xx=0
CPU time [s]: user 2.37 system 17.14 (user 11.6% system 84.1% total 95.7%)
Net I/O: 951.9 KB/s (7.8*10^6 bps)
Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0

The output shows us the number of TCP connections httperf initiated per second ("Connection rate"), the rate at which it requested files from the server ("Request rate"), and the actual reply rate that the server was able to provide ("Reply rate"). We also get statistics on the reply time—the "reply time – response" is the time taken from when the first byte of the request was sent to the server to when the first byte of the reply was received—in this case around 103 milliseconds. The transfer time is the time to receive the entire response from the server.
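When many runs have to be compared, it is convenient to pull just these figures out of the output. The sketch below is one possible way to do it with awk. It assumes each statistic sits on a single line, as httperf prints it. The helper name parse_httperf is hypothetical.

```shell
# Sketch: extract the headline figures from httperf output on stdin.
parse_httperf() {
    awk '
        /^Request rate:/      { req = $3 }             # requests per second
        /^Reply time \[ms\]:/ { resp = $5; xfer = $7 } # response and transfer
        END { printf "req_rate=%s response_ms=%s transfer_ms=%s\n", req, resp, xfer }
    '
}

# Three lines copied from the sample output
parse_httperf <<'EOF'
Request rate: 49.1 req/s (20.4 ms/req)
Reply rate [replies/s]: min 46.0 avg 49.0 max 50.0 stddev 2.0 (4 samples)
Reply time [ms]: response 103.1 transfer 202.9
EOF
```

This prints req_rate=49.1 response_ms=103.1 transfer_ms=202.9. In a real run, the output of httperf would be piped into parse_httperf instead of the here-document.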

The page we will be requesting in this case, index.html, is 20 KB in size, which is a fairly typical size for an HTML document. httperf requests the page one time per connection and doesn't follow any links in the page to download additional embedded content or script files, so the number of such links in the page is of no relevance to our test.

Getting a baseline: Testing without ModSecurity

When running benchmarking tests like this one, it's always important to get a baseline result so that you know the performance of your server when the component you're measuring is not involved. In our case, we will run the tests against the server when ModSecurity is disabled. This will allow us to tell what impact, if any, running with ModSecurity enabled has on the server.

Response time

The following chart shows the response time, in milliseconds, of the server when it is running without ModSecurity. The number of requests per second is on the horizontal axis:

[Figure: server response time in milliseconds versus requests per second, without ModSecurity]

As we can see, the server consistently delivers response times of around 300 milliseconds until we reach about 75 requests per second. Above this, the response time starts increasing, and at around 500 requests per second the response time is almost a second per request. This data is what we will use for comparison purposes when looking at the response time of the server after we enable ModSecurity.

Memory usage

Finding the memory usage on a Linux system can be quite tricky. Simply running the Linux top utility and looking at the amount of free memory doesn't quite cut it, and the reason is that Linux tries to use almost all free memory as a disk cache. So even on a system with several gigabytes of memory and no memory-hungry processes, you might see a free memory count of only 50 MB or so.
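One way to get a more realistic picture is to count buffer and cache memory as available, since the kernel can release most of it when applications need it. The sketch below does this for text in the /proc/meminfo layout. It is a rough estimate only (not all cached memory can be reclaimed), and the helper name available_kb and the sample values are made up.

```shell
# Sketch: add up free, buffer and cache memory (in KB) from
# /proc/meminfo-style text on stdin.
available_kb() {
    awk '/^(MemFree|Buffers|Cached):/ { sum += $2 } END { print sum }'
}

# Made-up values in the /proc/meminfo layout
available_kb <<'EOF'
MemTotal:        2075000 kB
MemFree:           51200 kB
Buffers:          102400 kB
Cached:          1536000 kB
SwapCached:            0 kB
EOF
```

This prints 1689600, so a system reporting only 50 MB free may in practice have well over 1.5 GB that applications could use. On a live system the function would be called as available_kb < /proc/meminfo.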

Another problem is that Apache uses many child processes, and to accurately measure the memory usage of Apache we need to sum the memory usage of each child process. What we need is a way to measure the memory usage of all the Apache child processes so that we can see how much memory the web server truly uses.

To solve this, here is a small shell script that I have written that runs the ps command to find all the Apache processes. It then passes the PID of each Apache process to pmap to find the memory usage, and finally uses awk to extract the memory usage (in KB) for summation. The result is that the memory usage of Apache is printed to the terminal.

The actual shell command is only one long line, but I've put it into a file called apache_mem.sh to make it easier to use:

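#!/bin/sh
# apache_mem.sh
# Calculate the total memory usage of all Apache child processes, in KB.
# Assumes the children run as user "apache" and the binary is named httpd.
ps -ef | grep httpd | grep '^apache' | awk '{ print $2 }' |
xargs pmap -x | grep 'total kB' | awk '{ print $3 }' |
awk '{ sum += $1 } END { print sum }'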
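The summation at the end of the pipeline can be tried without a running Apache. The sketch below feeds two made-up "total kB" lines, in the layout pmap -x is assumed to print, through the same grep and awk steps. The helper name sum_pmap_totals is hypothetical. As far as the pmap output format goes, the third field of the total line is the mapped size of the process, so pages shared between children are counted once per child.

```shell
# Sketch: sum the third field of every "total kB" line on stdin.
sum_pmap_totals() {
    grep 'total kB' | awk '{ sum += $3 } END { print sum }'
}

# Made-up totals for two child processes
printf 'total kB  15000  4000  1200\ntotal kB  15200  4100  1300\n' | sum_pmap_totals
```

This prints 30200, the combined size of the two processes in KB.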

Now, let's use this script to look at the memory usage of all of the Apache processes while we are running our performance test. The following graph shows the memory usage of Apache as the number of requests per second increases:

[Figure: Apache memory usage versus requests per second, without ModSecurity]

Apache starts out consuming about 300 MB of memory. Memory usage grows steadily and at about 150 requests per second it starts climbing more rapidly.

At 500 requests per second, the memory usage is over 2.4 GB, which is more than the amount of physical RAM in the server. This is possible because of the virtual memory architecture that Linux (like all modern operating systems) uses. When there is no more physical RAM available, the kernel starts swapping memory pages out to disk, which allows it to continue operating. However, since reading from and writing to a hard drive is much slower than accessing memory, this starts slowing down the server significantly, as evidenced by the increase in response time seen in the previous graph.

CPU usage

In both of the tests above, the server's CPU usage was consistently around 1 to 2%, no matter what the request rate was. You might have expected a graph of CPU usage in the previous and subsequent tests, but while I measured the CPU usage in each test, it turned out to run at this low utilization rate for all tests, so a graph would not be very useful. Suffice it to say that in these tests, CPU usage was not a factor.

ModSecurity without any loaded rules

Now, let's enable ModSecurity—but without loading any rules—and see what happens to the response time and memory usage. Both SecRequestBodyAccess and SecResponseBodyAccess were set to On, so if there is any performance penalty associated with buffering requests and responses, we should see this now that we are running ModSecurity without any rules.

The following graph shows the response time of Apache with ModSecurity enabled:

[Figure: server response time in milliseconds versus requests per second, ModSecurity enabled with no rules loaded]

We can see that the response time graph looks very similar to the response time graph we got when ModSecurity was disabled. The response time starts increasing at around 75 requests per second, and once we pass 350 requests per second, things really start going downhill.

The memory usage graph is also almost identical to the previous one:

[Figure: Apache memory usage versus requests per second, ModSecurity enabled with no rules loaded]

Apache uses around 1.3 MB extra per child process when ModSecurity is loaded, which equals a total increase in memory usage of 26 MB for this particular setup. Compared to the total amount of memory Apache uses when the server is idle (around 300 MB), this is an increase of roughly 9%.
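The arithmetic behind these figures can be written out as a small sketch. The 1.3 MB per child and the 300 MB baseline come from the measurements above. The count of 20 child processes is an inference from 26 MB divided by 1.3 MB, not a figure stated in the text, and the helper name modsec_overhead is hypothetical.

```shell
# Sketch: memory added by ModSecurity across all Apache children.
modsec_overhead() {
    # $1 = extra MB per child, $2 = number of children, $3 = baseline MB
    awk -v per="$1" -v n="$2" -v base="$3" 'BEGIN {
        extra = per * n
        printf "extra=%.1fMB increase=%.1f%%\n", extra, extra / base * 100
    }'
}

# 20 children is inferred (26 MB / 1.3 MB), not stated in the text
modsec_overhead 1.3 20 300
```

This prints extra=26.0MB increase=8.7%, in line with the rounded percentage given above. The same function shows how the overhead scales with more children under prefork.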

ModSecurity with the core ruleset loaded

Now for the really interesting test: we'll run httperf against ModSecurity with the core ruleset loaded and look at what that does to the response time and memory usage.

Response time

The following graph shows the server response time with the core ruleset loaded:

[Figure: server response time in milliseconds versus requests per second, ModSecurity enabled with the core ruleset loaded]

At first, the response time is around 340 ms, which is about 35 ms slower than in previous tests. Once the request rate gets above 50 requests per second, the server response time starts deteriorating. As the request rate grows, the response time gets worse and worse, reaching a full 5 seconds at 100 requests per second. I have capped the graph at 100 requests per second, as the server performance has already deteriorated enough at this point to allow us to see the trend.

We see that the point at which the response time starts increasing has gone down from 75 to 50 requests per second now that we have enabled the core ruleset. This equals a 33% reduction in the maximum number of requests per second the server can handle.
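The 33% figure follows directly from the two request rates. As a quick check, here is a sketch of the calculation. The helper name pct_drop is hypothetical.

```shell
# Sketch: percentage drop between two request rates.
pct_drop() {
    # $1 = rate before, $2 = rate after
    awk -v before="$1" -v after="$2" 'BEGIN {
        printf "%.0f%%\n", (before - after) / before * 100
    }'
}

# 75 requests per second without the core ruleset, 50 with it
pct_drop 75 50
```

This prints 33%.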