Migration from Apache to Lighttpd

Starting from a working Apache installation, what can Lighttpd offer us?

  • Improved performance for most cases (as in more hits per second)
  • Reduced CPU time and memory usage
  • Improved security

Of course, the move to Lighttpd is not a small one, especially if our Apache configuration makes use of its many features. Systems tied into Apache as a module may make the move hard or even impossible without porting the module to a Lighttpd module or moving the functionality into CGI programs, if possible.

We can ease the pain by moving in small steps. The following descriptions assume that we have one Apache instance running on one hardware instance. But we can scale the method by repeating it for every hardware instance.

When not to migrate
Before we start this journey, we need to know that our hardware and operating system support Lighttpd, that we have root access (or access to someone who has), and that the system has enough space for a Lighttpd installation alongside Apache (yes, I know, Lighttpd should reduce space concerns, but I have seen Apache installations munching away entire RAID arrays). The effort probably only pays off if we plan on moving a large share of our traffic to Lighttpd. We might also make extensive use of Apache modules, in which case a complete migration would involve finding or writing suitable substitutes for Lighttpd.

Adding Lighttpd to the Mix

Install Lighttpd on the system that Apache runs on. Find an unused port (use a port scanner if needed) and set server.port to it. For example, if port 4080 is unused on our system, we would look for server.port in our Lighttpd configuration and change it to:

server.port = 4080

If we want to use SSL, we should change all occurrences of the port 443 to another free port, say 4443. We assume our Apache is answering requests on HTTP port 80.
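As a rough sketch of the SSL change, a Lighttpd 1.4-style configuration could bind the SSL listener to the alternative port as shown here. The certificate path is a placeholder and not taken from any real setup:

```
# Assumption: Lighttpd 1.4-style SSL setup; adjust to the installed version
$SERVER["socket"] == ":4443" {
    ssl.engine  = "enable"
    ssl.pemfile = "/etc/lighttpd/server.pem"   # hypothetical path to certificate and key
}
```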

Now let’s use this Lighttpd instance as a proxy for our Apache by adding the following configuration:

server.modules = (
#...
    "mod_proxy",
#...
)
#...
proxy.server = (
    "" => (                         # proxy everything
        ( "host" => "127.0.0.1",    # localhost
          "port" => 80 )            # the port our Apache listens on
    )
)

This tells our Lighttpd to proxy all requests to the server that answers on localhost, port 80, which happens to be our Apache server. Now, when we start our Lighttpd and point our browser to http://localhost:4080/, we should be able to see the same thing that our Apache is returning.

What is a proxy?
A proxy stands in front of another object, simulating the object by relaying all requests to it. A proxy can change requests on the fly, filter requests, and so on. In our case, Lighttpd is the web server to the outside, whilst Apache still gets all requests as usual.

Excursion: mod_proxy

mod_proxy is the module that allows Lighttpd to relay requests to another web server. It is not to be confused with mod_proxy_core (of Lighttpd 1.5.0), which provides a basis for other interfaces such as CGI. Usually, we want to proxy only a specific subset of requests. For example, we might want to proxy requests for JavaServer Pages to a Tomcat server. This could be done with the following proxy directive:

proxy.server = (
    # given our Tomcat is on port 8080
    ".jsp" => ( ( "host" => "127.0.0.1", "port" => 8080 ) )
)

Thus the Tomcat server only serves JSPs, which is what it was built to do, whilst our Lighttpd does the rest.

Or we might have another server which we want to include in our Web presence at some given directory:

proxy.server = (
    "/somepath" => ( ( "host" => "127.0.0.1", "port" => 8080 ) )
)

Assuming the server is on port 8080, this will do the trick. Note that mod_proxy forwards the request path unchanged, so http://localhost/somepath/index.html will return the same content as http://localhost:8080/somepath/index.html.

Reducing Apache Load

Note that, like most Lighttpd directives, proxy.server can be moved into a selector, thereby reducing its reach. This way, we can reduce the set of files Apache has to touch in a phased manner. For example, YouTube™ uses Lighttpd to serve the videos. Usually, we want to make Lighttpd serve static files such as images, CSS, and JavaScript, leaving Apache to serve the dynamically generated pages.
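One possible way to limit the reach of proxy.server is a selector on the host name. The sketch below assumes a hypothetical virtual host, legacy.example.com, that should still be handled entirely by Apache on port 80, while requests for every other host are answered by Lighttpd itself:

```
# Assumption: legacy.example.com is a made-up host name for illustration
$HTTP["host"] == "legacy.example.com" {
    proxy.server = ( "" => ( ( "host" => "127.0.0.1", "port" => 80 ) ) )
}
```

Moving one virtual host at a time out of such a block is one way to phase the migration.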

Now, we have two options: we can either filter the extensions we want Apache to handle, or we can filter the addresses we want Lighttpd to serve without asking Apache.

Actually, the first can be done in two ways. Assuming we want to give all addresses ending with .cgi and .php to Apache, we could either use the matching of proxy.server:

proxy.server = (
    ".cgi" => ( ( "host" => "127.0.0.1", "port" => 80 ) ),
    ".php" => ( ( "host" => "127.0.0.1", "port" => 80 ) )
)

or match by selector:

$HTTP["url"] =~ "\.(cgi|php)$" {
    proxy.server = ( "" => ( ( "host" => "127.0.0.1", "port" => 80 ) ) )
}

The second way also allows filtering by regular expression and negative filtering: just use !~ instead of =~.
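As an illustration of negative filtering, the following sketch lets Lighttpd serve common static file types itself and hands everything else to Apache. The extension list and the document root are assumptions; the document root would have to point at the same files Apache serves:

```
# Assumption: same directory as Apache's DocumentRoot (hypothetical path)
server.document-root = "/var/www/html"

# Everything that does NOT look like a static file goes to Apache on port 80
$HTTP["url"] !~ "\.(png|jpg|gif|ico|css|js)$" {
    proxy.server = ( "" => ( ( "host" => "127.0.0.1", "port" => 80 ) ) )
}
```

Extending the extension list step by step shifts more and more requests away from Apache.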

mod_perl, mod_php, and mod_python

There are no Lighttpd modules to embed scripting languages into Lighttpd (with the exception of mod_magnet, which embeds Lua) because this is simply not the Lighttpd way of doing things. Instead, we have the CGI, SCGI, and FastCGI interfaces to outsource this work to the respective interpreters.

Most mod_perl scripts are easily converted to FastCGI using CGI::Fast. Usually, our mod_perl script will look a lot like the following script:

use CGI;
my $q = CGI->new;
initialize(); # this might need to be done only once
process_query($q); # this should be done per request
print response($q); # this, too

Using the easiest way to convert to FastCGI:

use CGI::Fast; # instead of CGI
while (my $q = CGI::Fast->new) { # get requests in a while-loop
    initialize();
    process_query($q);
    print response($q);
}

If this runs, we may try to put the initialize() call outside of the loop to make our script run even faster than under mod_perl. However, this is just the basic case. There are mod_perl scripts that manipulate the Apache core or use special hooks, so these scripts can get a little more complicated to migrate.
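On the Lighttpd side, the converted script still has to be registered as a FastCGI backend. A minimal sketch, assuming the script is executable and lives at the hypothetical path shown, and that it should answer under the made-up URL prefix /app:

```
server.modules += ( "mod_fastcgi" )

fastcgi.server = (
    "/app" => ( (
        "bin-path"    => "/var/www/fcgi/app.pl",   # hypothetical script location
        "socket"      => "/tmp/perl-app.socket",   # hypothetical socket path
        "check-local" => "disable"                 # /app is not a file in the document root
    ) )
)
```

With bin-path set, Lighttpd starts the script itself and talks to it over the given socket.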

Migrating from mod_php to php-fcgi is easier: we do not need to change the scripts, just the configuration. This means that we do not get the benefits of an obvious request loop, but we can work around that by setting some global variables only if they are not already set. The security benefit comes from running the PHP interpreter in its own process, separate from the web server. Even for Apache, there are some alternatives to mod_php that try to provide more security, often with bad performance implications.
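The configuration change could look roughly like the following sketch. The location of the php-cgi binary, the socket path, and the process counts are assumptions that depend on the system:

```
server.modules += ( "mod_fastcgi" )

fastcgi.server = (
    ".php" => ( (
        "bin-path"  => "/usr/bin/php-cgi",          # assumed location of the FastCGI-enabled PHP binary
        "socket"    => "/tmp/php-fastcgi.socket",   # hypothetical socket path
        "max-procs" => 2,                           # illustrative value
        "bin-environment" => (
            "PHP_FCGI_CHILDREN"     => "4",         # illustrative values
            "PHP_FCGI_MAX_REQUESTS" => "1000"
        )
    ) )
)
```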

mod_python can be a little more complicated, because Apache calls out to the Python functions directly, converting form fields to function arguments on the fly. If we are lucky, our Python scripts already implement WSGI (Web Server Gateway Interface). In this case, we can just use a WSGI-FastCGI wrapper. Looking on the Web, I already found two: one standalone (http://svn.saddi.com/py-lib/trunk/fcgi.py), and one that is part of the PEAK project (http://peak.telecommunity.com/DevCenter/FrontPage). Otherwise, Python usually has excellent support for SCGI.

As with mod_perl, there are some internals that have to be moved into the configuration (for example, dynamic 404 pages; the directive for this is server.error-handler-404, which can also point to a CGI script). However, for basic scripts, we can use SCGI (either from http://www.mems-exchange.org/software/scgi/ or as a Python-only version from http://www.cherokee-project.com/download/pyscgi/). We also need to change import cgi to import scgi and change CGIHandler and CGIServer to SCGIHandler and SCGIServer, respectively.
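On the Lighttpd side, these two pieces might be wired up as in the sketch below. The URL prefix /pyapp, port 4000, and the error script name are made up for illustration, and the SCGI server is assumed to be started separately and already listening on that port:

```
server.modules += ( "mod_scgi", "mod_cgi" )

# Relay requests below /pyapp to the separately started SCGI server
scgi.server = (
    "/pyapp" => ( (
        "host"        => "127.0.0.1",
        "port"        => 4000,        # assumed port of the SCGI server
        "check-local" => "disable"    # /pyapp is not a file in the document root
    ) )
)

# Dynamic 404 page produced by a CGI script (hypothetical file name)
cgi.assign               = ( ".cgi" => "" )
server.error-handler-404 = "/error-404.cgi"
```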
