Introduction to nginx

July 31, 2013 - 12:00 am

So, what is nginx?

The best way to describe nginx (pronounced engine-x) is as an event-based multi-protocol reverse proxy. This sounds fancy, but these are not just buzzwords; the description actually affects how we approach configuring nginx. It also highlights some of the flexibility that nginx offers. While it is often used as a web server and an HTTP reverse proxy, it can also be used as an IMAP reverse proxy or even a raw TCP reverse proxy. Thanks to its plug-in ready code structure, we can use a large number of first- and third-party modules to implement a wide range of features, which makes nginx a good fit for many typical use cases.

A more accurate description would be to say that nginx is a reverse proxy first, and a web server second. I say this because it can help us visualize the request flow through the configuration file and rationalize how to achieve the desired configuration of nginx. The core difference this creates is that nginx works with URIs instead of files and directories, and based on that determines how to process the request. This means that when we configure nginx, we tell it what should happen for a certain URI rather than what should happen for a certain file on the disk.

A beneficial part of nginx being a reverse proxy is that it fits into a large number of server setups, and can handle many things that other web servers simply aren’t designed for. A popular question is “Why even bother with nginx when Apache httpd is available?”

The answer lies in the way the two programs are designed. The majority of Apache setups use prefork mode, where we spawn a certain number of processes and then embed our dynamic language in each process. This setup is synchronous, meaning that each process can handle only one request at a time, whether that request is for a PHP script or an image file.

In contrast, nginx uses an asynchronous event-based design where each spawned process can handle thousands of concurrent connections. The downside here is that nginx will, for security and technical reasons, not embed programming languages into its own process – this means that to handle those we will need to reverse proxy to a backend, such as Apache, PHP-FPM, and so on. Thankfully, as nginx is a reverse proxy first and foremost, this is extremely easy to do and still allows us major benefits, even when keeping Apache in use.

Let’s take a look at a use case where Apache is used as an application server, as described earlier, rather than just a web server. We have embedded PHP, Perl, or Python into Apache, which has the primary disadvantage of making each request costly. This is because the Apache process is kept busy until the request has been fully served, even if it’s a request for a static file. Our online service has gotten popular and we now find that our server cannot keep up with the increased demand. In this scenario, introducing nginx as a spoon-feeding layer would be ideal. With nginx sitting between our end users and Apache, an incoming request is reverse proxied to Apache if it is for a dynamic file, while nginx handles any static file requests itself. This means that we offload a lot of the request handling from the expensive Apache processes to the more lightweight nginx processes, and increase the number of end users we can serve before having to spend money on more powerful hardware.
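
As a rough sketch of this split, the following server block serves static files itself and hands everything else to Apache. The /static/ prefix and the Apache address 127.0.0.1:8080 are assumptions made for illustration, not values from a real setup.

```nginx
server {
    listen 80;
    server_name example.com;

    # URIs starting with /static/ are read from disk by nginx itself.
    # With this root, /static/logo.png maps to /var/www/website/static/logo.png.
    location /static/ {
        root /var/www/website;
    }

    # Everything else is reverse proxied to Apache
    # (assumed here to listen on 127.0.0.1:8080).
    location / {
        proxy_pass http://127.0.0.1:8080;
        # Pass the original host name and client address on to the backend.
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```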

Another example scenario is where we have an application being used from all over the world. We don’t have any static files, so we can’t easily offload a number of requests from Apache. In this use case, our PHP process is busy from the time the request comes in until the user has finished downloading the response. Sadly, not everyone in the world has fast internet and, as a result, the sending process could be busy for a relatively significant period of time. Let’s assume our visitor is on an old 56k modem with a maximum download speed of 5 KB per second; it will take them five seconds to download a 25 KB gzipped HTML file generated by PHP. That’s five seconds during which our process cannot handle any other request. When we introduce nginx into this setup, PHP is busy only for the short time it takes to generate the response, while nginx spends the five seconds transferring it to the end user. Because nginx is asynchronous, it will happily handle other connections in the meantime, and thus we significantly increase the number of concurrent requests we can handle.
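
The behaviour described above relies on nginx buffering the backend's response. A minimal sketch, meant to sit inside a server block, might look like the following; the backend address and the buffer sizes are illustrative assumptions, and buffering is already enabled by default.

```nginx
location / {
    # Hypothetical backend address.
    proxy_pass http://127.0.0.1:8080;

    # On by default; written out here only to make the behaviour visible.
    proxy_buffering on;

    # 8 buffers of 8k each (64k in total), enough to hold the
    # 25 KB response from the example in memory.
    proxy_buffers 8 8k;
}
```

With buffering on, nginx reads the response from the backend as quickly as the backend can send it, and then feeds it to the slow client at whatever pace the client can manage.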

In the previous two examples I used scenarios where nginx was used in front of Apache, but naturally this is not a requirement. nginx is capable of reverse proxying via, for instance, FastCGI, uwsgi, SCGI, HTTP, or even TCP (through a plugin), enabling backends such as PHP-FPM, Gunicorn, Thin, and Passenger.
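
For instance, passing PHP requests to PHP-FPM over FastCGI could be sketched as below. The fragment belongs inside a server block, and the address 127.0.0.1:9000 is an assumption; PHP-FPM is often configured to listen on a Unix socket instead.

```nginx
location ~ \.php$ {
    # Standard FastCGI parameter file shipped with nginx.
    include fastcgi_params;

    # Tell PHP-FPM which script file to execute.
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;

    # Assumed PHP-FPM listen address.
    fastcgi_pass 127.0.0.1:9000;
}
```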

Quick start – Creating your first virtual host

It’s finally time to get nginx up and running. To start out, let’s quickly review the configuration file. If you installed via a system package, the default configuration file location is most likely /etc/nginx/nginx.conf. If you installed from source and didn’t change the path prefix, nginx installs itself into /usr/local/nginx and places nginx.conf in the conf subdirectory. Keep this file open as a reference to help visualize many of the things described in this article.

Step 1 – Directives and contexts

To understand what we’ll be covering in this section, let me first introduce a bit of terminology that the nginx community at large uses. Two central concepts in the nginx configuration file are directives and contexts. A directive is basically just an identifier for one of the various configuration options. Contexts refer to the different sections of the nginx configuration file. This term is important because the documentation often states which contexts a directive is allowed in.

A glance at the standard configuration file should reveal that nginx uses a layered configuration format where blocks are denoted by curly brackets {}. These blocks are what are referred to as contexts.

The topmost context is called main, and is not denoted as a block but is rather the configuration file itself. The main context has only a few directives we’re really interested in, the two major ones being worker_processes and user. These directives handle how many worker processes nginx should run and which user/group nginx should run these under.
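
In a configuration file these two directives might look like the sketch below. The user name and the worker count are assumptions: the account nginx runs under differs between distributions, and the number of workers is commonly matched to the number of CPU cores.

```nginx
# Main context: these directives sit at the top level of nginx.conf,
# outside of any block.
user www-data;        # assumed account name, varies by distribution
worker_processes 4;   # assumed value, for example one worker per CPU core
```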

Within the main context there are two possible subcontexts, the first one being called events. This block handles directives that deal with the event-polling nature of nginx. Mostly we can ignore the directives in here, as nginx can configure them optimally on its own; however, there’s one directive which is interesting, namely worker_connections. This directive controls the number of connections each worker can handle. It’s important to note here that nginx is a terminating proxy, so if you HTTP proxy to a backend, such as Apache httpd, each proxied request uses up two connections: one to the client and one to the backend.
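
A minimal events block could look like this; the value 1024 is only an example.

```nginx
events {
    # Upper limit on simultaneous connections per worker process.
    # This counts connections to backends as well as to clients.
    worker_connections 1024;
}
```

As a rough calculation, assuming four worker processes, this allows up to 4 × 1024 = 4096 connections in total. If every request is proxied to a backend and therefore uses two connections, that leaves room for roughly 2048 concurrent clients.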

The second subcontext is the interesting one called http. This context deals with everything related to HTTP, and this is what we will be working with almost all of the time. While there are directives that are configured in the http context, for now we’ll focus on a subcontext within http called server. The server context is the nginx equivalent of a virtual host. This context is used to handle configuration directives based on the host name your sites are under.

Within the server context, we have another subcontext called location. The location context is what we use to match the URI. Basically, a request to nginx will flow through each of our contexts, matching first the server block with the hostname provided by the client, and secondly the location context with the URI provided by the client.
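
The nesting described so far can be sketched as a skeleton like the following. The /downloads/ prefix is a hypothetical example, and the location blocks are left empty on purpose.

```nginx
http {
    server {
        # Step 1: the server block is chosen by the host name
        # the client asked for.
        server_name example.com;

        # Step 2: within that server block, the location is chosen by URI.
        location / {
            # Handles any URI not claimed by a more specific location.
        }

        location /downloads/ {
            # Handles URIs that begin with /downloads/.
        }
    }
}
```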

Depending on the installation method, there might not be any server blocks in the nginx.conf file. Typically, system package managers take advantage of the include directive, which allows us to do an in-place inclusion into our configuration file. This allows us to separate out each virtual host and keep our configuration file more organized. If there aren’t any server blocks, check the bottom of the file for an include directive and look in the directory it includes from; it should have a file that contains a server block.
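
Such an include line might look like the sketch below. The directory shown is an assumption; the actual path depends on how the package was built, so check your own nginx.conf.

```nginx
http {
    # Pull in every file ending in .conf from this directory.
    # Each file typically holds one server block.
    include /etc/nginx/conf.d/*.conf;
}
```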

Step 2 – Define your first virtual host

Finally, let us define our first server block!

server {
    listen 80;
    server_name example.com;
    root /var/www/website;
}

That is basically all we need, and strictly speaking, we don’t even need to define which port to listen on, as port 80 is the default. However, it’s generally good practice to keep it in there, should we want to search for all virtual hosts on port 80 later on.

Summary

This article covered the important aspects of nginx: what it is, how its design differs from that of Apache, and how its configuration file is structured. It also walked through the configuration of a first virtual host in two simple steps, along with a configuration example.
