In this article by Dimitri Aivaliotis, author of Mastering NGINX – Second Edition, we look at the NGINX configuration file, which follows a very logical format. Learning this format and how to use each section is one of the building blocks that will help you create a configuration file by hand. Constructing a configuration involves specifying global parameters as well as directives for each individual section. These directives, and how they fit into the overall configuration file, are the main subject of this article. The goal is to understand how to create the right configuration file to meet your needs.

The basic configuration format

The basic NGINX configuration file is set up in a number of sections. Each section is delineated as shown:

<section> {
    <directive> <parameters>;
}

It is important to note that each directive line ends with a semicolon (;). This marks the end of the line. The curly braces ({}) actually denote a new configuration context, but we will refer to these as sections for the most part.

The NGINX global configuration parameters

The global section is used to configure the parameters that affect the entire server and is an exception to the format shown in the preceding section. The global section may include configuration directives, such as user and worker_processes, as well as sections, such as events. There are no opening and closing braces ({}) surrounding the global section.

The most important configuration directives in the global context are described below. These are the directives that you will be dealing with for the most part.

  • user – The user and group under which the worker processes run. If the group is omitted, a group name equal to that of the user is used.
  • worker_processes – The number of worker processes that will be started. These processes handle all the connections made by clients. Choosing the right number depends on the server environment, the disk subsystem, and the network infrastructure. A good rule of thumb is to set this equal to the number of processor cores for CPU-bound loads and to multiply this number by 1.5 to 2 for I/O-bound loads.
  • error_log – The file to which all errors are written. If no other error_log is given in a separate context, this log file will be used for all errors, globally. A second parameter to this directive indicates the level (debug, info, notice, warn, error, crit, alert, or emerg) at which errors are written to the log. Note that debug-level errors are only available if the --with-debug configuration switch was given at compilation time.
  • pid – The file where the process ID of the main process is written, overwriting the compiled-in default.
  • use – The connection processing method that should be used. This will overwrite the compiled-in default and must be contained in an events context, if used. It will not normally need to be overridden, except when the compiled-in default is found to produce errors over time.
  • worker_connections – The maximum number of simultaneous connections that a worker process may have open. This includes, but is not limited to, client connections and connections to upstream servers. This is especially important on reverse proxy servers; some additional tuning may be required at the operating system level in order to reach this number of simultaneous connections.

Here is a small example using each of these directives:

# we want nginx to run as user 'www'
user www;

# the load is CPU-bound and we have 12 cores
worker_processes  12;

# explicitly specifying the path to the mandatory error log
error_log  /var/log/nginx/error.log;

# also explicitly specifying the path to the pid file
pid        /var/run/nginx.pid;

# sets up a new configuration context for the 'events' module
events {

    # we're on a Solaris-based system and have determined that nginx
    # will stop responding to new requests over time with the default
    # connection-processing mechanism, so we switch to the second-best
    use /dev/poll;

    # the product of this number and the number of worker_processes
    # indicates how many simultaneous connections per IP:port pair are
    # accepted
    worker_connections  2048;

}

This section will be placed at the top of the nginx.conf configuration file.

Using the include files

Include files can be used anywhere in your configuration file to make it more readable and to enable you to reuse parts of your configuration. To use them, make sure that the files themselves contain syntactically correct NGINX configuration directives and blocks; then specify a path to those files:

include /opt/local/etc/nginx/mime.types;

A wildcard may appear in the path to match multiple files:

include /opt/local/etc/nginx/vhost/*.conf;

If the full path is not given, NGINX will search relative to its main configuration file.

A configuration file can easily be tested by calling NGINX as follows:

nginx -t -c <path-to-nginx.conf>

This command will test the configuration, including all files referenced by include directives, for syntax errors.

Sample configuration

The following code is an example of an HTTP configuration section:

http {

    include       /opt/local/etc/nginx/mime.types;
    default_type  application/octet-stream;
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout  65;
    server_names_hash_max_size 1024;

}

This context block would go after any global configuration directives in the nginx.conf file.

The virtual server section

Any context beginning with the keyword server is considered a virtual server section. It describes a logical separation of a set of resources that will be delivered under a different server_name directive. These virtual servers respond to HTTP requests and are contained within the http section.

A virtual server is defined by a combination of the listen and server_name directives. The listen directive defines an IP address/port combination or path to a UNIX-domain socket:

listen address[:port];
listen port;
listen unix:path;
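
For example, a minimal sketch of the three forms, assuming a server block inside the http context (the address, port, and socket path are placeholders, not values from the book):

# bind to a specific address and port
listen 127.0.0.1:8080;

# bind to a port on all addresses
listen 80;

# bind to a UNIX-domain socket instead of a TCP socket
listen unix:/var/run/nginx.sock;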

The listen directive uniquely identifies a socket binding under NGINX. There are a number of optional parameters that listen can take:

  • default_server – Defines this address/port combination as the default for requests bound here.
  • setfib – Sets the corresponding FIB for the listening socket. Only supported on FreeBSD, and not for UNIX-domain sockets.
  • backlog – Sets the backlog parameter in the listen() call. Defaults to -1 on FreeBSD and 511 on all other platforms.
  • rcvbuf – Sets the SO_RCVBUF parameter on the listening socket.
  • sndbuf – Sets the SO_SNDBUF parameter on the listening socket.
  • accept_filter – Sets the name of the accept filter to either dataready or httpready. Only supported on FreeBSD.
  • deferred – Sets the TCP_DEFER_ACCEPT option to use a deferred accept() call. Only supported on Linux.
  • bind – Makes a separate bind() call for this address/port pair. A separate bind() call will be made implicitly if any of the other socket-specific parameters are used.
  • ipv6only – Sets the value of the IPV6_V6ONLY parameter. Can only be set on a fresh start, and not for UNIX-domain sockets.
  • ssl – Indicates that only HTTPS connections will be made on this port. Allows for a more compact configuration.
  • so_keepalive – Configures TCP keepalive for the listening socket.
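
Here is a hedged sketch combining a few of these parameters; the address and backlog value are illustrative assumptions rather than recommendations:

server {
    # act as the default server for this address/port pair and
    # raise the listen() backlog (the values are only examples)
    listen 10.0.0.1:80 default_server backlog=1024;
    server_name www.example.com;
}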

 

The server_name directive is fairly straightforward and it can be used to solve a number of configuration problems. Its default value is “”, which means that a server section without a server_name directive will match a request that has no Host header field set. This can be used, for example, to drop requests that lack this header:

server {

    listen 80;

    return 444;

}

The nonstandard HTTP code, 444, used in this example will cause NGINX to immediately close the connection.

Besides a normal string, NGINX will accept a wildcard as a parameter to the server_name directive:

  • The wildcard can replace the subdomain part: *.example.com
  • The wildcard can replace the top-level domain part: www.example.*
  • A special form will match the subdomain or the domain itself: .example.com (matches *.example.com as well as example.com)
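
A short sketch of these forms (the hostnames are illustrative):

# matches blog.example.com, shop.example.com, and so on
server_name *.example.com;

# matches www.example.com, www.example.org, and so on
server_name www.example.*;

# matches example.com as well as any subdomain of it
server_name .example.com;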

A regular expression can also be used as a parameter to server_name by prepending the name with a tilde (~):

server_name ~^www\.example\.com$;
server_name ~^www(\d+)\.example\.(com)$;

The latter form is an example using captures, which can later be referenced (as $1, $2, and so on) in further configuration directives.
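
For instance, a minimal sketch that uses the first capture to select a per-site document root; the root path is an illustrative assumption:

server {
    listen 80;
    # $1 holds the digits captured by (\d+), $2 the captured top-level domain
    server_name ~^www(\d+)\.example\.(com)$;
    root /var/www/site$1;
}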

NGINX uses the following logic when determining which virtual server should serve a specific request:

  1. Match the IP address and port to the listen directive.
  2. Match the Host header field against the server_name directive as a string.
  3. Match the Host header field against the server_name directive with a wildcard at the beginning of the string.
  4. Match the Host header field against the server_name directive with a wildcard at the end of the string.
  5. Match the Host header field against the server_name directive as a regular expression.
  6. If all Host header matches fail, direct to the server whose listen directive is marked as default_server.
  7. If all Host header matches fail and there is no default_server, direct to the first server with a listen directive that satisfies step 1.

The default_server parameter can be used to handle requests that would otherwise go unhandled. It is therefore recommended to always set default_server explicitly so that these unhandled requests will be handled in a defined manner.

Besides this usage, default_server may also be helpful in configuring a number of virtual servers with the same listen directive. Any directives set here will be the same for all matching server blocks.
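
A minimal sketch, assuming two virtual servers sharing the same listen address (the names and paths are illustrative):

server {
    # catches requests whose Host header matches no other server_name
    listen 80 default_server;
    return 444;
}

server {
    listen 80;
    server_name www.example.com;
    root /var/www/example;
}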

Locations – where, when, and how

The location directive may be used within a virtual server section and indicates a URI that comes either from the client or from an internal redirect. Locations may be nested, with a few exceptions. They are used to process requests with as specific a configuration as possible.

A location is defined as follows:

location [modifier] uri {...}

Or it can be defined for a named location:

location @name {…}

A named location is only reachable from an internal redirect. It preserves the URI as it was before entering the location block. It may only be defined at the server context level.

The modifiers affect the processing of a location in the following way:

  • = – Uses an exact match and terminates the search.
  • ~ – Uses case-sensitive regular expression matching.
  • ~* – Uses case-insensitive regular expression matching.
  • ^~ – Stops processing before regular expressions are checked for a match of this location’s string, if it’s the most specific match. Note that this is not a regular expression match; its purpose is to preempt regular expression matching.
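
A short sketch of each modifier in use; the URIs are illustrative assumptions:

# exact match only for /robots.txt
location = /robots.txt {
}

# prefix match that skips the regular expression checks
location ^~ /static/ {
}

# case-sensitive regular expression match
location ~ \.php$ {
}

# case-insensitive regular expression match
location ~* \.(gif|jpg|png)$ {
}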

When a request comes in, the URI is checked against the most specific location as follows:

  • Locations without a regular expression are searched for the most-specific match, independent of the order in which they are defined.
  • Regular expressions are matched in the order in which they are found in the configuration file. The regular expression search is terminated on the first match. The most-specific location match is then used for request processing.

The comparison match described here is against decoded URIs; for example, a “%20” in a URI will match against a “ ” (space) specified in a location.

A named location may only be used by internally redirected requests.

The following directives are found only within a location:

  • alias – Defines another name for the location, as found on the filesystem. If the location is specified with a regular expression, alias should reference captures defined in that regular expression. The alias directive replaces the part of the URI matched by the location, so that the rest of the URI is searched for in that filesystem location. Using alias is fragile when moving bits of the configuration around, so using the root directive is preferred, unless the URI needs to be modified in order to find the file.
  • internal – Specifies a location that can only be used for internal requests (redirects defined in other directives, rewrite requests, error pages, and so on).
  • limit_except – Limits a location to the specified HTTP verb(s) (GET also includes HEAD).
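
A brief sketch of alias and limit_except; the paths and the allowed network are illustrative assumptions:

location /images/ {
    # a request for /images/photo.png is served from
    # /data/photos/photo.png rather than from <root>/images/photo.png
    alias /data/photos/;
}

location /admin/ {
    # every method except GET (and HEAD) is allowed only from this
    # network; all other clients receive a 403
    limit_except GET {
        allow 192.168.1.0/24;
        deny  all;
    }
}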

Additionally, a number of directives found in the http section may also be specified in a location. Refer to Appendix A, Directive Reference, for a complete list.

The try_files directive deserves special mention here. It may also be used in a server context, but will most often be found in a location. The try_files directive will do just that—try files in the order given as parameters; the first match wins. It is often used to match potential files from a variable and then pass processing to a named location, as shown in the following example:

location / {
    try_files $uri $uri/ @mongrel;
}

location @mongrel {
    proxy_pass http://appserver;
}

Here, an implicit directory index is tried if the given URI is not found as a file and then processing is passed on to appserver via a proxy. We will explore how best to use location, try_files, and proxy_pass to solve specific problems throughout the rest of the article.

Locations may be nested except in the following situations:

  • When the prefix is =
  • When the location is a named location

Best practice dictates that regular expression locations be nested inside the string-based locations. An example of this is as follows:

# first, we enter through the root
location / {

    # then we find a most-specific substring
    # note that this is not a regular expression
    location ^~ /css {

        # here is the regular expression that then gets matched
        location ~* /css/.*\.css$ {

        }

    }

}

Summary

In this article, we saw how the NGINX configuration file is built. Its modular nature is a reflection, in part, of the modularity of NGINX itself. A global configuration block is responsible for all aspects that affect the running of NGINX as a whole. There is a separate configuration section for each protocol that NGINX is responsible for handling. We may further define how each request is to be handled by specifying servers within those protocol configuration contexts (either http or mail) so that requests are routed to a specific IP address/port. Within the http context, locations are then used to match the URI of the request. These locations may be nested, or otherwise ordered to ensure that requests get routed to the right areas of the filesystem or application server.
