11 min read

Data protection and security are essential for the continuity of business. Data protection is not recommended, but it is required by the legal system. Sensitive data, such as user information, email addresses, geographical addresses, and payment information, should be protected against any data breach. There are several other topics related to data security, such as data privacy, retention, and loss prevention. In this article, we will look at authentication best practices in PostgreSQL including PostgreSQL host-based authentication, and proxy authentication strategies. 

There are several levels of data protection, often defined in the data protection policy and by the country’s legal system. A data protection policy often defines data dissemination to other parties, users authorized to access the data, and so on. Data should be protected on different levels, including transferring and encrypting data on storage devices. Data security is a huge topic and often there are data security managers dedicated only to these tasks.

This article is an excerpt taken from the book ‘Learning PostgreSQL 11 – Third Edition’ by Andrey Volkov,  and Salahadin Juba. The book explores the latest features in PostgreSQL 11 and will get you up and running with building efficient PostgreSQL database solutions from scratch. 

Authentication in PostgreSQL

Authentication answers the question: Who is the user? PostgreSQL supports several authentication methods, including the following:

  • Trust: Anyone who can connect to the server is authorized to access the database/databases as specified in the pg_hba.conf configuration file. Often used to allow connection using Unix domain socket on a single user machine to access the database. This method can also be used with TCP/IP, but it is rare to allow connection from any IP address other than the localhost. 
  • Ident: This works by getting the client’s operating system user name from an ident server and then using it to access the database server. This method is recommended for closed networks where client machines are subject to tight controls by system administrators.
  • Peer: This works in a similar manner to ident, but the client’s operating system username is obtained from the kernel.
  • GSSAPI: GSSAPI is an industry standard defined in RFC 2743. It provides automatic authentication (single sign-on).
  • Lightweight Directory Access Protocol (LDAP): The LDAP server is used only to validate the username/password pairs.
  • Password authentication: There are three methods as follows:
    • SCRAM-SHA-256: The strongest authentication method, introduced in PostgreSQL 10. This method prevents password sniffing on untrusted connections. The default password authentication method is MD5 to use this feature, the configuration parameter   password_encryption    should be changed to  scram-sha-256
    • MD5: MD5 has known limitations such as pre-computed lookup tables to crack password hashes.  Also, MD5 has only 4 billion unique hashes.  Finally, MD5 computation is very fast, thus brute force password guessing does not require a lot of CPU resources.  For new applications, it is only recommended using scram-sha-256. Also, PostgreSQL provides the means to migrate from scram-sha-256.
    • Password: This is not recommended to be used since passwords are sent to the server in a clear text format. 

There are other authentication methods not covered; the full list of supported authentication methods can be found on the PostgreSQL website.

To understand authentication, you need to have the following information:

  • Authentication is controlled via a pg_hba.conf file, where hba stands for host-based authentication.
  • It is good to know the default initial authentication settings shipped with PostgreSQL distribution.
  • The pg_hba.conf file is often located in the data directory, but it can also be specified in the postgresql.conf configuration file.
  • When changing the authentication, you need to send a SIGHUP signal, and this is done via several methods based on the PostgreSQL platform. Note that the user who sends the signal should be a superuser or the postgres, or a root system user on the Linux distribution; again, this depends on the platform. Here is an example of several ways to reload the PostgreSQL configuration:
psql -U postgres -c "SELECT pg_reload_conf();"
sudo service postgresql reload
sudo /etc/init.d/postgresql reload
sudo Kill -HUP <postgres process id>
sudo systemctl reload postgresql-11.service
  • The order of the pg_hba.conf records or entries is important. The session connection is compared with the pg_hba.conf records one by one until it is rejected or the end of the configuration file is reached.
  • Finally, it is important to check the PostgreSQL log files to determine whether there are errors after configuration reload.

PostgreSQL pg_hba.conf

As in postgresql.conf, the pg_hba.conf file is composed of a set of records, lines can be commented using the hash sign, and spaces are ignored. The structure of the pg_hba.conf file record is as follows:

host_type database user [IP-address| address] [IP-mask] auth-method [auth-options]

The host_type part of this query can be the following:

  • Local: This is used in Linux systems to allow users to access PostgreSQL using a Unix domain socket connection.
  • Host: This is to allow connections from other hosts, either based on the address or IP address, using TCP/IP with and without SSL encryption.
  • Hostssl: This is similar to the host, but the connection should be encrypted using SSL.
  • Hostnossl: This is also similar to host, but the connection should not be encrypted.

The database part of the query is the name of the database that the user would like to connect to. For flexibility, you could also use a comma-separated list to specify several databases, or you could use all to indicate that the user can access all the databases in the database cluster. Also, the same user and same role values can be used to indicate that the database name is the same as the username, or the user is a member of a role with the same name as the database.

The user part of the query specifies the database user’s name; again, the all value matches all users. The IP address, address, and IP subnet mask are used to identify the host from where the user tries to connect. The IP address can be specified using a Classless Inter-Domain Routing (CIDR) or dot-decimal notation. Finally, the password authentication methods can be trusted, MD5, reject, and so on.

The following are some typical examples of configuring a PostgreSQL authentication:

  • Example 1: Any user on the PostgreSQL cluster can access any database using the Unix domain socket, as shown in the following database table:
Local all all trust
  • Example 2: Any user on the PostgreSQL cluster can access any database using the local loop back IP address, as shown in the following database table:
Host all all trust
host all all ::1/128 trust
  • Example 3: All connections that come from the IP address are rejected, and the connections that come from the range are accepted, as shown in the following database table:
#TYPE    DATABASE    USER        ADDRESS                METHOD
Host all all reject
Host all all trust

PostgreSQL provides a very convenient way to view the rules defined in the pg_hba.conf file by providing a view called pg_hba_file_rules as follows:

postgres=# SELECT row_to_json(pg_hba_file_rules, true) FROM pg_hba_file_rules limit 1;
{"line_number":84, +
"type":"local", +
"database":["all"], +
"user_name":["all"], +
"address":null, +
"netmask":null, +
"options":null, +
(1 row)

Listen addresses

The listen_addresses option is defined in postgresql.conf. The PostgreSQL listen_addresses connection setting is used to identify the list of IP addresses that the server should listen to from client applications. The listen_addresses are comma-separated lists of hostnames or IP addresses. Changing this value requires a server restart. In addition, the following should be noted:

  • The default value is localhost which restricts direct connections to PostgreSQL cluster from network..
  • Giving an empty list means that the server should accept only a Unix socket connection
  • The value * indicates all
It is a common mistake for developers new to PostgreSQL to forget to change the listen_address. If a developer forgets to change it and tries to connect to PostgreSQL using TCP/IP from the network, the following error will be raised:  

  Connection refused
  Is the server running on host <host_ip> and accepting
  TCP/IP connections on port 5432?

Authentication best practices

Authentication best practices depend on the whole infrastructure set up, the application’s nature, the user’s characteristics, data sensitivity, and so on. For example, the following setup is common for start-up companies: the database application, including the database server, is hosted on the same machine and only used from one physical location by intracompany users.

Often, database servers are isolated from the world using firewalls; in this case, you can use the SCRAM-SHA-256 authentication method and limit the IP addresses so that the database server accepts connections within a certain range or set. Note that it is important not to use a superuser or database owner account to connect to the database because if this account was hacked, the whole database cluster would be exposed.

If the application server—business logic—and database server are not on the same machine, you can use a strong authentication method, such as LDAP and Kerberos. However, for small applications where the database server and application are on the same machine, the SCRAM-SHA-256 authentication method and limiting the listen to address to the localhost might be sufficient.

To authenticate an application, it is recommended to use only one user and try to reduce the maximum number of allowed connections using a connection pooling software to better tune the PostgreSQL resources. Another level of security might be needed in the application of business logic to distinguish between different login users. For real-world users, LDAP or Kerberos authentication is more desirable.

Furthermore, if the database server is accessed from the outer world, it is useful to encrypt sessions using SSL certificates to avoid packet sniffing.

You should also remember to secure database servers that trust all localhost connections, as anyone who accesses the localhost can access the database server.

Role system and proxy authentication

Often, when designing an application, a login role is used to configure database connections and connection tools. Another level of security needs to be implemented to ensure that the user who uses the application is authorized to perform a certain task. This logic is often implemented in application business logic.

The database’s role system can also be used to partially implement this logic by delegating the authentication to another role after the connection is established or reused, using the SET SESSION AUTHORIZATION statement or SET ROLE command in a transaction block, as follows:

postgres=# SELECT session_user, current_user;
session_user | current_user
postgres | postgres
(1 row)

postgres=# SET SESSION AUTHORIZATION test_user;
postgres=> SELECT session_user, current_user;
session_user | current_user
test_user | test_user
(1 row)

The SET ROLE requires a role membership, while SET SESSION AUTHORIZATION requires superuser privileges. Allowing an application to connect as a superuser is dangerous because the SET SESSION AUTHORIZATION and SET ROLE commands can be reset using the RESET ROLE and RESET SESSION commands, respectively, thereby allowing the application to gain superuser privileges.

To understand how the PostgreSQL role system can be used to implement authentication and authorization, we will use the role system and the car portal application. In the car portal application, several groups of users can be classified as web_app_user, public_user, registered_user, seller_user, and admin_user. The web_app_user is used to configure business logic connection tools; the public_user, registered_user, and seller_user are used to distinguish users. The public_user group can access only public information, such as advertisements, but cannot add ratings as registered_user nor create advertisements, since seller_user. admin_user is a super role to manage all of the application’s content, such as filtering out spams and deleting the users that do not adhere to the website’s policies. When the car web portal application connects to the database, the web_app_user user is used. After this, car_portal invokes the SET ROLE command based on the user class. This authentication method is known as proxy authentication.

The following examples demonstrate how a role system can be used to implement proxy authentication. The first step is to create roles and assign role memberships and privileges, as follows:

GRANT SELECT ON car_portal_app.advertisement_picture, car_portal_app.advertisement_rating , car_portal_app.advertisement TO public_user;
GRANT public_user TO web_app_user;
GRANT USAGE ON SCHEMA car_portal_app TO web_app_user, public_user;

The NOINHERIT option for the web_app_user does not allow the user to inherit the permissions of role membership; however, web_app_user can change the role to a public user, as in the following example:

$ psql car_portal -U web_app_user

car_portal=> SELECT * FROM car_portal_app.advertisement;
ERROR: permission denied for relation advertisement
car_portal=> SET ROLE public_user;
car_portal=> SELECT * FROM car_portal_app.advertisement;
advertisement_id | advertisement_date | car_id | seller_account_id
(0 rows)

car_portal=> SELECT session_user, current_user;
session_user | current_user
web_app_user | public_user
(1 row)

In this article, we looked at several authentication methods in PostgreSQL such as password and trust.  Finally, we looked at the role system and proxy authentication. If you enjoyed reading the article and want to learn more, be sure to check out the book ‘‘Learning PostgreSQL 11 – Third Edition’

Read Next

How to handle backup and recovery with PostgreSQL 11 [Tutorial]

Handling backup and recovery in PostgreSQL 10 [Tutorial]

Understanding SQL Server recovery models to effectively backup and restore your database

Tech writer at the Packt Hub. Dreamer, book nerd, lover of scented candles, karaoke, and Gilmore Girls.