PostgreSQL committer Stephen Frost shares his vision for PostgreSQL version 12 and beyond

PostgreSQL version 12 was released in October this year and has earned a strong reputation for being reliable, feature robust, and performant. During the PostgreSQL Conference for the Asia Pacific, PostgreSQL major contributor and committer, Stephen Frost talked about a number of new features available in PostgreSQL 12 including Pluggable Storage, Partitioning and performance improvements as well as SQL features. In this post we have tried to cover a short synopsis of his Keynote. The full talk is available on YouTube.

Want to learn how to build your own PostgreSQL applications?

PostgreSQL 12 has an array of interesting features such as advanced indexing, high availability, database configuration, database monitoring, to efficiently manage and maintain your database. If you are a database developer and want to leverage PostgreSQL 12, we recommend you to go through our latest book Mastering PostgreSQL 12 - Third Edition written by Hans-Jürgen Schönig. This book examines in detail the newly released features in PostgreSQL 12 to help you build efficient and fault-tolerant PostgreSQL applications.

[dropcap]S[/dropcap]tephen Frost is the CTO of Crunchy Data. As a PostgreSQL major contributor he has implemented ‘roles support’ in version 8.1 to replace the existing user/group system, SQL column-level privileges in version 8.4, and Row Level Security in PostgreSQL 9.5. He has also spoken at numerous conferences, including pgConf.EU, pgConf.US, PostgresOpen, SCALE and others.

Stephen Frost on PostgreSQL 12 features

Pluggable Storage

This release introduces the pluggable table storage interface, which allows developers to create their own methods for storing data.

Postgres before version 12 had one storage engine - one primary heap. All indexes were secondary indexes which means that they referred directly into pointers on disk.

Also, this heap structure was row-based. Every row has a big header associated with the row which may cause issues when storing very narrow tables (which have two or fewer columns included in it.)

PostgreSQL 12 now has the ability to have multiple different storage formats underneath - the pluggable storage. This new feature is going to be the basis for columnar storage coming up probably in v13 or v14. It's also going to be the basis for Z heap - which is an alternative heap that is going to allow in-place updates and uses an undo log instead of using the redo log that PostgreSQL has.

Version 12 is building on the infrastructure for pluggable storage, and the team does not have anything user-facing yet. It's not going to be until v13 and later that they will actually have new storage mechanisms that are built on top of the pluggable storage feature.

Partitioning improvements

Postgres is adding in a whole bunch of new features to make working with partitions in Postgres easier to deal with.

Postgres 12 has major improvements in declarative partitioning capability which makes working with partitions more effective and easier to deal with. Partition selection has dramatically improved especially when selecting from a few partitions out of a large set.

Postgres 12 has the ability to ATTACH/DETATCH CONCURRENTLY. This is the ability to, on the fly, attach a partition and detach a partition without having to take very heavy locks. You can add new partitions to your partitioning scheme without having to take any outage or downtime or have any impact on the ongoing operations of the system.

This release also increases the number of partitions. The initial declarative partitioning patch made planning slow when you got over a few hundred partitions. This is now fixed with a much faster planning methodology for dealing with partitions.

Postgres 12 allows Multi-INSERT during COPY statements into partitioned tables. COPY is a way in which you can bulk load data into Postgres. This feature makes it much faster to COPY into partitioned tables. There is also a new function pg_partition_tree to display partition info.

Performance improvements and SQL features

Parallel Query with SERIALIZABLE

Parallel Query has been in Postgres since version 9.6 but it did not work with the serializable isolation level. Serializable is actually the highest level of isolation that you can have inside of Postgres. With Postgres 12, you have the ability to have a parallel query with serializable and have that highest level of isolation. This increases the number of places that parallel query can be used. This also allows application authors to worry less about concurrency because serializable in Postgres provides true serializability that exists in very few databases.

Faster float handling

Postgres 12 has a new library for converting floating point into text. This provides a significant speedup for many workloads where you're doing text-based transfer of data. Although it may result in slightly different (possibly more correct) output.

Partial de-TOAST/decompress a value

Historically to access any compressed TOAST value, you had to decompress the whole thing into memory. This was not very ideal in situations where you wanted access to, only the front of it.

Partial de-TOAST allows decompressing of a section of the TOAST value. This also gives a great improvement in performance for cases like:

PostGIS geometry/geography- data at the front can be used for filtering

Pulling just the start of a text string

COPY FROM with WHERE

Postgres 12 now has a WHERE clause supported by a COPY FROM statement. This allows you to filter data/records while importing. Earlier it was done using the file_fdw, but it was tedious as it required creating a foreign table.

Create or Replace Aggregate

This features allows an aggregate to either be created, if it does not exist, or replaced it it does. It makes extension upgrade scripts much simpler. This feature was requested specifically by the Postgres community.

Inline Common Table Expressions

Not having Inline CTEs was seen as an optimization barrier. From version 12, Postgres, by default, inlines the CTEs if it can. It also supports the old behavior so in the event that you actually want CTE to be an optimization barrier, you can still do that. You just have to specify WITH MATERIALIZED when you go to write your CTE.

SQL/JSON improvements

There is also progress made towards supporting the SQL/JSON standard

Added a number of json_path functions
- json_path_exists
- json_path_match
- json_path_query
- json_path_query_array
- json_path_query_first

Added new operators for working with json:
- jsonb @? Jsonpath - wrapper for jsonpath_exists
- jsonb @? Jsonpath - wrapper for jsonpath_predicate

Index support should also be added for these operators soon

Recovery.conf moved into postgresql.conf

Recovery.conf is no longer available in Postgresql 12 and all options are moved to postgresql.conf. This allows changing recovery parameters via ALTER SYSTEM.

This feature increases flexibility meaning that it allows changing primary via ALTER SYSTEM/reload. However, this is a disruptive change. Every high-level environment will change but it reduces the fragility of high-level solutions moving forward. A new pg_promote function is added to allow promoting a replica from SQL.

Control SSL protocol

With Postgres 12, you can now control SSL protocols. Older SSL protocols are required to be disabled for security reasons. They were enforced previously with FIPS mode. They are now addressed in CIS benchmark/ STIG updates.

Covering GIST indexes

GIST indexes can now also use INCLUDE. These are useful for adding columns to allow index-only queries. It also allows including columns that are not part of the search key.

Add CSV output mode to psql

Previously you could get CSV but you had to do that by taking your query inside a copy statement. Now you can use the new pset format for CSV output from psql. It returns data in each row in CSV format instead of tabular format.

Add option to sample queries

This is a new log_statement_sample_rate parameter which allows you to set log_min duration to be very very low or zero. Logging all statements is very expensive as it slows down the whole system and you end up having a backlog of processes trying to write into the logging system. The new log_statement_sample_rate parameter includes only a sample of those queries in the output rather than logging every query. The log_min_duration_statement excludes very fast queries. It helps with analysis in environments with lots of fast queries.

New HBA option called clientcert=verify-full

This new HBA option allows you to do a two-factor authentication where one of the factors is a certificate and the other one might be a password or something else (PAM, LDAP, etc). It gives you the ability to say that every user has to have a client-side certificate and that the client-side certificate must be validated by the server on connection and have to provide a password. It works with non-cert authentication methods and requires client-side certificates to be used.

In his talk, Stephen also answered commonly asked questions about Postgres, watch the full video to know more.

You can read about other performance and maintenance enhancements in PostgreSQL 12 on the official blog. To learn advanced PostgreSQL 12 concepts with real-world examples and sample datasets, go through the book Mastering PostgreSQL 12 - Third Edition by Hans-Jürgen Schönig.

Introducing PostgREST, a REST API for any PostgreSQL database written in Haskell

Percona announces Percona Distribution for PostgreSQL to support open source databases

Wasmer’s first Postgres extension to run WebAssembly is here!