PostgreSQL 9 Admin Cookbook
Over 80 recipes to help you run an efficient PostgreSQL 9.0 database
- Administer and maintain a healthy database
- Monitor your database ensuring that it performs as quickly as possible
- Tips for backup and recovery of your database
PostgreSQL is a feature-rich general purpose database management system. It’s a complex piece of software, but every journey begins with the first step.
We start with your first connection. Many people fall at the first hurdle, so we try not to skip too swiftly past that. We move on quickly to enabling remote users, and from there to access through GUI administration tools.
We also introduce the psql query tool, which is the tool used for loading our sample database.
Introducing PostgreSQL 9
PostgreSQL is an advanced SQL database server, available on a wide range of platforms.
One of the clearest benefits of PostgreSQL is that it is open source, meaning that you have a very permissive license to install, use, and distribute PostgreSQL without paying anyone fees or royalties. On top of that, PostgreSQL is well-known as a database that stays up for long periods, and requires little or no maintenance in many cases. Overall, PostgreSQL provides a very low total cost of ownership.
PostgreSQL is also noted for its huge range of advanced features, developed over the course of more than 20 years continuous development and enhancement. Originally developed by the Database Research group at the University of California, Berkeley, PostgreSQL is now developed and maintained by a huge army of developers and contributors. Many of those contributors have full-time jobs related to PostgreSQL, working as designers, developers, database administrators, and trainers. Some, but not many, of those contributors work for companies that specialize in services for PostgreSQL, such as Hannu and me. No single company owns PostgreSQL, nor are you required, or even encouraged, to register your usage.
PostgreSQL has the following main features:
- Excellent SQL Standards compliance up to SQL 2008
- Client-server architecture
- Highly concurrent design where readers and writers don’t block each other
- Highly configurable and extensible for many types of application
- Excellent scalability and performance with extensive tuning features
What makes PostgreSQL different?
The PostgreSQL project focuses on the following objectives:
- Robust, high-quality software with maintainable, well-commented code
- Low maintenance administration for both embedded and enterprise use
- Standards-compliant SQL, interoperability, and compatibility
- Performance, security, and high availability
What surprises many people is that PostgreSQL’s feature set is more comparable with Oracle or SQL Server than it is with MySQL. The only connection between MySQL and PostgreSQL is that those two projects are open source; apart from that, the features and philosophies are almost totally different.
One of the key features of Oracle since Oracle 7 has been “snapshot isolation”, where readers don’t block writers, and writers don’t block readers. You may be surprised to learn that PostgreSQL was the first database to be designed with this feature, and offers a full and complete implementation. PostgreSQL names this Multi-Version Concurrency Control (MVCC).
PostgreSQL is a general-purpose database management system. You defi ne the database that you would like to manage with it. PostgreSQL offers you many ways to work. You can use a “normalized database model”, you can utilize extensions such as arrays and record subtypes, or you can use a fully dynamic schema using an extension named hstore . PostgreSQL also allows you to create your own server-side functions in one of a dozen different languages.
PostgreSQL is highly extensible, so you can add your own datatypes, operators, index types, and functional languages. For example, you can override different parts of the system using plugins to alter the execution of commands or add a new optimizer.
All of these features offer a huge range of implementation options to software architects. There are many ways out of trouble when building applications and maintaining them over long periods of time.
In the early days, when PostgreSQL was still a research database, the focus was solely on cool new features. Over the last 15 years, enormous amounts of code have been rewritten and improved, giving us one of the most stable, large, software servers available for operational use.
You may also read that PostgreSQL was, or is, slower than My Favorite DBMS, whichever one that is. It’s been a personal mission of mine over the last six years to improve server performance and the team have been successful in making the server highly performant and very scalable. That gives PostgreSQL enormous headroom for growth.
Who is using PostgreSQL? Prominent users include Apple, BASF, Genentech, IMDB.com, Skype, NTT, Yahoo, and The National Weather Service. PostgreSQL receives well in excess of 1 million downloads per year, according to data submitted to the European Commission, who concluded “…PostgreSQL, is considered by many database users to be a credible alternative…
We need to mention one last thing. When PostgreSQL was fi rst developed, it was named Postgres, and so many aspects of the project still refer to the word “postgres”. For example, the default database is named postgres, and the software is frequently installed using the postgres userid. As a result, people shorten the name PostgreSQL to simply Postgres, and in many cases people use the two names interchangeably.
PostgreSQL is pronounced as “post-grez-q-l”. Postgres is pronounced as “post-grez”.
Some people get confused, and refer to “Postgre”, which is hard to say, and likely to confuse people. Two names are enough, so please don’t use a third name!
PostgreSQL is 100% open source software.
PostgreSQL is freely available to use, alter, or redistribute in any way you choose. PostgreSQL’s license is an approved open source license very similar to the BSD (Berkeley Distribution Software) license, though only just different enough that it is now known as TPL (The PostgreSQL License).
How to do it…
PostgreSQL is already in use by many different application packages, and so you may already find it installed on your servers. Many Linux distributions include PostgreSQL as part of the basic installation, or include it with the installation disk.
One thing to be wary of is that the version of PostgreSQL included may not be the latest release. It will typically be the latest major release that was available when that operating system release was published. There is usually no good reason to stick at that level—there is no increased stability implied there and later production versions are just as well-supported by the various Linux distributions.
If you don’t yet have a copy, or you don’t have the latest version, you can download the source code or download binary packages for a wide variety of operating systems from the following URL:
Installation details vary significantly from platform-to-platform and there aren’t any special tricks or recipes to mention. Please, just follow the installation guide, and away you go. We’ve consciously avoided describing the installation processes here to make sure we don’t garble or override the information published to assist you.
If you would like to receive e-mail updates of the latest news, then you can subscribe to the PostgreSQL announce mailing list, which contains updates from all the vendors that support PostgreSQL. You’ll get a few e-mails each month about new releases of core PostgreSQL and related software, conferences, and user group information. It’s worth keeping in touch with developments.
For more information about the PostgreSQL announce mailing list, visit the following URL:
How it works…
Many people ask questions, such as “How can this be free?”, “Are you sure I don’t have to pay someone?”, or “Who gives this stuff away for nothing?”
Open source applications such as PostgreSQL work on a community basis, where many contributors perform tasks that make the whole process work. For many of those people, their involvement is professional, rather a hobby, and they can do this because there is generally a great value for both contributors and their employers alike.
You might not believe it. You don’t have to because It Just Works.
Remember that PostgreSQL is more than just the core software. There is a huge range of websites offering add-ons, extensions, and tools for PostgreSQL. You’ll also fi nd an army of bloggers describing useful tricks and discoveries that will help you in your work.
And, there is a range of professional companies able to offer you help when you need it.
Connecting to PostgreSQL server
How do we access PostgreSQL?
Connecting to the database is most people’s first experience of PostgreSQL, so we want to make it a good one. So, let’s do it, and fix any problems we have along the way. Remember that a connection needs to be made securely, so there may be some hoops for us to jump through to ensure that the data we wish to access is secure.
Before we can execute commands against the database, we need to connect to the database server, giving us a session.
Sessions are designed to be long-lived, so you connect once, perform many requests, and then eventually disconnect. There is a small overhead during connection. That may become noticeable if you connect/disconnect repeatedly, so you may wish to investigate the use of connection pools. Connection pools allow pre-connected sessions to be served quickly to you when you wish to reconnect.
First, catch your database. If you don’t know where it is, we’ll probably have difficulty accessing it. There may be more than one, and you’ll need to know the right database to access, and have the authority to connect to it.
How to do it…
You need to specify the following fi ve parameters to connect to PostgreSQL:
- host or host address
- database name
- password (or other means of authentication, if any)
To connect, there must be a PostgreSQL server running on host, listening on port number port. On that server, a database named dbname and user must also exist. The host must explicitly allow connections from your client—this is explained in the next recipe, and you must also pass authentication using the method the server specifi es. For example, specifying a password won’t work if the server has requested a different form of authentication.
Almost all PostgreSQL interfaces use the libpq interface library . When using libpq, most of the connection parameter handling is identical, so we can just discuss that once.
If you don’t specify the preceding parameters, we look for values set through environment variables, which are as follows:
- PGHOST or PGHOSTADDR
- PGPORT (or set to 5432 if this is not set)
- PGPASSWORD (though this one is defi nitely not recommended)
If you specify the first four parameters somehow, but not the password, then we look for a password file.
Some PostgreSQL interfaces use the client-server protocol directly, so the way defaults are handled may differ. The information we need to supply won’t vary signifi cantly, so please check the exact syntax for that interface.
How it works…
The PostgreSQL server is a client-server database. The system it runs on is known as the host. We can access the PostgreSQL server remotely through the network. However, we must specify the host, which is a hostname, or a hostaddr , which is an IP address. We can specify a host of “localhost” if we wish to make a TCP/IP connection to the same system. It is often better to use a Unix socket connection, which is attempted if the host begins with a slash (/) and the name is presumed to be a directory name (default is /tmp).
On any system, there can be more than one database server. Each database server listens on exactly one “well-known” network port , which cannot be shared between servers on the same system. The default port number for PostgreSQL is 5432, which has been registered with IANA, and is uniquely assigned to PostgreSQL. (You can see it used in the /etc/services file on most *nix servers). The port number can be used to uniquely identify a specific database server if many exist.
A database server is also sometimes known as a “database cluster”, because the PostgreSQL server allows you to define one or more databases on each server. Each connection request must identify exactly one database identifi ed by its dbname. When you connect, you will only be able to see database objects created within that database.
A database user is used to identify the connection. By default, there is no limit on the number of connections for a particular user. In more recent versions of PostgreSQL, users are referred to as login roles, though many clues remind us of the earlier naming, and it still makes sense in many ways. A login role is a role that has been assigned the CONNECT privilege.
Each connection will typically be authenticated in some way. This is defined at the server, so is not optional at connection time if the administrator has confi gured the server to require authentication.
Once you’ve connected, each connection can have one active transaction at a time and one fully active statement at any time.
The server will have a defined limit on the number of connections it can serve, so a connection request can be refused if the server is oversubscribed.
Inspecting your connection information
If you want to confirm you’ve connected to the right place and in the right way, you can execute some or all of the following commands:
This shows the port on which the server is listening.
Shows the current database.
This shows the current userid.
Shows the IP address of the server that accepted the connection.
A user’s password is not accessible using general SQL for obvious reasons.
You may also need the following:
There are many other snippets of information required to understand connections. Some of those are mentioned in this article. For further details, please consult the PostgreSQL server documentation.