(For more resources on Phython, see here.)
The purpose of the archiving methods covered in this article is to allow you, as the developer, to back up databases that you use for your work without having to rely on the database administrator. As noted later in the article, there are more sophisticated methods for backups than we cover here, but they involve system-administrative tasks that are beyond the remit of any development post and are thus beyond the scope of this article.
Every database needs a backup plan
When archiving a database, one of the critical questions that must be answered is how to take a snapshot backup of the database without having users change the data in the process. If data changes in the midst of the backup, it results in an inconsistent backup and compromises the integrity of the archive. There are two strategic determinants for backing up a database system:
- Offline backups
- Live backups
Which you use depends on the dynamics of the system in question and the import of the data being stored. In this article, we will look at each in turn and the way to implement them.
Offline backups are done by shutting down the server so the records can be archived without the fear of them being changed by the user. It also helps to ensure the server shut down gracefully and that errors were avoided. The problem with using this method on most production systems is that it necessitates a temporary loss of access to the service. For most service providers, such a consequence is anathema to the business model.
The value of this method is that one can be certain that the database has not changed at all while the backup is run. Further, in many cases, the backup is performed faster because the processor is not simultaneously serving data. For this reason, offline backups are usually performed in controlled environments or in situations where disruption is not critical to the user. These include internal databases, where administrators can inform all users about the disruption ahead of time, and small business websites that do not receive a lot of traffic.
Offline backups also have the benefit that the backup is usually held in a single file. This can then be used to copy a database across hosts with relative ease.
Shutting down a server obviously requires system administrator-like authority. So creating an offline backup relies on the system administrator shutting down the server. If your responsibilities include database administration, you will also have sufficient permission to shut down the server.
Live backups occur while the server continues to accept queries from users, while it’s still online. It functions by locking down the tables so no new data may be written to them. Users usually do not lose access to the data and the integrity of the archive, for a particular point in time is assured.
Live backups are used by large, data-intensive sites such as Nokia’s Ovi services and Google’s web services. However, because they do not always require administrator access of the server itself, these tend to suit the backup needs of a development project.
Choosing a backup method
After having determined whether a database can be stopped for the backup, a developer can choose from three methods of archiving:
- Copying the data files (including administrative files such as logs and tablespaces)
- Exporting delimited text files
- Backing up with command-line programs
Which you choose depends on what permissions you have on the server and how you are accessing the data.
MySQL also allows for two other forms of backup: using the binary log and by setting up replication (using the master and slave servers). To be sure, these are the best ways to back up a MySQL database. But, both of these are administrative tasks and require system-administrator authority; they are not typically available to a developer. However, you can read more about them in the MySQL documentation. Use of the binary log for incremental backups is documented at:
Setting up replication is further dealt with at:
Copying the table files
The most direct way to back up database files is to copy from where MySQL stores the database itself. This will naturally vary based on platform. If you are unsure about which directory holds the MySQL database files, you can query MySQL itself to check:
mysql> SHOW VARIABLES LIKE 'datadir';
Alternatively, the following shell command sequence will give you the same information:
$ mysqladmin variables | grep datadir
| datadir | /var/lib/mysql/ |
Note that the location of administrative files, such as binary logs and InnoDB tablespaces are customizable and may not be in the data directory.
If you do not have direct access to the MySQL server, you can also write a simple Python program to get the information:
mydb = MySQLdb.connect('<hostname>',
cursor = mydb.cursor()
runit = cursor.execute("SHOW VARIABLES LIKE 'datadir'")
results = cursor.fetchall()
print "%s: %s" %(cursor.fetchone())
Slight alteration of this program will also allow you to query several servers automatically. Simply change the login details and adapt the output to clarify which data is associated with which results.
Locking and flushing
If you are backing up an offline MyISAM system, you can copy any of the files once the server has been stopped. Before backing up a live system, however, you must lock the tables and flush the log files in order to get a consistent backup at a specific point. These tasks are handled by the LOCK TABLES and FLUSH commands respectively. When you use MySQL and its ancillary programs (such as mysqldump) to perform a backup, these tasks are performed automatically. When copying files directly, you must ensure both are done. How you apply them depends on whether you are backing up an entire database or a single table.
The LOCK TABLES command secures a specified table in a designated way. Tables can be referenced with aliases using AS and can be locked for reading or writing. For our purposes, we need only a read lock to create a backup. The syntax looks like this:
LOCK TABLES <tablename> READ;
This command requires two privileges: LOCK TABLES and SELECT.
It must be noted that LOCK TABLES does not lock all tables in a database but only one. This is useful for performing smaller backups that will not interrupt services or put too severe a strain on the server. However, unless you automate the process, manually locking and unlocking tables as you back up data can be ridiculously inefficient.
The FLUSH command is used to reset MySQL’s caches. By re-initiating the cache at the point of backup, we get a clear point of demarcation for the database backup both in the database itself and in the logs. The basic syntax is straightforward, as follows:
FLUSH <the object to be reset>;
Use of FLUSH presupposes the RELOAD privilege for all relevant databases. What we reload depends on the process we are performing. For the purpose of backing up, we will always be flushing tables:
How we “flush” the tables will depend on whether we have already used the LOCK TABLES command to lock the table. If we have already locked a given table, we can call FLUSH for that specific table:
FLUSH TABLES <tablename>;
However, if we want to copy an entire database, we can bypass the LOCK TABLES command by incorporating the same call into FLUSH:
FLUSH TABLES WITH READ LOCK;
This use of FLUSH applies across the database, and all tables will be subject to the read lock. If the account accessing the database does not have sufficient privileges for all databases, an error will be thrown.
Unlocking the tables
Once you have copied the files for a backup, you need to remove the read lock you imposed earlier. This is done by releasing all locks for the current session:
Restoring the data
Restoring copies of the actual storage files is as simple as copying them back into place. This is best done when MySQL has stopped, lest you risk corruption. Similarly, if you have a separate MySQL server and want to transfer a database, you simply need to copy the directory structure from the one server to another. On restarting, MySQL will see the new database and treat it as if it had been created natively. When restoring the original data files, it is critical to ensure the permissions on the files and directories are appropriate and match those of the other MySQL databases.
Delimited backups within MySQL
MySQL allows for exporting of data from the MySQL command line. To do so, we simply direct the output from a SELECT statement to an output file.
Using SELECT INTO OUTFILE to export data
Using sakila, we can save the data from film to a file called film.data as follows:
SELECT * INTO OUTFILE 'film.data' FROM film;
This results in the data being written in a tab-delimited format. The file will be written to the directory in which MySQL stores the sakila data. Therefore, the account under which the SELECT statement is executed must have the FILE privilege for writing the file as well as login access on the server to view it or retrieve it. The OUTFILE option on SELECT can be used to write to any place on the server that MySQL has write permission to use. One simply needs to prepend that directory location to the file name. For example, to write the same file to the /tmp directory on a Unix system, use:
SELECT * INTO OUTFILE '/tmp/film.data' FROM film;
Windows simply requires adjustment of the directory structure accordingly.
Using LOAD DATA INFILE to import data
If you have an output file or similar tab-delimited file and want to load it into MySQL, use the LOAD DATA INFILE command. The basic syntax is:
LOAD DATA INFILE '<filename>' INTO TABLE <tablename>;
For example, to import the film.data file from the /tmp directory into another table called film2, we would issue this command:
LOAD DATA INFILE '/tmp/film.data' INTO TABLE film2;
Note that LOAD DATA INFILE presupposes the creation of the table into which the data is being loaded. In the preceding example, if film2 had not been created, we would receive an error. If you are trying to mirror a table, remember to use the SHOW CREATE TABLE query to save yourself time in formulating the CREATE statement.
This discussion only touches on how to use LOAD DATA INFILE for inputting data created with the OUTFILE option of SELECT. But, the command handles text files with just about any set of delimiters. To read more on how to use it for other file formats, see the MySQL documentation at: