In this article by Oskar Hane, author of the book, Build Your Own PaaS with Docker, we'll cover the following topics:

Data volumes

Creating a data volume image

Host on GitHub

Publishing on Docker Registry Hub

Running a data volume container

Passing parameters to containers

Creating a parameterized image

Data volumes

There are two ways in which we can mount external volumes on our containers. A data volume lets you share data between containers, and the data inside the data volume is untouched if you update, stop, or even delete your service container.

A data volume is mounted with the -v option in the docker run statement:

docker run -v /host/dir:/container/dir

You can add as many data volumes as you want to a container, simply by adding multiple -v directives.

A very good thing about data volumes is that the containers that get data volumes passed into them don't know about it, and don't need to know about it either. No changes are needed for the container; it works just as if it were writing to the local filesystem. You can override existing directories inside containers, which is a common thing to do. One use of this is to keep the web root (usually at /var/www inside the container) in a directory on the Docker host.

Mounting a host directory as a data volume

You can mount a directory (or file) from your host on your container:

docker run -d --name some-wordpress -v /home/web/wp-one:/var/www wordpress

This will mount the host's local directory, /home/web/wp-one, as /var/www on the container. If you want to give the container only the read permission, you can change the directive to -v /home/web/wp-one:/var/www:ro where the :ro is the read-only flag.

It's not very common to use a host directory as a data volume in production, since data in a directory isn't very portable. But it's very convenient when testing how your service container behaves when the source code changes.

Any change you make in the host directory is immediately visible in the container's mounted data volume.
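As a rough sketch of the options above, the following helper assembles a docker run call with one -v flag per mount, including a read-only one. The function name run_with_mounts and the DOCKER override variable are assumptions made for illustration; setting DOCKER=echo prints the arguments instead of starting a container.

```shell
# Hypothetical helper: start a detached container with several host mounts.
# DOCKER can be overridden (for example DOCKER=echo) for a dry run; that hook
# is an assumption of this sketch, not a Docker feature.
run_with_mounts() {
  name="$1"
  image="$2"
  shift 2
  # Rewrite the positional parameters from "<mount>..." to "-v <mount>...".
  for mount in "$@"; do
    set -- "$@" -v "$mount"
    shift
  done
  ${DOCKER:-docker} run -d --name "$name" "$@" "$image"
}
```

A call such as run_with_mounts some-wordpress wordpress /home/web/wp-one:/var/www /home/web/wp-conf:/etc/wp-conf:ro would mount the web root read-write and a second directory read-only (the second path is hypothetical).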

Mounting a data volume container

A more common way of handling data is to use a container whose only task is to hold data. The services running in the container should be as few as possible, thus keeping it as stable as possible.

Data volume containers can have exposed volumes via the Dockerfile's VOLUME keyword, and these volumes will be mounted on the service container while using the data volume container with the --volumes-from directive.

A very simple Dockerfile with a VOLUME directive can look like this:

FROM ubuntu:latest
VOLUME ["/var/www"]

A container using the preceding Dockerfile will mount /var/www. To mount the volumes from a data container onto a service container, we create the data container and then mount it, as follows:

docker run -d --name data-container our-data-container
docker run -d --name some-wordpress --volumes-from data-container wordpress

Backup and restore data volumes

Since the data in a data volume is shared between containers, it's easy to access the data by mounting it onto a temporary container. Here's how you can create a .zip file (from your host) from the data inside a data volume container that has VOLUME ["/var/www"] in its Dockerfile:

docker run --volumes-from data-container -v $(pwd):/host ubuntu zip -r /host/data-containers-www /var/www

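This creates a .zip file named data-containers-www.zip, containing what was in the /var/www directory of the data container, and places it in your current host directory. The command assumes that the zip utility is available in the image you run; if it is not, tar is a common alternative.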
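The command above covers the backup half. One possible way to handle both directions is sketched below, using tar rather than zip because tar and gzip ship with the ubuntu base image. The function names, the archive name www-backup.tar.gz, and the DOCKER dry-run variable are assumptions made for this sketch.

```shell
# Hypothetical helpers; DOCKER=echo gives a dry run (an assumption of this sketch).

# Archive /var/www from the data container ($1) into the current host directory.
backup_www() {
  ${DOCKER:-docker} run --rm --volumes-from "$1" -v "$(pwd)":/host ubuntu \
    tar czf /host/www-backup.tar.gz -C / var/www
}

# Unpack the archive from the current host directory back into /var/www.
restore_www() {
  ${DOCKER:-docker} run --rm --volumes-from "$1" -v "$(pwd)":/host ubuntu \
    tar xzf /host/www-backup.tar.gz -C /
}
```

Running backup_www data-container and later restore_www data-container from the same host directory would round-trip the contents of the volume; the --rm flag removes the temporary container once tar has finished.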

Creating a data volume image

Since our data volume container will just hold our data, we should keep it as small as possible to start with so that it doesn't take lots of unnecessary space on the server. The data inside the container can, of course, grow to be as big as the space on the server's disk. We don't need anything fancy at all; we just need a working file storage system.

For this article, we'll keep all our data (MySQL database files and WordPress files) in the same container. You can, of course, separate them into two data volume containers named something like dbdata and webdata.

Data volume image

Our data volume image does not need anything other than a working filesystem that we can read and write to. That's why our base image of choice will be BusyBox. This is how BusyBox describes itself:

"BusyBox combines tiny versions of many common UNIX utilities into a single small executable. It provides replacements for most of the utilities you usually find in GNU fileutils, shellutils, etc. The utilities in BusyBox generally have fewer options than their full-featured GNU cousins; however, the options that are included provide the expected functionality and behave very much like their GNU counterparts. BusyBox provides a fairly complete environment for any small or embedded system."

That sounds great! We'll go ahead and add this to our Dockerfile:

FROM busybox:latest

Exposing mount points

There is a VOLUME instruction for the Dockerfile, where you can define which directories to expose to other containers when this data volume container is added using the --volumes-from option. In our data volume container, we first need to add a directory for MySQL data. Let's take a look inside the MySQL image we will be using to see which directory is used for the data storage, and expose that directory in our data volume container so that we can own it:

RUN mkdir -p /var/lib/mysql
VOLUME ["/var/lib/mysql"]

We also want our WordPress installation in this container, including all .php files and graphic images. Once again, we go to the image we will be using and find out which directory will be used. In this case, it's /var/www/html. When you add this to the Dockerfile, don't add new RUN and VOLUME lines; just extend the existing lines that handle the MySQL data directory:

RUN mkdir -p /var/lib/mysql && mkdir -p /var/www/html
VOLUME ["/var/lib/mysql", "/var/www/html"]

The Dockerfile

The following is a simple Dockerfile for the data image:

FROM busybox:latest
MAINTAINER Oskar Hane <oh@oskarhane.com>
RUN mkdir -p /var/lib/mysql && mkdir -p /var/www/html
VOLUME ["/var/lib/mysql", "/var/www/html"]

And that's it! When publishing images to the Docker Registry Hub, it's good to include a MAINTAINER instruction in the Dockerfile so that anyone who needs to contact you about the image can do so.

Host on GitHub

With what we know about hosting Docker image sources on GitHub and publishing images on the Docker Registry Hub, creating our data volume image will be no problem.

Let's create a branch and a Dockerfile and add the content for our data volume image:

git checkout -b data
vi Dockerfile
git add Dockerfile

On line number 2 in the preceding code, you can use the text editor of your choice. I just happen to find that vi suits my needs. The content you should add to the Dockerfile is this:

FROM busybox:latest
MAINTAINER Oskar Hane <oh@oskarhane.com>
RUN mkdir -p /var/lib/mysql && mkdir -p /var/www/html
VOLUME ["/var/lib/mysql", "/var/www/html"]

Replace the maintainer information with your name and e-mail.

You can, and should, always ensure that it works before committing and pushing to GitHub. To do so, you need to build a Docker image from your Dockerfile:

docker build -t data-test .

Make sure you notice the dot at the end of the line, which means that Docker should look for a Dockerfile in the current directory. Docker will try to build an image from the instructions in our Dockerfile. It should be pretty fast, since it's a small base image and there's nothing but a couple of VOLUME instructions on top of it.

The screenshot is as follows:

[Screenshot: output of the docker build command]
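Beyond watching the build finish, one way to check that the image really declares both mount points is to inspect its configuration. The sketch below wraps docker inspect in a small helper; the function name show_volumes and the DOCKER dry-run variable are assumptions made for illustration.

```shell
# Hypothetical helper: print the volumes declared by an image ($1).
# DOCKER=echo turns the call into a dry run (an assumption of this sketch).
show_volumes() {
  ${DOCKER:-docker} inspect --format '{{json .Config.Volumes}}' "$1"
}
```

Calling show_volumes data-test after the build should list both /var/lib/mysql and /var/www/html.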

When everything works as we want, it's time to commit the changes and push it to our GitHub repository:

git commit -m "Dockerfile for data volume added."
git push origin data

When you have pushed it to the repository, head over to GitHub to verify that your new branch is present there.

The following screenshot shows the GitHub repository:

[Screenshot: the GitHub repository with the data branch]

Publishing on Docker Hub Registry

Now that we have our new branch on GitHub, we can go to the Docker Hub Registry and create a new automated build, named data. It will have our GitHub data branch as source.

Wait for the build to finish, and then try to pull the image with your Docker daemon to verify that it's there and it's working.

The screenshot will be as follows:

Amazing! Check out the size of the image; it's just under 2.5 MB. This is perfect since we just want to store data in it. Keep in mind that this is the size of the image only, which is read-only; a container on top of this image can, of course, grow as big as your hard drive allows.

Running a data volume container

Data volume containers are special; they can be stopped and still fulfill their purpose. Personally, I like to see all containers in use when executing the docker ps command, since I like to delete stopped containers once in a while.

This is totally up to you. If you're okay with keeping the container stopped, you can start it using this command:

docker run -d oskarhane/data true

The true argument is just there to enter a valid command, and the -d argument places the container in detached mode, running in the background.

If you want to keep the container running, you need to place a service in the foreground, like this:

docker run -d oskarhane/data tail -f /dev/null

The tail -f /dev/null command never ends, so the container will be running until we stop it. Resource-wise, the tail command is pretty harmless.

Passing parameters to containers

We have seen how to give containers parameters or environment variables when starting the official MySQL container:

docker run --name mysql-one -e MYSQL_ROOT_PASSWORD=pw -d mysql

The -e MYSQL_ROOT_PASSWORD=pw option is an example showing how you can do it. It means that the MYSQL_ROOT_PASSWORD environment variable inside the container has pw as the value.

This is a very convenient way to have configurable containers where you can have a setup script as ENTRYPOINT or a foreground script configuring passwords; hosts; test, staging, or production environments; and other settings that the container needs.
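To make this concrete, here is a minimal sketch of the kind of setup logic an ENTRYPOINT script might contain. The variable names APP_ENV and DB_HOST, their defaults, and the function name configure are hypothetical and are not used by any particular image.

```shell
# Hypothetical setup logic for an ENTRYPOINT script. APP_ENV and DB_HOST are
# illustrative names only.
configure() {
  # Fall back to defaults when no value was passed with -e.
  APP_ENV="${APP_ENV:-production}"
  DB_HOST="${DB_HOST:-localhost}"

  # Reject environments this sketch does not know about.
  case "$APP_ENV" in
    test|staging|production) ;;
    *)
      echo "Unsupported APP_ENV: $APP_ENV" >&2
      return 1
      ;;
  esac

  echo "env=$APP_ENV db=$DB_HOST"
}
```

Inside a container started with -e APP_ENV=staging, a script like this would pick up staging, while a container started without the option would fall back to production.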

Creating a parameterized image

To get the hang of this very useful feature, let's create a small Docker image that converts a string to uppercase or lowercase, depending on the state of an environment variable.

The Docker image will be based on the latest Debian distribution and will have only an ENTRYPOINT command. This is the Dockerfile:

FROM debian:latest
ADD ./case.sh /root/case.sh
RUN chmod +x /root/case.sh
ENTRYPOINT /root/case.sh

This takes the case.sh file from our current directory, adds it to the container, makes it executable, and assigns it as ENTRYPOINT.

The case.sh file may look something like this:

#!/bin/bash

# Both variables must be provided with -e when the container is started.
if [ -z "$STR" ]; then
    echo "No STR string specified."
    exit 1
fi

if [ -z "$TO_CASE" ]; then
    echo "No TO_CASE specified."
    exit 1
fi

# ${VAR^^} and ${VAR,,} are Bash 4+ case-modification expansions.
if [ "$TO_CASE" = "upper" ]; then
    echo "${STR^^}"
    exit 0
fi
if [ "$TO_CASE" = "lower" ]; then
    echo "${STR,,}"
    exit 0
fi
echo "TO_CASE was not upper or lower"
exit 1

This file checks whether the $STR and $TO_CASE environment variables are set. It then checks whether $TO_CASE is upper or lower; if it is neither, an error message saying that we only handle upper and lower is displayed.

If $TO_CASE was set to upper or lower, the content of the environment variable $STR is transformed to uppercase or lowercase respectively, and then printed to stdout.
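The ^^ and ,, expansions used in case.sh need Bash 4 or later. If the script had to run under a plain POSIX shell instead, a roughly equivalent conversion could be sketched with tr. The function name to_case and its argument order are assumptions made for this example; it can be tried on the host without Docker.

```shell
# Hypothetical POSIX-shell variant of the conversion step, using tr instead of
# Bash-only case-modification expansions.
to_case() {
  # $1 = string to convert, $2 = target case ("upper" or "lower")
  case "$2" in
    upper) printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]' ;;
    lower) printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' ;;
    *)
      echo "TO_CASE was not upper or lower" >&2
      return 1
      ;;
  esac
}

to_case "My String" upper   # prints MY STRING
to_case "My String" lower   # prints my string
```

Passing the values as function arguments rather than environment variables keeps the sketch easy to test; inside the container, the same function could be called as to_case "$STR" "$TO_CASE".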

Let's try this!

Assuming the image has been built with the tag case (docker build -t case .), here are some commands we can try:

docker run -i case
docker run -i -e STR="My String" case
docker run -i -e STR="My String" -e TO_CASE=camel case
docker run -i -e STR="My String" -e TO_CASE=upper case
docker run -i -e STR="My String" -e TO_CASE=lower case

This seems to be working as expected, at least for this purpose. Now we have created a container that takes parameters and acts upon them.

Summary

In this article, you learned that you can keep your data out of your service containers using data volumes. A data volume can be a directory or file from the host's filesystem, or it can come from a data volume container.

We explored how we can pass parameters to containers and how to read them from inside ENTRYPOINT. Parameters are a great way to configure containers, making it easier to create more generalized Docker images.

We created a data volume container and published it to the Docker Hub Registry.