
In June 2014 the first DockerCon took place to a packed house. It became clear that Docker had the right recipe to become a game changer, but one thing was missing: orchestration. Many companies were attempting to answer the question, “How do I run hundreds or thousands of containers across my infrastructure?”

A number of solutions emerged that week: Kubernetes from Google, geard from Red Hat, Fleet from CoreOS, Deis, Flynn and more. Even today there are well over 20 open source solutions to this problem, but one has emerged as an early leader: Kubernetes (kubernetes.io). Besides being built by Google, it has a few features that make it the most interesting solution: pods, labels and services. We’ll review these features in this post.

Along with much of the Docker ecosystem, Kubernetes is written in Go, open source and under heavy development. As of today, it can be deployed on GCE, Rackspace, VMware, Azure, AWS, DigitalOcean, Vagrant and others with scripts located in the official repository (https://github.com/GoogleCloudPlatform/kubernetes/tree/master/cluster). Deploying Kubernetes is generally done via SaltStack, but there are a number of deployment options for CoreOS as well.

Kubernetes Paradigms

Let’s take a look at pods, labels and services.

Pods

Pods are the primary unit that Kubernetes schedules into your cluster. A pod may consist of one or more containers. If you define more than one container, they are guaranteed to be co-located on the same host, which allows them to share local volumes and networking.

Here is an example of a pod definition with one container running a website, presumably with an application already in the image:

(These specs are from the original API, which is under heavy development and will change.)

<code - json>
{
  "id": "mysite",
  "kind": "Pod",
  "apiVersion": "v1beta1",
  "desiredState": {
    "manifest": {
      "version": "v1beta1",
      "id": "mysite",
      "containers": [{
        "name": "mysite",
        "image": "user/mysite",
        "cpu": 100,
        "ports": [{
          "containerPort": 80
        }]
      }]
    }
  },
  "labels": {
    "name": "mysite"
  }
}
</code - json>
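To illustrate the co-location point above, here is a sketch of what a two-container pod sharing a volume might look like in the same v1beta1 format. The "helper" container and the volume names here are hypothetical, and since the API is under heavy development the exact field names may shift:

<code - json>
{
  "id": "mysite-with-helper",
  "kind": "Pod",
  "apiVersion": "v1beta1",
  "desiredState": {
    "manifest": {
      "version": "v1beta1",
      "id": "mysite-with-helper",
      "volumes": [{"name": "shared-data"}],
      "containers": [{
        "name": "mysite",
        "image": "user/mysite",
        "ports": [{"containerPort": 80}],
        "volumeMounts": [{"name": "shared-data", "mountPath": "/data"}]
      }, {
        "name": "helper",
        "image": "user/helper",
        "volumeMounts": [{"name": "shared-data", "mountPath": "/data"}]
      }]
    }
  },
  "labels": {"name": "mysite"}
}
</code - json>

Both containers see the same /data directory and can reach each other over localhost, because Kubernetes guarantees they land on the same host.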

In reality you probably want more than one of these containers running, in case of a node failure or to help with load. This is where the ReplicationController paradigm comes in. It allows a user to run multiple replicas of the same pod. Data is not shared between replicas; the controller simply ensures that many instances of a pod are scheduled in the cluster.

<code - json>
{
  "id": "mysiteController",
  "kind": "ReplicationController",
  "apiVersion": "v1beta1",
  "desiredState": {
    "replicas": 2,
    "replicaSelector": {"name": "mysite"},
    "podTemplate": {
      "desiredState": {
        "manifest": {
          "version": "v1beta1",
          "id": "mysiteController",
          "containers": [{
            "name": "mysite",
            "image": "user/mysite",
            "cpu": 100,
            "ports": [{"containerPort": 80}]
          }]
        }
      },
      "labels": {"name": "mysite"}
    }
  },
  "labels": {"name": "mysite"}
}
</code - json>

In the template above we took the same pod and converted it into a ReplicationController. The "replicas" directive says that we want two of these pods running at all times. Increasing the number of containers is as simple as raising the replica value.
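For example, scaling out to five copies of the pod is just a change to the desiredState portion of the controller definition (fragment shown; the rest of the object stays the same):

<code - json>
  "desiredState": {
    "replicas": 5,
    "replicaSelector": {"name": "mysite"}
  }
</code - json>

The controller continuously reconciles toward this number, so it will also replace pods lost to a node failure.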

Labels

Conceptually, labels are similar to standard metadata tags, except that they are arbitrary key/value pairs. If you want to label your pod “environment: staging” or “name: redis-slave” or both, go right ahead. Labels are primarily used by services to build powerful internal load-balancing proxies, but they can also be used to filter output from the API.
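Using the examples from this paragraph, a pod carrying both labels would simply include a stanza like this in its definition:

<code - json>
  "labels": {
    "environment": "staging",
    "name": "redis-slave"
  }
</code - json>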

Services

Services are user-defined “load balancers” that are aware of the container locations and their labels. When a user creates a service, a proxy will be created on the Kubernetes nodes that will seamlessly proxy to any container that has the selected labels assigned.

<code - json>
{
  "id": "mysite",
  "kind": "Service",
  "apiVersion": "v1beta1",
  "port": 10000,
  "selector": {
    "name": "mysite"
  },
  "labels": {
    "name": "mysite"
  }
}
</code - json>

This basic example creates a service that listens on port 10000 and proxies to any pod that fulfills the “selector” requirement of “name: mysite”. If you have one matching container running, it will receive all of the traffic; if you have three, each will receive a share. If you grow or shrink the number of containers, the proxies will be aware and balance accordingly.
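Because the selector is just a label query, you can slice traffic however you label your pods. As a hypothetical example (the staging labels and port here are made up, not from the manifests above), a service that only routes to staging replicas of mysite might look like:

<code - json>
{
  "id": "mysite-staging",
  "kind": "Service",
  "apiVersion": "v1beta1",
  "port": 10001,
  "selector": {
    "name": "mysite",
    "environment": "staging"
  },
  "labels": {
    "name": "mysite-staging"
  }
}
</code - json>

Only pods carrying both labels would receive traffic from this service.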

Not all of these concepts are unique to Kubernetes, but it brings them together seamlessly. The future is also interesting for Kubernetes because it can act as a broker to the cloud provider for your containers. Need a static IP for a special pod? It could get that from the cloud provider. Need another server for additional resources? It could provision one and add it to the cluster.

Google Container Engine

It wasn’t a far stretch to see that if this project was successful, Google would run it as a service on their cloud. Indeed, they’ve announced Google Container Engine, based on Kubernetes (https://cloud.google.com/container-engine/). This also marks the first time Google has built a tool in the open and productized it. A successful product may mean that we see more day-one open source projects from Google, which is certainly intriguing.

AWS and Docker Orchestration

Amazon announced their container orchestration service at re:Invent. This post wouldn’t be complete without a quick comparison between the two.

Amazon allows you to co-locate multiple Docker containers on a single host, but the similarities with Kubernetes stop there. Their container service is proprietary, which isn’t a surprise. They’re using links to connect containers on the same host, but there is no mention of smart proxies inside the system. There isn’t much integration with the rest of the AWS services (e.g. load balancing) yet, but I expect that to change pretty quickly.

Summary

In this post, we touched on why Kubernetes exists, why it’s a unique leader in the pack, a bit on its paradigms and, finally, a quick comparison with the AWS EC2 Container Service. The EC2 Container Service will get a lot of attention, but in my opinion Kubernetes is the Docker orchestration technology to beat right now, especially if you value open source. If you’re wondering which direction Docker is heading, make sure to keep an eye out for Docker Host and Docker Cluster. Lastly, I hope you recognize that we are at the beginning stages of a new deployment and operational paradigm that leverages lightweight containers. Expect this space to change and evolve rapidly.

For more Docker tutorials and even more insight and analysis, visit our dedicated Docker page – find it here.

About the author

Ryan Richard is a systems architect at Rackspace with a background in automation and OpenStack. His primary role revolves around research and development of new technologies. He added the initial support for the Rackspace Cloud into the Kubernetes codebase. He can be reached at: @rackninja on Twitter.
