In this article, Sreenivas Makam, author of the book Mastering CoreOS, explains how microservices have increased the need for a large number of Containers as well as connectivity between Containers across hosts. A robust Container networking scheme is necessary to achieve this goal. This article covers the basics of Container networking, with a focus on how CoreOS does Container networking with Flannel.
Container networking basics
The following are the reasons why we need Container networking:
- Containers need to talk to the external world.
- Containers should be reachable from the external world so that the external world can use the services that Containers provide.
- Containers need to talk to the host machine. An example can be sharing volumes.
- There should be inter-container connectivity in the same host and across hosts. An example is a WordPress container in one host talking to a MySQL container in another host.
Multiple solutions are currently available to interconnect Containers. These solutions are fairly new and under active development. Until release 1.8, Docker did not have a native solution to interconnect Containers across hosts. Docker release 1.9 introduced a Libnetwork-based solution that interconnects containers across hosts and also performs service discovery. CoreOS uses Flannel for Container networking in CoreOS clusters. Projects such as Weave and Calico are also developing Container networking solutions, and they aim to serve as networking plugins for any Container runtime, such as Docker or rkt.
Flannel
Flannel is an open source project that provides a Container networking solution for CoreOS clusters. Flannel can also be used for non-CoreOS clusters; Kubernetes, for example, uses Flannel to set up networking between Kubernetes pods. Flannel allocates a separate subnet to every host where a Container runs, and the Containers on that host are allocated individual IP addresses from the host's subnet. An overlay network set up between the hosts allows Containers on different hosts to talk to each other. Chapter 1, CoreOS Overview, covered the Flannel control and data path; this section delves into the Flannel internals.
Manual installation
Flannel can be installed manually or using the systemd unit, flanneld.service. The following commands install Flannel on a CoreOS node by using a container to build the Flannel binary. The flanneld binary will be available in /home/core/flannel/bin after the commands complete:
git clone https://github.com/coreos/flannel.git
docker run -v /home/core/flannel:/opt/flannel -i -t google/golang /bin/bash -c "cd /opt/flannel && ./build"
The following is the Flannel version after we build flannel in our CoreOS node:
Installation using flanneld.service
Flannel is not installed by default in CoreOS; this keeps the CoreOS image size to a minimum. Docker requires Flannel to configure the network, and Flannel requires Docker to download the Flannel container. To break this chicken-and-egg problem, CoreOS starts early-docker.service by default, whose sole purpose is to download and start the Flannel container. The regular docker.service then starts the Docker daemon with the Flannel network.
The following image shows the sequence in flanneld.service, where the early Docker daemon starts the Flannel container, which, in turn, starts docker.service with the subnet created by Flannel:
The following is the relevant section of flanneld.service that downloads the flannel container from the Quay repository:
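As a hedged sketch (the exact image tag, flags, and paths vary by CoreOS release, so treat this as illustrative rather than the literal unit file), the relevant lines look roughly like this:

```ini
[Service]
# Talk to the early Docker daemon instead of the regular one
Environment="DOCKER_HOST=unix:///var/run/early-docker.sock"
# Pull the flannel image from Quay, then run flanneld inside it
ExecStartPre=/usr/bin/docker pull quay.io/coreos/flannel:${FLANNEL_VER}
ExecStart=/usr/bin/docker run --net=host --privileged \
    -v /run/flannel:/run/flannel \
    quay.io/coreos/flannel:${FLANNEL_VER} /opt/bin/flanneld --ip-masq=true
```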
The following output shows the containers managed by early Docker; early Docker manages only Flannel:
The following is the relevant section of flanneld.service that updates the docker options to use the subnet created by flannel:
The following is the content of flannel_docker_opts.env, in my case, after Flannel was started. The address 10.1.60.1/24 was chosen by this CoreOS node for its containers:
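As a sketch of what such an environment file contains (the variable names follow the output of flannel's mk-docker-opts.sh helper; the bridge IP comes from the node's subnet, and the MTU value shown is merely typical for VXLAN):

```
DOCKER_OPT_BIP="--bip=10.1.60.1/24"
DOCKER_OPT_IPMASQ="--ip-masq=false"
DOCKER_OPT_MTU="--mtu=1450"
```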
Docker will be started as part of docker.service, as shown in the following image, with the preceding environment file:
There is no central controller in flannel, and it uses etcd for internode communication. Each node in the CoreOS cluster runs a flannel agent and they communicate with each other using etcd.
As part of starting the Flannel service, we specify the Flannel subnet that can be used by the individual nodes in the network. This subnet is registered with etcd so that every CoreOS node in the cluster can see it. Each node in the network picks a particular subnet range and registers atomically with etcd.
The following is the relevant section of cloud-config that starts flanneld.service along with specifying the configuration for Flannel. Here, we have specified the subnet to be used for flannel as 10.1.0.0/16 along with the encapsulation type as vxlan:
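A hedged sketch of such a cloud-config section follows (the drop-in name is a convention, and the etcdctl invocation mirrors the pattern in the CoreOS documentation; adjust it for your cluster):

```yaml
#cloud-config
coreos:
  units:
    - name: flanneld.service
      command: start
      drop-ins:
        - name: 50-network-config.conf
          content: |
            [Service]
            ExecStartPre=/usr/bin/etcdctl set /coreos.com/network/config \
              '{"Network": "10.1.0.0/16", "Backend": {"Type": "vxlan"}}'
```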
The preceding configuration will create the following etcd key as seen in the node. This shows that 10.1.0.0/16 is allocated for flannel to be used across the CoreOS cluster and that the encapsulation type is vxlan:
Once each node gets a subnet, containers started in this node will get an IP address from the IP address pool allocated to the node. The following is the etcd subnet allocation per node. As we can see, all the subnets are in the 10.1.0.0/16 range that was configured earlier with etcd and with a 24-bit mask. The subnet length per host can also be controlled as a flannel configuration option:
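To see why the per-node subnets look like this: with a 10.1.0.0/16 network and Flannel's default SubnetLen of 24, each node reserves one of 256 possible /24 blocks. A quick shell sketch of that arithmetic:

```shell
network="10.1.0.0/16"   # the Network value registered in etcd
subnet_len=24           # flannel's default SubnetLen
prefix_len=${network#*/}
# Each node claims one /24, so 2^(24-16) = 256 nodes can get a subnet
count=$(( 1 << (subnet_len - prefix_len) ))
echo "$count per-host subnets available in $network"
```

Setting SubnetLen in the Flannel configuration trades off cluster size (number of hosts) against the number of container addresses per host.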
Let’s look at ifconfig of the Flannel interface created in this node. The IP address is in the address range of 10.1.0.0/16:
Flannel encapsulates packets using the overlay protocol specified in the Flannel configuration, while the Linux bridge on each host switches traffic between local containers. Together, these provide connectivity between containers on the same host as well as across hosts.
The following are the major backends currently supported by Flannel and specified in the JSON configuration file. The JSON configuration file can be specified in the Flannel section of cloud-config:
UDP: In UDP encapsulation, packets from containers are encapsulated in UDP with the default port number 8285. We can change the port number if needed.
VXLAN: From an encapsulation overhead perspective, VXLAN is efficient when compared to UDP. By default, port 8472 is used for VXLAN encapsulation. If we want to use an IANA-allocated VXLAN port, we need to specify the port field as 4789.
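For example, to make Flannel use the IANA-allocated VXLAN port instead of its default, the JSON configuration would look roughly like this (the Port field is optional):

```json
{
  "Network": "10.1.0.0/16",
  "Backend": {
    "Type": "vxlan",
    "Port": 4789
  }
}
```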
AWS-VPC: This is applicable when using Flannel in an AWS VPC. Instead of encapsulating packets in an overlay, this approach uses the VPC route table to route traffic between containers. AWS limits each VPC route table to 50 entries, which can become a problem in bigger clusters.
The following is an example of specifying the AWS type in the flannel configuration:
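A hedged sketch of such a configuration (the RouteTableID value here is a placeholder; substitute the ID of your VPC's route table):

```json
{
  "Network": "10.1.0.0/16",
  "Backend": {
    "Type": "aws-vpc",
    "RouteTableID": "rtb-xxxxxxxx"
  }
}
```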
GCE: This is applicable when using Flannel in the GCE cloud. Instead of encapsulating packets in an overlay, this approach uses the GCE route table to route traffic between containers. GCE limits each route table to 100 entries, which can become a problem in bigger clusters.
The following is an example of specifying the GCE type in the Flannel configuration:
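A hedged sketch of the GCE variant, which needs only the backend type:

```json
{
  "Network": "10.1.0.0/16",
  "Backend": {
    "Type": "gce"
  }
}
```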
Let’s create containers in two different hosts with a VXLAN encapsulation and check whether the connectivity is fine. The following example uses a Vagrant CoreOS cluster with the Flannel service enabled.
Let's start a busybox container on the first host:
Let’s check the IP address allotted to the container. This IP address comes from the IP pool allocated to this CoreOS node by the flannel agent. 10.1.19.0/24 was allocated to host 1 and this container got the 10.1.19.2 address:
Let's start a busybox container on the second host:
Let’s check the IP address allotted to this container. This IP address comes from the IP pool allocated to this CoreOS node by the flannel agent. 10.1.1.0/24 was allocated to host 2 and this container got the 10.1.1.2 address:
The following output shows you the ping being successful between container 1 and container 2. This ping packet is travelling across the two CoreOS nodes and is encapsulated using VXLAN:
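As a hedged recap of the steps above (the container names are illustrative, and the allocated addresses will differ on your cluster), the session on the two hosts looks roughly like this:

```shell
# On host 1 (subnet 10.1.19.0/24): start a busybox container and read its IP
docker run -d --name busybox1 busybox sleep 3600
docker exec busybox1 ip -4 addr show eth0    # expect an address such as 10.1.19.2

# On host 2 (subnet 10.1.1.0/24): start another busybox container
docker run -d --name busybox2 busybox sleep 3600
docker exec busybox2 ip -4 addr show eth0    # expect an address such as 10.1.1.2

# From container 1, ping container 2 across the VXLAN overlay
docker exec busybox1 ping -c 3 10.1.1.2
```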
Flannel as a CNI plugin
As explained in Chapter 1, CoreOS Overview, APPC defines a Container specification that any Container runtime can use. For Container networking, APPC defines the Container Network Interface (CNI) specification. With CNI, Container networking functionality can be implemented as a plugin. CNI expects plugins to support APIs with a defined set of parameters and leaves the implementation to the plugin. Example APIs include adding a container to a network and removing a container from a network, each with a defined parameter list. This allows different vendors to implement network plugins and also lets plugins be reused across different Container runtimes. The following image shows the relationship between the rkt Container runtime, the CNI layer, and a plugin such as Flannel. The IPAM plugin, used to allocate an IP address to the containers, is nested inside the main networking plugin:
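For reference, a CNI plugin such as Flannel is selected through a small JSON network configuration file; a hedged sketch (the network name is arbitrary, and the delegate section is handed to the nested plugin that sets up the bridge and IPAM):

```json
{
  "name": "containernet",
  "type": "flannel",
  "delegate": {
    "isDefaultGateway": true
  }
}
```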
In this article, we covered different Container networking technologies with a focus on Container networking in CoreOS. Many companies are working to solve the Container networking problem.