In this post, we will investigate one of the more complex topics when working with Docker – networking.
We have already seen in the previous post that namespaces can be used to isolate networking resources used by different containers as well as resources used by containers from those used by the host system. However, by nature, networking is not only about isolating but also about connecting – how does this work?
So let us do some tests with the httpd:alpine image. First, get a copy of that image into your local repository:
$ docker pull httpd:alpine
Then open two terminals – we will call them Container1 and Container2 – and start and attach to a container in each of them (note that you need to switch to the second terminal after entering the second command).
$ docker run --rm -d --name="Container1" httpd:alpine
$ docker exec -it Container1 "/bin/sh"
$ docker run --rm -d --name="Container2" httpd:alpine
$ docker exec -it Container2 "/bin/sh"
Next, make curl available in both containers by running apk update followed by apk add curl. Now let us switch to container 1 and inspect the network configuration.
/usr/local/apache2 # ifconfig
eth0      Link encap:Ethernet  HWaddr 02:42:AC:11:00:02
          inet addr:172.17.0.2  Bcast:172.17.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1867 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1017 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1831738 (1.7 MiB)  TX bytes:68475 (66.8 KiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
We see that there is an ethernet interface eth0 with IP address 172.17.0.2. If you do the same in the second container, you should see a similar output with a different IP address, say 172.17.0.3.
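As a shortcut, you can also obtain the address of a container from the host without attaching to it – the following command, run on the host, should print the IP address that Docker has assigned to Container2 on the default bridge network.

$ docker inspect --format '{{.NetworkSettings.IPAddress}}' Container2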
Now let us make a few connection tests. First, go to the terminal running inside container 1 and enter
curl 172.17.0.3
You should now see a short HTML snippet containing the text “It works!”. So apparently we can reach container 2 from container 1 – and similarly, you should be able to reach container 1 from container 2. Next, try the same from a terminal attached to the host – you should be able to reach both containers from there as well. Finally, if you also have a web server or a similar service running on the host, you will see that you can reach it from within the containers, too. In my case, I had a Tomcat instance bound to 0.0.0.0:8080 on the host and was able to connect to it using
curl 192.168.178.27:8080
from within the container. How does this work?
To solve the puzzle, go back to a terminal on the host system and take a look at the routing table.
$ ip route show
default via 192.168.178.1 dev enp4s0  proto static  metric 100
169.254.0.0/16 dev enp4s0  scope link  metric 1000
172.17.0.0/16 dev docker0  proto kernel  scope link  src 172.17.0.1
192.168.178.0/24 dev enp4s0  proto kernel  scope link  src 192.168.178.27  metric 100
We see that Docker has apparently added an additional routing table entry and has created an additional network device – docker0 – to which all packets with a destination in the 172.17.0.0/16 network are sent.
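We can cross-check this from the Docker side. The device docker0 belongs to Docker's default network, simply called bridge, and inspecting its IPAM configuration should print something like the following (the exact output format may differ slightly between Docker versions).

$ docker network inspect bridge --format '{{json .IPAM.Config}}'
[{"Subnet":"172.17.0.0/16","Gateway":"172.17.0.1"}]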
The device docker0 is a so-called bridge. A (software) bridge is very similar to an Ethernet bridge in hardware. It connects two or more devices – each packet that arrives on one device is forwarded to all other devices connected to the bridge. The Linux kernel offers the option to establish such a bridge in software, which does exactly the same thing.
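In fact, you do not need Docker at all to play with this mechanism. Assuming the iproute2 tools are installed on your host, the following commands (the bridge name br-test is of course arbitrary) create a bridge, bring it up, display it and remove it again.

$ sudo ip link add name br-test type bridge
$ sudo ip link set br-test up
$ ip link show br-test
$ sudo ip link del br-test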
Let us list all existing bridges using the brctl utility.
$ brctl show
bridge name	bridge id		STP enabled	interfaces
docker0		8000.0242fd67b17b	no		veth1f8d78c
							vetha1692e1
Let us compare this with the output of the ifconfig command (again on the host). The corresponding output is
veth1f8d78c Link encap:Ethernet  HWaddr 56:e5:92:e8:77:1e
          inet6 addr: fe80::54e5:92ff:fee8:771e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1034 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1964 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:70072 (70.0 KB)  TX bytes:1852757 (1.8 MB)

vetha1692e1 Link encap:Ethernet  HWaddr ca:73:3e:20:36:f7
          inet6 addr: fe80::c873:3eff:fe20:36f7/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1133 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2033 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:78715 (78.7 KB)  TX bytes:1860499 (1.8 MB)
These two devices also appear in the output of brctl show – they are the two devices connected to the bridge. Devices of this type are called virtual ethernet (veth) devices. They are always created in pairs and act like a pipe: traffic that enters one of the two devices comes out at the other device and vice versa – like a virtual network cable connecting the two endpoints.
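Again, this is something you can easily try out yourself. The commands below (with arbitrarily chosen names) create such a pair, list all veth devices and delete the pair again – removing one endpoint automatically removes its peer as well.

$ sudo ip link add veth-test-a type veth peer name veth-test-b
$ ip link show type veth
$ sudo ip link del veth-test-a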
We just said that virtual ethernet devices are always created in pairs. We have two containers, and if you start them one by one and look at the output of ifconfig, you will see that each of the two containers contributes one device. Can that be correct? After starting the first container, we should already see a pair, and after starting the second one, we should see four devices instead of just two. So one device in each pair is missing. Where did they go?
The answer is that Docker did create these devices, but moved them into the network namespaces of the containers, so that they are no longer visible on the host. To verify this, we can trace the connection between the two endpoints of each pair as follows. First, enter ip link on the host. The output should look similar to the following lines.
$ ip link
1: lo: mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp4s0: mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 1c:6f:65:c0:c9:85 brd ff:ff:ff:ff:ff:ff
3: docker0: mtu 1500 qdisc noqueue state UP mode DEFAULT group default
    link/ether 02:42:fd:67:b1:7b brd ff:ff:ff:ff:ff:ff
25: vetha1692e1@if24: mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default
    link/ether ca:73:3e:20:36:f7 brd ff:ff:ff:ff:ff:ff link-netnsid 0
27: veth1f8d78c@if26: mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default
    link/ether 56:e5:92:e8:77:1e brd ff:ff:ff:ff:ff:ff link-netnsid 1
The very first number in each line is the so-called ifindex, which is a unique identifier for each network device within the kernel. In our case, the virtual ethernet devices visible in the host namespace have the indices 25 and 27. After each name, following the “at” symbol, you see a second number – 24 and 26. Now let us execute the same command in the first container.
/usr/local/apache2 # ip link
1: lo: mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
24: eth0@if25: mtu 1500 qdisc noqueue state UP
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff
Here, suddenly, the device with index 24 appears, and we see that it is connected to the device with index 25 – the device displayed as vetha1692e1 in the host namespace! Similarly, we find the device with index 26 inside container two:
/usr/local/apache2 # ip link
1: lo: mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
26: eth0@if27: mtu 1500 qdisc noqueue state UP
    link/ether 02:42:ac:11:00:03 brd ff:ff:ff:ff:ff:ff
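As a side remark, you do not even need to attach to a container to peek into its network namespace. If the nsenter utility (part of util-linux) is available on your host, you can determine the PID of the container's main process and run ip link inside the container's network namespace directly from the host – the output should match what we have just seen (the shell variable pid is only used for illustration).

$ pid=$(docker inspect --format '{{.State.Pid}}' Container1)
$ sudo nsenter -t $pid -n ip link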
This now gives us a rather complete picture of the involved devices. Ignoring the loopback devices for a moment, the following picture emerges.
Now we can understand what happens when an application in container 1 sends a packet to container 2. First, the kernel will inspect the routing table in container 1. This table looks as follows.
/usr/local/apache2 # ip route
default via 172.17.0.1 dev eth0
172.17.0.0/16 dev eth0  src 172.17.0.2
So the kernel will determine that the packet should be sent to the interface known as eth0 in container 1 – this is the interface with the unique index 24. As this is part of a virtual ethernet device pair, it appears on the other side of the pair, i.e. at the device vetha1692e1. This device in turn is connected to the bridge docker0. Being a bridge, it will distribute the packet to all other attached devices, so it will reach veth1f8d78c. This is now one endpoint of the second virtual ethernet device pair, and so the packet will finally end up at the interface with the unique index 26, i.e. the interface that is called eth0 in container 2. On this interface, the HTTP daemon is listening, receives the message, prepares an answer and that answer goes the same way back to container 1.
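As an aside, the bridge does not keep flooding every packet to all ports forever – like its hardware counterpart, it learns which MAC address is reachable via which port. If you are curious, you can display this forwarding table on the host; the non-local MAC addresses listed there should be exactly those of the eth0 interfaces in our two containers.

$ brctl showmacs docker0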
Thus, effectively, it appears from inside the containers as if all container network interfaces were attached to the same network segment. To complete the picture, we can actually follow the trace of a packet travelling from container 1 to container 2 using arp and traceroute.
/usr/local/apache2 # arp
? (172.17.0.1) at 02:42:fd:67:b1:7b [ether]  on eth0
? (172.17.0.3) at 02:42:ac:11:00:03 [ether]  on eth0
/usr/local/apache2 # traceroute 172.17.0.3
traceroute to 172.17.0.3 (172.17.0.3), 30 hops max, 46 byte packets
 1  172.17.0.3 (172.17.0.3)  0.016 ms  0.013 ms  0.010 ms
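If you want to see these packets in flight, you can also capture the traffic on the bridge from the host while repeating the curl from container 1 – this assumes that tcpdump is installed on the host.

$ sudo tcpdump -i docker0 -n 'tcp port 80'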
We can now understand how applications in different containers communicate with each other. However, what we have discussed so far is not yet sufficient to explain how an application on the host can access an HTTP server running inside a container, nor what additional setup we need to reach an HTTP server running in a container from the LAN. This will be the topic of my next post.