Docker II - Serving multiple virtual hosts from one Docker host (multiplexing)
The Basics
There's a previous article you may want to read, "Docker I", about how to serve a single, simple website; this article assumes you can already deploy a simple site container in isolation. Here we're going to cover a setup for serving different virtual hosts from various docker containers, and the automation of such a setup.
The Docker host may be an individual machine or a Docker swarm (referred to henceforth simply as the '[Docker] host').
Our solution here will be to use HAProxy (High Availability Proxy), a software load balancer, to multiplex the different docker containers. In my view it is simpler, and possibly faster and more stable, than using NGINX, which is a fully-fledged webserver; HAProxy was born to do this sort of thing.
Prerequisites:
- You should be familiar with how to create some simple Docker image containers that run a website as per part I.
- An understanding of basic HTTP binding and what port forwarding is (SSL basics might be handy to know too).
To get the desired results, the docker containers must have:
- Been built with exposed ports (usually `EXPOSE 80` is sufficient, as we will assume any SSL will be terminated at HAProxy); this example covers only port 80/HTTP.
- Defined, at run time, the environment variable VIRTUAL_HOST (e.g. `-e VIRTUAL_HOST=mysite.com`) to be the domain you expect the site to be served on (forget about the `www` subdomain for now; using `hdr_end` in the HAProxy config for this example negates the need to handle it, and I'll update my tools and article later).
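The `hdr_end` matching mentioned above is just a case-insensitive suffix check on the Host header, which is why the bare domain and its www subdomain both hit the same rule. A minimal Python sketch of that logic (the function name is my own, for illustration):

```python
def hdr_end_match(host: str, suffix: str) -> bool:
    """Case-insensitive suffix match, mimicking HAProxy's `hdr_end(host) -i`."""
    return host.lower().endswith(suffix.lower())

# Both the bare domain and the www subdomain match the same rule:
print(hdr_end_match("mysite.com", "mysite.com"))      # True
print(hdr_end_match("www.mysite.com", "mysite.com"))  # True
print(hdr_end_match("othersite.com", "mysite.com"))   # False
```

Note that a plain suffix match is loose (`notmysite.com` would also match); it is good enough for this example but worth remembering.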
Rationale
Why would it be useful to have a universal and automated setup?
- Universal: Being able to deploy any docker image to any host without any configuration changes. Each container must therefore know nothing about the outside world and should assume itself to be the default site; ergo each container's NGINX (in our case) is the default site, speaking plain HTTP on port 80.
- Automated: To allow scalable and/or rapid deployment; we do not want to spend our time adjusting and checking configs when we add, remove, or update a site.
One thing to rule them all, and bind them
As you may be aware, only one process can bind to port 80 (or 443) on a given interface. To keep things amazingly simple, docker containers that host websites typically just assume themselves to be the only site and will, by default, bind to port 80 (or at least what they think is port 80, which we'll get to).
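That one-process-per-port rule is easy to demonstrate: a second bind to an address/port already held by a listening socket is refused by the OS with EADDRINUSE. A small self-contained Python sketch:

```python
import socket

# First socket binds to an ephemeral port and listens.
s1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s1.bind(("127.0.0.1", 0))   # port 0 = let the OS pick a free port
port = s1.getsockname()[1]
s1.listen()

# A second bind to the same address/port fails with EADDRINUSE.
s2 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    s2.bind(("127.0.0.1", port))
    bound_twice = True
except OSError:
    bound_twice = False

print(bound_twice)  # False
s1.close()
s2.close()
```

This is exactly why the containers' "port 80" has to be an illusion maintained by Docker's port mapping, and why something must sit in front on the real port 80.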
Here's what we want to be able to do:
1. [Optional] Build the docker container from a Git source.
2. Install the docker container onto one or more selected Docker hosts.
3. Allow the host machine to start serving the docker site.
4. Activate any DNS records required to publicly serve the site from this docker host.
In this article we will cover 2 and 3.
- I will cover 1 in another article about CI integration with docker (for now you can just use images you've built manually).
- I will either cover 4 in a separate article on using CloudFlare's API, or extend this one by revision (for now just assign your nameservers manually).
The Setup
Let's assume three virtual hosts (websites) in this example, which I'll carry through. You could already have them, or build them now for the purposes of a trial; they could even be identical, though it might help to have different content in each (you could start with the same image and edit /var/www/index.html with vi).
Launching
If you want to just dive right in, grab the official nginx docker image here and either use or change the default index.html of each.
If you use the standard NGINX image, it already runs nginx with `daemon off;` as its default command, so the nginx process starts implicitly without you specifying anything.
[root@syd2 ~]# docker run -d -P -e VIRTUAL_HOST=site-a.com nginx
[root@syd2 ~]# docker run -d -P -e VIRTUAL_HOST=site-b.com nginx
[root@syd2 ~]# docker run -d -P -e VIRTUAL_HOST=site-c.com nginx
If you did use the standard image without any custom content, you may want to serve a slightly different default page from each so as to confirm you're being served the expected virtual host. For the extra lazy, use the following three HTML files: Container 1, Container 2, Container 3. Installation can be done as follows:
docker exec -it <CONTAINER> /usr/bin/wget http://blog.mitchellcurrie.com -O /var/www/html/index.html
Now what?
We've got three containers running and serving websites; if you run docker ps
you can confirm this and see the ports on which they are accessible.
[root@syd2 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
e89e2d472e07 nginx "nginx" 20 hours ago Up 20 hours 0.0.0.0:32774->80/tcp romantic_sammet
73c493efd603 nginx "nginx" 23 hours ago Up 23 hours 0.0.0.0:2082->80/tcp compassionate_fermat
9c04cb95f3b9 nginx "nginx" 27 hours ago Up 27 hours 0.0.0.0:32768->80/tcp elegant_turing
Well great!
0.0.0.0:32774->80/tcp
indicates that the first container maps host port 32774 to the container's bound port 80 (where NGINX listens). Navigate to http://host_address:32774 in this example to see the first container, and so on and so forth for the others.
This is promising: all the containers can be accessed from the outside world. The only problem is they can only be reached via an IP address and a specific port. At this stage, even if you correctly configure your name servers to point to the host, you'll still need to specify the external port that docker maps.
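Those host-port numbers in the PORTS column are exactly what any automation will need later, and the column is easy to parse mechanically. A hypothetical helper (my own, not part of any Docker tooling):

```python
def parse_port_mapping(ports: str) -> tuple:
    """Parse a `docker ps` PORTS entry like '0.0.0.0:32774->80/tcp'
    into (host_port, container_port)."""
    host_part, container_part = ports.split("->")
    host_port = int(host_part.rsplit(":", 1)[1])
    container_port = int(container_part.split("/")[0])
    return host_port, container_port

print(parse_port_mapping("0.0.0.0:32774->80/tcp"))  # (32774, 80)
print(parse_port_mapping("0.0.0.0:2082->80/tcp"))   # (2082, 80)
```

In practice you would pull this information from `docker inspect` rather than scraping `docker ps` output, but the mapping itself is the same.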
At this point, we'll remember that only one process can bind to port 80, so it can't be an individual container. We could use NGINX on the host, but NGINX is a fully-fledged webserver that exposes a lot, and it should really be run in its own container, which makes things trickier.
Enter HAProxy.
HAProxy is designed as a load balancer, which is great because that's precisely the behaviour we require: the multiplexing of one or more sites to one or more endpoints. For now we will assume a group of sites, each mapping to exactly one endpoint. You could go ahead and install it right now and configure it by hand; the required configuration would look something like this:
global
    daemon
    maxconn 4096

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend http-in
    bind *:80
    acl is_site-a.com hdr_end(host) -i site-a.com
    acl is_site-b.com hdr_end(host) -i site-b.com
    acl is_site-c.com hdr_end(host) -i site-c.com
    use_backend site_site-a.com if is_site-a.com
    use_backend site_site-b.com if is_site-b.com
    use_backend site_site-c.com if is_site-c.com

backend site_site-a.com
    balance roundrobin
    option httpclose
    option forwardfor
    server se89e2d472e075b77101ef32f7ddf3027a7e0c9baf669c8f4a711cc57de8d9777 127.0.0.1:32774 maxconn 32

backend site_site-b.com
    balance roundrobin
    option httpclose
    option forwardfor
    server s73c493efd603456a906d9f4e47765fba1b45745f258d0550d97e434665520e31 127.0.0.1:2082 maxconn 32

backend site_site-c.com
    balance roundrobin
    option httpclose
    option forwardfor
    server s9c04cb95f3b9cd4093698bf83b73d8de6208c661352107c5193b57f33005a95f 127.0.0.1:32768 maxconn 32

listen admin
    bind 127.0.0.1:8080
    stats enable
After configuring haproxy.cfg, run service haproxy reload
and try navigating to all of the domains listed in turn (in our example site-a.com, site-b.com, site-c.com). If everything went according to plan you should be served with the respective content of those containers (which is why I suggested they differ in some way!).
Visualising it
It's great that it works, let's recap over what's going on here.
Consider our example: a Docker host with three containers that map predictably to three different websites (virtual hosts). Let's pretend some of them are running different application servers behind NGINX for example's sake; the mapping would look something like this:
As you can see in the diagram, all of the domain names point to the same docker host, and HAProxy, listening on port 80, forwards each request to the appropriate port depending on which domain was specified; the container's nginx/cgi software takes over from there, and the response is returned to the client seamlessly.
Automating it
That could be a lot of work depending on how many containers you have for your various sites, and even more of a pain if those sites get redeployed. I'll propose, and provide a working utility for, automating this behaviour. For now, just note how the single frontend maps to multiple backends based on the hostname and the matching port as listed above in Docker.
See my GitHub repository here for full details, but in short: the config can be fully generated from the running containers and the changes made live by executing:
sudo python3 hapconf.py --auto-write /etc/haproxy/haproxy.cfg && sudo service haproxy reload
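To give a sense of what such a tool does (this is an illustrative sketch of my own, not the actual hapconf.py), generating the frontend/backend stanzas from container data is essentially string templating over the VIRTUAL_HOST values and mapped ports:

```python
# Illustrative sketch: build an haproxy.cfg from running-container data.
# In a real tool the container list would come from `docker inspect` or the
# Docker API; here it is passed in directly as dicts.

CONFIG_HEADER = """\
global
    daemon
    maxconn 4096

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms
"""

def generate_config(containers):
    """containers: list of dicts with 'id', 'virtual_host', 'host_port'."""
    frontend = ["frontend http-in", "    bind *:80"]
    backends = []
    for c in containers:
        vhost = c["virtual_host"]
        # Each acl is defined before the use_backend that references it.
        frontend.append(f"    acl is_{vhost} hdr_end(host) -i {vhost}")
        frontend.append(f"    use_backend site_{vhost} if is_{vhost}")
        backends += [
            f"backend site_{vhost}",
            "    balance roundrobin",
            "    option httpclose",
            "    option forwardfor",
            f"    server s{c['id']} 127.0.0.1:{c['host_port']} maxconn 32",
        ]
    return CONFIG_HEADER + "\n".join(frontend + [""] + backends) + "\n"

cfg = generate_config([
    {"id": "e89e2d472e07", "virtual_host": "site-a.com", "host_port": 32774},
    {"id": "73c493efd603", "virtual_host": "site-b.com", "host_port": 2082},
])
print(cfg)
```

The real utility also writes the file in place and triggers the reload, but the core of it is no more than this mapping from containers to stanzas.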
Closing Thoughts
As you can see, getting HAProxy to multiplex our various docker containers wasn't too difficult, and using the script we can regenerate and reload the config anytime we wish, with minimal effort and minimal risk of human error.
What wasn't covered here was SSL termination, which could be done in either the host or the container. For the sake of ease, and because it's probably just fine for most people, I recommend SSL termination inside HAProxy: it gets configured with the certificates, while the containers need (and have) zero knowledge of any SSL, meaning they can be deployed anywhere without change.