Docker II - Serving multiple virtual hosts from one Docker host (multiplexing)
The Basics
There's a previous article you may want to read, "Docker I", about how to serve a single, simple website; this article assumes you can already deploy a simple site container in isolation. Here we're going to cover a setup for serving different virtual hosts from various docker containers, and the automation of such a setup.
The Docker host may be an individual machine or a Docker swarm (referred to henceforth simply as the '[Docker] host').
Our solution here will be to use HAProxy (High Availability Proxy), a software load balancer, to multiplex the different docker containers. In my view it is simpler, and possibly faster and more stable, than using NGINX, which is a fully-fledged webserver; HAProxy was born to do this sort of thing.
Prerequisites:
- You should be familiar with how to create some simple Docker image containers that run a website as per part I.
- An understanding of basic HTTP binding and what port forwarding is (SSL basics might be handy to know too).
To get the desired results, the docker containers must have:
- Been built with exposed ports (usually `EXPOSE 80` is sufficient, as we will assume any SSL will be terminated at HAProxy); this example covers only port 80/HTTP.
- Defined, at run time, the environment variable VIRTUAL_HOST (e.g. `-e VIRTUAL_HOST=mysite.com`) to be the domain you expect the site to be served on (forget about the `www` subdomain for now; using `hdr_end` in the HAProxy config for this example negates the need to handle it, and I'll update my tools and article later).
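The `hdr_end` matching mentioned above is just a case-insensitive suffix check on the Host header, which is why the bare domain and its www subdomain both hit the same rule. A minimal Python sketch of that logic (the function name is my own, for illustration):

```python
def hdr_end_match(host: str, suffix: str) -> bool:
    """Case-insensitive suffix match, mimicking HAProxy's `hdr_end(host) -i`."""
    return host.lower().endswith(suffix.lower())

# Both the bare domain and the www subdomain match the same rule:
print(hdr_end_match("mysite.com", "mysite.com"))      # True
print(hdr_end_match("www.mysite.com", "mysite.com"))  # True
print(hdr_end_match("othersite.com", "mysite.com"))   # False
```

Note that a plain suffix match is loose (`notmysite.com` would also match); it is good enough for this example but worth remembering.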
Rationale
Why would it be useful to have a universal and automated setup?
- Universal: Being able to deploy any docker image to any host without any configuration changes. Each container must therefore know nothing about the outside world and should assume itself to be the default site; ergo each container's NGINX (in our case) is the default site, speaking plain HTTP on port 80.
- Automated: To allow scalable and/or rapid deployment; we do not want to spend our time adjusting and checking configs when we add, remove, or update a site.
One thing to rule them all, and bind them
As you may be aware, only one process can bind to port 80 (or 443) on a given interface. To keep things amazingly simple, docker containers that host websites typically just assume themselves to be the only site and will, by default, bind to port 80 (or at least what they think is port 80, which we'll get to).
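That one-process-per-port rule is easy to demonstrate: a second bind to an address/port already held by a listening socket is refused by the OS with EADDRINUSE. A small self-contained Python sketch:

```python
import socket

# First socket binds to an ephemeral port and listens.
s1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s1.bind(("127.0.0.1", 0))   # port 0 = let the OS pick a free port
port = s1.getsockname()[1]
s1.listen()

# A second bind to the same address/port fails with EADDRINUSE.
s2 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    s2.bind(("127.0.0.1", port))
    bound_twice = True
except OSError:
    bound_twice = False

print(bound_twice)  # False
s1.close()
s2.close()
```

This is exactly why the containers' "port 80" has to be an illusion maintained by Docker's port mapping, and why something must sit in front on the real port 80.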
Here's what we want to be able to do:
1. [Optional] Build the docker container from a Git source.
2. Install the docker container onto one or more selected Docker hosts.
3. Allow the host machine to start serving the docker site.
4. Activate any DNS records required to publicly serve the site from this docker host.
In this article we will cover 2 and 3.
- I will cover 1 in another article about CI integration with docker (for now you can just use images you've built manually).
- I will either cover 4 in a separate article on using CloudFlare's API, or extend this one by revision (for now just assign your nameservers manually).
The Setup
Let's assume three virtual hosts (websites) in this example, which I'll carry through. You could already have them, or build them now for the purposes of a trial; they could even be identical, though it might help to have different content in each (you could start with the same image and edit /var/www/index.html with vi).
Launching
If you want to just dive right in, grab the official nginx docker image here and either use or change the default index.html of each.
If you use the standard NGINX image, it already runs nginx with `daemon off;` as its default command, so the nginx process starts implicitly without you specifying anything.
[root@syd2 ~]# docker run -d -P -e VIRTUAL_HOST=site-a.com nginx
[root@syd2 ~]# docker run -d -P -e VIRTUAL_HOST=site-b.com nginx
[root@syd2 ~]# docker run -d -P -e VIRTUAL_HOST=site-c.com nginx
If you did use the standard image without any custom content, you may want to serve a slightly different default page from each so as to confirm you're being served the expected virtual host. For the extra lazy, use the following three HTML files: Container 1, Container 2, Container 3. Installation can be done as follows:
docker exec -it <CONTAINER> /usr/bin/wget http://blog.mitchellcurrie.com -O /var/www/html/index.html
Now what?
We've got three containers running and serving websites; if you run docker ps
you can confirm this and see the ports on which they are accessible.
[root@syd2 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
e89e2d472e07 nginx "nginx" 20 hours ago Up 20 hours 0.0.0.0:32774->80/tcp romantic_sammet
73c493efd603 nginx "nginx" 23 hours ago Up 23 hours 0.0.0.0:2082->80/tcp compassionate_fermat
9c04cb95f3b9 nginx "nginx" 27 hours ago Up 27 hours 0.0.0.0:32768->80/tcp elegant_turing
Well great!
0.0.0.0:32774->80/tcp
indicates that the first container maps host port 32774 to the container's bound port 80 (where NGINX listens). Navigate to http://host_address:32774 in this example to see the first container, and so on and so forth for the others.
This is promising: all the containers can be accessed from the outside world. The only problem is they can only be reached via an IP address and a specific port. At this stage, even if you correctly configure your name servers to point to the host, you'll still need to specify the external port that docker maps.
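Those host-port numbers in the PORTS column are exactly what any automation will need later, and the column is easy to parse mechanically. A hypothetical helper (my own, not part of any Docker tooling):

```python
def parse_port_mapping(ports: str) -> tuple:
    """Parse a `docker ps` PORTS entry like '0.0.0.0:32774->80/tcp'
    into (host_port, container_port)."""
    host_part, container_part = ports.split("->")
    host_port = int(host_part.rsplit(":", 1)[1])
    container_port = int(container_part.split("/")[0])
    return host_port, container_port

print(parse_port_mapping("0.0.0.0:32774->80/tcp"))  # (32774, 80)
print(parse_port_mapping("0.0.0.0:2082->80/tcp"))   # (2082, 80)
```

In practice you would pull this information from `docker inspect` rather than scraping `docker ps` output, but the mapping itself is the same.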
At this point, we'll remember that only one process can bind to port 80, so it can't be an individual container. We could use NGINX on the host, but NGINX is a fully-fledged webserver that exposes a lot, and it should really be run in its own container, which makes things trickier.
Enter HAProxy.
HAProxy is designed as a load balancer, which is great because that's precisely the behaviour we require: the multiplexing of one or more sites to one or more endpoints. For now we will assume a group of sites, each mapping to exactly one endpoint. You could go ahead and install it right now and configure it by hand; the required configuration would look something like this:
global
    daemon
    maxconn 4096

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend http-in
    bind *:80
    acl is_site-a.com hdr_end(host) -i site-a.com
    acl is_site-b.com hdr_end(host) -i site-b.com
    acl is_site-c.com hdr_end(host) -i site-c.com
    use_backend site_site-a.com if is_site-a.com
    use_backend site_site-b.com if is_site-b.com
    use_backend site_site-c.com if is_site-c.com

backend site_site-a.com
    balance roundrobin
    option httpclose
    option forwardfor
    server se89e2d472e075b77101ef32f7ddf3027a7e0c9baf669c8f4a711cc57de8d9777 127.0.0.1:32774 maxconn 32

backend site_site-b.com
    balance roundrobin
    option httpclose
    option forwardfor
    server s73c493efd603456a906d9f4e47765fba1b45745f258d0550d97e434665520e31 127.0.0.1:2082 maxconn 32

backend site_site-c.com
    balance roundrobin
    option httpclose
    option forwardfor
    server s9c04cb95f3b9cd4093698bf83b73d8de6208c661352107c5193b57f33005a95f 127.0.0.1:32768 maxconn 32

listen admin
    bind 127.0.0.1:8080
    stats enable
After configuring haproxy.cfg, run service haproxy reload
and try navigating to all of the domains listed in turn (in our example site-a.com, site-b.com, site-c.com). If everything went according to plan you should be served with the respective content of those containers (which is why I suggested they differ in some way!).
Visualising it
It's great that it works, let's recap over what's going on here.
Consider our example: a Docker host with three containers that map predictably to three different websites (virtual hosts). Let's pretend some of them are running different application servers behind NGINX for example's sake; the mapping would look something like this:
As you can see in the diagram, all of the domain names point to the same docker host, and HAProxy, listening on port 80, forwards each request to the appropriate port depending on which domain was specified; the container's nginx/cgi software takes over from there, and the response is returned to the client seamlessly.
Automating it
That could be a lot of work depending on how many containers you have for your various sites, and even more of a pain if those sites get redeployed. I'll propose, and provide a working utility for, automating this behaviour. For now, just note how the single frontend maps to multiple backends based on the hostname and the matching port as listed above in Docker.
See my GitHub repository here for full details, but in short: the config can be fully generated from the running containers and the changes made live by executing:
sudo python3 hapconf.py --auto-write /etc/haproxy/haproxy.cfg && sudo service haproxy reload
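To give a sense of what such a tool does (this is an illustrative sketch of my own, not the actual hapconf.py), generating the frontend/backend stanzas from container data is essentially string templating over the VIRTUAL_HOST values and mapped ports:

```python
# Illustrative sketch: build an haproxy.cfg from running-container data.
# In a real tool the container list would come from `docker inspect` or the
# Docker API; here it is passed in directly as dicts.

CONFIG_HEADER = """\
global
    daemon
    maxconn 4096

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms
"""

def generate_config(containers):
    """containers: list of dicts with 'id', 'virtual_host', 'host_port'."""
    frontend = ["frontend http-in", "    bind *:80"]
    backends = []
    for c in containers:
        vhost = c["virtual_host"]
        # Each acl is defined before the use_backend that references it.
        frontend.append(f"    acl is_{vhost} hdr_end(host) -i {vhost}")
        frontend.append(f"    use_backend site_{vhost} if is_{vhost}")
        backends += [
            f"backend site_{vhost}",
            "    balance roundrobin",
            "    option httpclose",
            "    option forwardfor",
            f"    server s{c['id']} 127.0.0.1:{c['host_port']} maxconn 32",
        ]
    return CONFIG_HEADER + "\n".join(frontend + [""] + backends) + "\n"

cfg = generate_config([
    {"id": "e89e2d472e07", "virtual_host": "site-a.com", "host_port": 32774},
    {"id": "73c493efd603", "virtual_host": "site-b.com", "host_port": 2082},
])
print(cfg)
```

The real utility also writes the file in place and triggers the reload, but the core of it is no more than this mapping from containers to stanzas.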
Closing Thoughts
As you can see, getting HAProxy to multiplex our various docker containers wasn't too difficult, and using the script we can regenerate and reload the config anytime we wish, with minimal effort and minimal risk of human error.
What wasn't covered here was SSL termination, which could be done in either the host or the container. For the sake of ease, and because it's probably just fine for most people, I recommend SSL termination inside HAProxy: it gets configured with the certificates, while the containers need (and have) zero knowledge of any SSL, meaning they can be deployed anywhere without change.