NOTE: I've written a new post that updates this post with a better implementation. However, the information here is still valid on why you want to run Traefik on worker nodes. So, read this post first and jump over to the new one.
Traefik (traefik.io) is a fantastic tool and one I've used on many projects. It just works really well and is easy to configure. In Docker mode, it listens to events and automatically reconfigures itself to allow traffic to be sent to new services and/or containers. Deploying a microservice-based application is a breeze.
However, in order for it to listen, you often see Docker Compose files looking like this…
    version: "3.5"
    services:
      traefik:
        image: traefik:latest
        command: --docker --docker.watch
        ports:
          - 80:80
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock
While this works just fine when running locally, it's a terrible idea when running it in a Swarm cluster. Why? In order to hear Swarm events, Traefik has to have access to a manager node (which means a placement constraint to ensure this). This means all of your cluster traffic will run through a manager node!
This is where socat comes in. Per the man page, "socat is a command line based utility that establishes two bidirectional byte streams and transfers data between them." Using this utility, we can "upgrade" the Docker socket (a Unix socket) to a TCP socket. Then, services can connect to the Docker socket using plain TCP from remote locations.
If we run socat in a container that has the Docker socket mounted, we can make the Docker socket available to any other containers on the same network. If you're using Docker EE, you can further secure the network by limiting who can access it by putting it into its own collection.
Why use socat rather than just enabling remote connections on the engine socket? Great question! By doing this, we can leverage Swarm's DNS-based service discovery (so we don't have to look up where the managers are located), and we can use network isolation to limit who can access the socket.
The Stack File
The following stack file will add the socat service and update Traefik to use the new service for its Docker endpoint.
    version: "3.6"
    services:
      socat:
        image: alpine/socat
        command: tcp-listen:2375,fork,reuseaddr unix-connect:/var/run/docker.sock
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock
        networks:
          - mgmt
        deploy:
          placement:
            constraints:
              - node.role == manager
      traefik:
        image: traefik:latest
        command: --docker --docker.endpoint=tcp://socat:2375 --docker.watch --docker.swarmMode
        ports:
          - 80:80
        networks:
          - mgmt
          - app-entry
        deploy:
          placement:
            constraints:
              - node.role == worker
    networks:
      mgmt:
        external: true
      app-entry:
        external: true
A couple of things to note…
- The Traefik service is configured with a docker.endpoint of tcp://socat:2375. Remember that with Docker's DNS-based service discovery, this will resolve to the socat service.
- There are two networks, which you'll notice are defined externally. I do this so that 1) they have exact names (rather than having a project prefix added to them) and 2) it's easier for other services to connect to them (this is a reverse proxy, after all). The app-entry network is used to communicate from Traefik to any other service (example below).
Deploying a Service
Now that we have the proxy stack, let's deploy a simple app. We'll use the ridiculous mikesir87/cats image:
    version: "3.6"
    services:
      cats:
        image: mikesir87/cats
        networks:
          - app-entry
        deploy:
          labels:
            traefik.docker.network: app-entry
            traefik.backend: cats
            traefik.frontend.rule: "Path: /"
            traefik.port: 5000
          placement:
            constraints:
              - node.role == worker
    networks:
      app-entry:
        external: true
To try it out, we'll spin up a quick Swarm cluster using Play with Docker.
- Get a quick five-node cluster (three managers and two workers) by using the templates found by clicking on the wrench icon.
- On a manager node, run the following commands:
    git clone https://github.com/mikesir87/traefik-socat-demo.git
    cd traefik-socat-demo
    docker network create --attachable --driver overlay --opt encrypted=true app-entry
    docker network create --attachable --driver overlay --opt encrypted=true mgmt
    docker stack deploy -c proxy-stack.yml proxy
    docker stack deploy -c app-stack.yml cats
Wait for everything to start, then open the badge for port 80. You should see some cats now!
For kicks, you can also run this command to get a quick swarm visualizer:
    docker service create \
      --constraint 'node.role == manager' \
      --mount type=bind,source=/var/run/docker.sock,destination=/var/run/docker.sock \
      --publish 3000:3000 \
      mikesir87/swarm-viz
(Yes… this runs on a manager node because I don't have it configurable to connect to a TCP socket yet. Doh!)
Wait for that to launch, and open the badge for port 3000 and you should see the Swarm with only the visualizer and socat on the manager node, with everything else on worker nodes, including Traefik!
While this works, there are a few obvious next steps to explore. Have any others to add? Feel free to comment and ask below.
- We could run the socat service in global mode so that an instance runs on every manager node, hopefully spreading the work out more than it is right now.
- We could still secure the Docker socket by setting up cert auth.
- We could run multiple replicas of Traefik to spread the load across the cluster, or even consider running that as a global service too.
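The first and third ideas can be sketched as a small change to the proxy stack. Treat this as an untested variation on the stack file from earlier rather than a finished setup: it keeps the same Traefik 1.x flags, networks, and images, and the replica count of 2 is an arbitrary number picked for illustration.

```yaml
version: "3.6"

services:
  socat:
    image: alpine/socat
    command: tcp-listen:2375,fork,reuseaddr unix-connect:/var/run/docker.sock
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    networks:
      - mgmt
    deploy:
      # Global mode plus the manager constraint: one socat task on every manager.
      # New connections to socat:2375 are then distributed across those tasks.
      mode: global
      placement:
        constraints:
          - node.role == manager

  traefik:
    image: traefik:latest
    command: --docker --docker.endpoint=tcp://socat:2375 --docker.watch --docker.swarmMode
    ports:
      - 80:80
    networks:
      - mgmt
      - app-entry
    deploy:
      # Assumption: 2 is an arbitrary example value.
      # Swap this line for "mode: global" to run one Traefik task per worker instead.
      replicas: 2
      placement:
        constraints:
          - node.role == worker

networks:
  mgmt:
    external: true
  app-entry:
    external: true
```

Each Traefik task holds its own connection to the Docker API, so how evenly the load lands on the managers depends on how those few long-lived connections happen to be distributed.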