
Running Traefik on Worker Nodes More Securely

Note: This post is an updated version of Letting Traefik run on Worker Nodes. That post explains why we want to run on worker nodes, so I won't repeat that here.

The approach in my previous post has a major shortcoming: the Docker socket grants far more access than most applications need. Hypothetically, if Traefik were compromised, the attacker would have access to the Docker socket, and with it control over the entire cluster. That sounds pretty bad. Let's change that.

Our Solution

Instead of exposing the Docker socket directly to the container (even via the use of socat), we are going to use a proxy. The proxy I've been looking at is the docker-socket-proxy project. Basically, it's HAProxy with a custom config. This proxy allows us to whitelist the operations that any application should have access to.

When configuring the proxy, you use environment variables to whitelist the available operations for clients. By default, the events, ping, and version endpoints are whitelisted. When running in Swarm mode, Traefik needs to inspect the services, tasks, and networks. So, we will enable those.

One extra tidbit… all POST requests are denied unless you explicitly enable them. So by enabling SERVICES, TASKS, and NETWORKS, we're only authorizing read-only access to those endpoints. Awesome!
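To make those two rules concrete, here's a small Python sketch of the decision logic as described above: reject anything that isn't a read unless POST is enabled, then allow only paths whose section is whitelisted. It's a toy model for illustration, not the proxy's actual HAProxy config. The `is_allowed` function, the path pattern, and the example paths are my own assumptions.

```python
import re

# Sections the post says are enabled by default, plus the three we turn on.
DEFAULT_ENABLED = {"events", "ping", "version"}
ENABLED = DEFAULT_ENABLED | {"services", "tasks", "networks"}


def is_allowed(method, path, enabled, post_enabled=False):
    """Return True if this toy model would let the request through."""
    # Rule 1: anything other than a read is rejected unless POST is enabled.
    if method.upper() not in ("GET", "HEAD") and not post_enabled:
        return False
    # Rule 2: the first path segment (after an optional /v1.xx version
    # prefix) must name an enabled section. "/_ping" maps to "ping".
    match = re.match(r"^(?:/v[\d.]+)?/([a-z_]+)", path)
    if match is None:
        return False
    section = match.group(1).lstrip("_")
    return section in enabled
```

With the settings used in this post, `is_allowed("GET", "/v1.37/services", ENABLED)` is true, while a `POST` to `/v1.37/services/create` and a `GET` to `/containers/json` are both rejected.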

So… let's deploy it!

version: "3.6"

services:
  # Runs on a manager, where the Swarm API lives, and exposes a filtered view of it
  socket-proxy:
    image: tecnativa/docker-socket-proxy
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      # Read-only access to the endpoints Traefik needs in Swarm mode
      SERVICES: 1
      TASKS: 1
      NETWORKS: 1
    networks:
      - mgmt
    deploy:
      placement:
        constraints:
          - node.role == manager

  traefik:
    # Pinned to 1.x because the --docker.* flags below are Traefik v1 syntax
    image: traefik:1.7
    command: --docker --docker.endpoint=tcp://socket-proxy:2375 --docker.watch --docker.swarmMode
    ports:
      - "80:80"
    networks:
      - mgmt
      - app-entry
    deploy:
      placement:
        constraints:
          - node.role == worker

networks:
  mgmt:
    name: mgmt
  app-entry:
    name: app-entry

Notice that this is pretty much the same as what I had in the previous post. Apart from putting the proxy in front of the Docker socket, the only other change is to Traefik's Docker endpoint, since the service name changed.

But now we have an extra layer of security in place! Even if our Traefik container were compromised, any attempt to create, update, or remove a service through the proxy would get a 403 Forbidden response. Layered security for the win!
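If you want to see this for yourself, here's a minimal Python sketch that sends a request through the proxy and reports the status code. The `probe` helper is hypothetical. It assumes you run it from a container attached to the mgmt network, so that the `socket-proxy` hostname resolves, and it uses the unversioned Docker API paths `/services` and `/services/create`.

```python
import urllib.error
import urllib.request


def probe(base_url, method, path):
    """Send one request to the proxy and return the HTTP status code."""
    request = urllib.request.Request(base_url + path, method=method)
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is what we want.
        return err.code


# Hypothetical usage from a container on the mgmt network:
#   probe("http://socket-proxy:2375", "GET", "/services")          # a whitelisted read
#   probe("http://socket-proxy:2375", "POST", "/services/create")  # should be the 403 described above
```

A read against a whitelisted endpoint should come back successfully, while the write should be refused before it ever reaches the Docker socket.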

Comments or questions? Comment below!