An HTTPS Proxy for Docker providing centralized configuration and caching of any registry (quay.io, DockerHub, k8s.gcr.io)
 
 

docker-registry-proxy

TL;DR

A caching proxy for Docker; allows centralised management of registries and their authentication; caches images from any registry.

What?

Created as an evolution and simplification of docker-caching-proxy-multiple-private, using the HTTPS_PROXY mechanism and injected CA root certificates instead of /etc/hosts hacks and --insecure-registry.

The main feature is Docker layer/image caching, even for layers served from S3, Google Storage, etc. As a bonus, it allows for centralized management of Docker registry credentials.

You configure the Docker clients (err... Kubernetes Nodes?) once, and then all configuration is done on the proxy. For this to work, the proxy's root CA certificate must be added to each client's system trusted root certificates.

master is unstable/beta

  • master (and :latest Docker tag) is unstable
  • The current stable version is 0.2.4; see the 0.2.4 tag on GitHub

Usage

  • Run the proxy on a host close to the Docker clients
  • Expose port 3128 to the network
  • Map volume /docker_mirror_cache to hold the cached images from all registries (up to 32 GB by default)
  • Map volume /ca; the proxy stores its CA certificate here so that it survives restarts
  • Env CACHE_MAX_SIZE (default 32g): the maximum size of the local cache of Docker image layers. Use Nginx size syntax (e.g. 32g).
  • Env REGISTRIES: space-separated list of registries to cache; no need to include Docker Hub, it's already there.
  • Env AUTH_REGISTRIES: space-separated list of hostname:username:password authentication info.
    • hostnames listed here should be listed in the REGISTRIES environment as well, so they can be intercepted.
    • For Docker Hub authentication, hostname should be auth.docker.io, username should NOT be an email, use the regular username.
    • For regular registry auth (HTTP Basic), hostname here should be the same... unless your registry uses a different auth server. This should work for quay.io also, but I have no way to test.
    • Env AUTH_REGISTRIES_DELIMITER changes the separator between authentication entries. By default, a space: " ". If you use keys that contain spaces (as with Google Container Registry), you should update this variable, e.g. setting it to AUTH_REGISTRIES_DELIMITER=";;;". In that case, AUTH_REGISTRIES could contain something like registry1.com:user1:pass1;;;registry2.com:user2:pass2.
    • Env AUTH_REGISTRY_DELIMITER changes the separator between the parts of a single authentication entry. By default, a colon: ":". If you use keys that contain single colons, you should update this variable, e.g. setting it to AUTH_REGISTRY_DELIMITER=":::". In that case, AUTH_REGISTRIES could contain something like registry1.com:::user1:::pass1 registry2.com:::user2:::pass2.
    • For Google Container Registry (GCR), the username should be _json_key and the password should be the contents of the service account JSON. Check out the GCR docs. The service account key is in JSON format, so it contains spaces (" ") and colons (":"). To be able to use GCR, set AUTH_REGISTRIES_DELIMITER to something other than a space (e.g. AUTH_REGISTRIES_DELIMITER=";;;") and AUTH_REGISTRY_DELIMITER to something other than a single colon (e.g. AUTH_REGISTRY_DELIMITER=":::").
docker run --rm --name docker_registry_proxy -it \
       -p 0.0.0.0:3128:3128 \
       -v "$(pwd)/docker_mirror_cache":/docker_mirror_cache \
       -v "$(pwd)/docker_mirror_certs":/ca \
       -e REGISTRIES="k8s.gcr.io gcr.io quay.io your.own.registry another.public.registry" \
       -e AUTH_REGISTRIES="auth.docker.io:dockerhub_username:dockerhub_password your.own.registry:username:password" \
       rpardini/docker-registry-proxy:0.3.0-beta2

Example with GCR, using the credentials of a service account stored in the key file servicekey.json:

docker run --rm --name docker_registry_proxy -it \
       -p 0.0.0.0:3128:3128 \
       -v "$(pwd)/docker_mirror_cache":/docker_mirror_cache \
       -v "$(pwd)/docker_mirror_certs":/ca \
       -e REGISTRIES="k8s.gcr.io gcr.io quay.io your.own.registry another.public.registry" \
       -e AUTH_REGISTRIES_DELIMITER=";;;" \
       -e AUTH_REGISTRY_DELIMITER=":::" \
       -e AUTH_REGISTRIES="gcr.io:::_json_key:::$(cat servicekey.json);;;auth.docker.io:::dockerhub_username:::dockerhub_password" \
       rpardini/docker-registry-proxy:0.3.0-beta2
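
To make the two delimiters concrete, here is a minimal sketch of how a string in the AUTH_REGISTRIES format could be split into entries and then into host, username, and password. The function name split_auth_registries is hypothetical and this is not the code the proxy's entrypoint.sh runs; it only illustrates the format described above. It prints the password length rather than the password itself.

```bash
#!/usr/bin/env bash
# Hypothetical helper, NOT taken from the proxy's entrypoint.sh.
# $1: value in the AUTH_REGISTRIES format
# $2: entry delimiter (defaults to a space, like AUTH_REGISTRIES_DELIMITER)
# $3: part delimiter  (defaults to a colon, like AUTH_REGISTRY_DELIMITER)
split_auth_registries() {
  local rest="$1" entry_delim="${2:- }" part_delim="${3:-:}"
  local entry host user pass
  while [ -n "$rest" ]; do
    case "$rest" in
      *"$entry_delim"*)
        entry="${rest%%"$entry_delim"*}"   # text before the first entry delimiter
        rest="${rest#*"$entry_delim"}"     # text after the first entry delimiter
        ;;
      *)
        entry="$rest"                      # last (or only) entry
        rest=""
        ;;
    esac
    [ -n "$entry" ] || continue            # skip empty entries
    host="${entry%%"$part_delim"*}"
    entry="${entry#*"$part_delim"}"
    user="${entry%%"$part_delim"*}"
    pass="${entry#*"$part_delim"}"         # everything after the second part delimiter
    printf 'host=%s user=%s password_length=%s\n' "$host" "$user" "${#pass}"
  done
}

# Example call with custom delimiters and a JSON-like password:
split_auth_registries 'gcr.io:::_json_key:::{"a": "b c"};;;auth.docker.io:::bob:::secret' ';;;' ':::'
```

Running a check like this against your own value before starting the container can catch a delimiter that also occurs inside a password.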

Let's say you did this on host 192.168.66.72. You can then curl http://192.168.66.72:3128/ca.crt to get the proxy CA certificate.
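
Before trusting the downloaded file, it can be worth checking that it really is a certificate and not an error page. The sketch below assumes the proxy serves the certificate in PEM format; the function name looks_like_pem_cert and the /tmp path are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the file is non-empty and contains
# both PEM certificate markers. Assumes the CA is served as PEM.
looks_like_pem_cert() {
  local file="$1"
  [ -s "$file" ] &&
    grep -q -- '-----BEGIN CERTIFICATE-----' "$file" &&
    grep -q -- '-----END CERTIFICATE-----' "$file"
}

# Example (the host is the one used in the text; adjust to your own):
#   curl -fsS http://192.168.66.72:3128/ca.crt -o /tmp/docker_registry_proxy.crt
#   looks_like_pem_cert /tmp/docker_registry_proxy.crt &&
#     openssl x509 -in /tmp/docker_registry_proxy.crt -noout -subject -enddate
```

The openssl call in the example prints the certificate subject and expiry date, which is a quick way to confirm you fetched the CA you expected.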

Configuring the Docker clients / Kubernetes nodes

On each Docker host that is to use the cache:

  • Configure the Docker daemon's proxy settings to point to the caching server
  • Add the caching server CA certificate to the list of system trusted roots.
  • Restart dockerd

To do it all at once (tested on Ubuntu Xenial, which is systemd based):

# Add environment vars pointing Docker to use the proxy
mkdir -p /etc/systemd/system/docker.service.d
cat << EOD > /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://192.168.66.72:3128/"
Environment="HTTPS_PROXY=http://192.168.66.72:3128/"
EOD

# Get the CA certificate from the proxy and make it a trusted root.
curl http://192.168.66.72:3128/ca.crt > /usr/share/ca-certificates/docker_registry_proxy.crt
echo "docker_registry_proxy.crt" >> /etc/ca-certificates.conf
update-ca-certificates --fresh

# Reload systemd
systemctl daemon-reload

# Restart dockerd
systemctl restart docker.service
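
When rolling this out to many nodes, the drop-in file can be generated from the proxy address instead of being edited by hand. A minimal sketch follows; the function name render_docker_proxy_dropin is hypothetical, and the default port 3128 is the one exposed in the Usage section.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints a systemd drop-in for the given proxy host and port.
# $1: proxy host or IP; $2: proxy port (defaults to 3128)
render_docker_proxy_dropin() {
  local host="$1" port="${2:-3128}"
  cat <<EOD
[Service]
Environment="HTTP_PROXY=http://${host}:${port}/"
Environment="HTTPS_PROXY=http://${host}:${port}/"
EOD
}

# Usage (as root), followed by the daemon-reload and restart steps:
#   mkdir -p /etc/systemd/system/docker.service.d
#   render_docker_proxy_dropin 192.168.66.72 > /etc/systemd/system/docker.service.d/http-proxy.conf
# To check what dockerd picked up after the restart:
#   systemctl show --property=Environment docker.service
#   docker info | grep -i proxy
```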

Testing

Clear dockerd of everything not currently running with docker system prune -a -f. Beware: this deletes all stopped containers and all unused images.

Then do, for example, docker pull k8s.gcr.io/kube-proxy-amd64:v1.10.4 and watch the logs on the caching proxy; they should list a lot of MISSes.

Then, clean again, and pull again. You should see HITs! Success.

Do the same for docker pull ubuntu and rejoice.

Test your own registry caching and authentication the same way; you don't need docker login, or .docker/config.json anymore.
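
Rather than reading the log by eye, the HITs and MISSes can be counted. The sketch below only assumes that log lines contain the words HIT or MISS, as described above; the exact log format is not assumed, and the function name count_cache_results is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads log lines on stdin and prints how many of them
# mention HIT and how many mention MISS.
count_cache_results() {
  awk '
    /HIT/  { hits++ }
    /MISS/ { misses++ }
    END    { printf "HIT=%d MISS=%d\n", hits, misses }
  '
}

# Usage, with the container name from the docker run examples:
#   docker logs docker_registry_proxy 2>&1 | count_cache_results
```

After the first pull the MISS count should dominate; after cleaning and pulling again, the HIT count should grow instead.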

Gotchas

  • If you authenticate to a private registry and pull through the proxy, those images will be served to any client that can reach the proxy, even without authentication. Beware!
  • To repeat: this will make your private images very public if you're not careful.
  • Currently you cannot push images while using the proxy, which is a shame. PRs welcome.
  • Setting this up on Linux is relatively easy. On Mac and Windows the CA certificate part will be very different, but it should work in principle.

Why not use Docker's own registry, which has a mirror feature?

Yes, Docker offers Registry as a pull-through cache. Unfortunately, it only covers the Docker Hub case: it won't cache images from quay.io, k8s.gcr.io, gcr.io, or any other registry, including private ones.

That means that your shiny new Kubernetes cluster is now a bandwidth hog, since every image will be pulled from the Internet on every Node it runs on, with no reuse.

This is due to the way the Docker "client" implements --registry-mirror: it only ever contacts mirrors for images with no registry in their reference (e.g. images from Docker Hub). When a registry is specified, dockerd goes directly there via HTTPS (and also via HTTP if it is included in a --insecure-registry list), completely ignoring the configured mirror.
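
To see which pulls a mirror would cover, it helps to know how an image reference maps to a registry. The sketch below follows the usual convention as I understand it: the part before the first "/" counts as a registry host only if it contains a "." or a ":" or is exactly "localhost", and everything else defaults to Docker Hub. The function name image_registry is hypothetical, and this is an illustration, not Docker's actual parsing code.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the registry an image reference would be pulled from.
image_registry() {
  local ref="$1" first
  case "$ref" in
    */*) first="${ref%%/*}" ;;          # candidate registry: text before the first "/"
    *)   echo "docker.io"; return ;;    # no "/" at all, e.g. "ubuntu" or "ubuntu:18.04"
  esac
  case "$first" in
    *.*|*:*|localhost) echo "$first" ;; # looks like a hostname (or host:port)
    *)                 echo "docker.io" ;; # e.g. "library/ubuntu" is a Docker Hub path
  esac
}

# Only references that resolve to docker.io would go through --registry-mirror:
for ref in ubuntu library/ubuntu k8s.gcr.io/kube-proxy-amd64:v1.10.4 quay.io/coreos/etcd; do
  printf '%s -> %s\n' "$ref" "$(image_registry "$ref")"
done
```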

Docker itself should provide this.

Yeah. Docker Inc should do it. So should NPM, Inc. Wonder why they don't. 😼

TODO:

  • Allow using multiple credentials for Docker Hub; this is possible since the /token request includes the wanted repo as a query string parameter.
  • Test and make auth work with quay.io; unfortunately I don't have access to it (hint, hint, quay)
  • Hide the mitmproxy building code under a Docker build ARG.
  • I hope that in the future this can also be used as a "Developer Office" proxy, where many developers on a fast local network share a proxy for bandwidth and speed savings; work is ongoing in this direction.