![GitHub Workflow Status](https://img.shields.io/github/workflow/status/rpardini/docker-registry-proxy/master-latest?label=%3Alatest%20from%20master)
![GitHub tag (latest by date)](https://img.shields.io/github/v/tag/rpardini/docker-registry-proxy?label=last%20tagged%20release)
![GitHub Workflow Status](https://img.shields.io/github/workflow/status/rpardini/docker-registry-proxy/tags?label=last%20tagged%20release)
![Docker Image Size (latest semver)](https://img.shields.io/docker/image-size/rpardini/docker-registry-proxy?sort=semver)
![Docker Pulls](https://img.shields.io/docker/pulls/rpardini/docker-registry-proxy)

## TL;DR
A caching proxy for Docker; allows centralised management of (multiple) registries and their authentication; caches images from *any* registry.

## What?

Essentially, it's a [man in the middle](https://en.wikipedia.org/wiki/Man-in-the-middle_attack): an intercepting proxy based on `nginx`, to which all Docker traffic is directed using the `HTTPS_PROXY` mechanism and injected CA root certificates.

The main feature is Docker layer/image caching, including layers served from S3, Google Storage, etc.
As a bonus, it allows for centralized management of Docker registry credentials, which can in itself be the main feature, e.g. in Kubernetes environments.

You configure the Docker clients (_err... Kubernetes Nodes?_) once, and then all configuration is done on the proxy.
For this to work, a root CA certificate must be inserted into each client's system trusted root certs.

## master/:latest is unstable/beta

- `:latest` and `:latest-debug` Docker tags are unstable, built from master, and amd64-only
- Production/stable is `0.4.2`, see [0.4.2 tag on GitHub](https://github.com/rpardini/docker-registry-proxy/tree/0.4.2); this image is multi-arch amd64/arm64
- The previous version is `0.3.0`, see [0.3.0 tag on GitHub](https://github.com/rpardini/docker-registry-proxy/tree/0.3.0) (amd64 only)

## Usage

- Run the proxy on a host close (network-wise: high bandwidth, same-VPC, etc.) to the Docker clients
- Expose port 3128 to the network
- Map volume `/docker_mirror_cache` for up to `CACHE_MAX_SIZE` (`32g` by default) of cached images across all cached registries
- Map volume `/ca`; the proxy will store the CA certificate here across restarts. **Important:** this is security sensitive.
- Env `CACHE_MAX_SIZE` (default `32g`): set the max size to be used for caching local Docker image layers. Use [Nginx sizes](http://nginx.org/en/docs/syntax.html).
- Env `REGISTRIES`: space-separated list of registries to cache; no need to include DockerHub, it's already done internally.
- Env `AUTH_REGISTRIES`: space-separated list of `hostname:username:password` authentication info.
  - `hostname`s listed here should be listed in the `REGISTRIES` environment variable as well, so they can be intercepted.
- Env `AUTH_REGISTRIES_DELIMITER` to change the separator between authentication info. By default, a space: "` `". If you use keys that contain spaces (as with Google Cloud Registry), you should update this variable, e.g. setting it to `AUTH_REGISTRIES_DELIMITER=";;;"`. In that case, `AUTH_REGISTRIES` could contain something like `registry1.com:user1:pass1;;;registry2.com:user2:pass2`.
- Env `AUTH_REGISTRY_DELIMITER` to change the separator between authentication info *parts*. By default, a colon: "`:`". If you use keys that contain single colons, you should update this variable, e.g. setting it to `AUTH_REGISTRY_DELIMITER=":::"`. In that case, `AUTH_REGISTRIES` could contain something like `registry1.com:::user1:::pass1 registry2.com:::user2:::pass2`.
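
Since `CACHE_MAX_SIZE` is passed on to nginx, it may help to sanity-check the value before starting the container. The sketch below assumes the usual nginx size syntax (digits with an optional `k`, `m`, or `g` suffix); `is_nginx_size` is a made-up helper name and is not part of the image.

```bash
# Hypothetical helper: succeed if the argument looks like an nginx size
# (digits followed by an optional k, m, or g suffix, in either case).
is_nginx_size() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+[kKmMgG]?$'
}

# Try a few candidate values for CACHE_MAX_SIZE.
for value in 32g 512m 32gb; do
  if is_nginx_size "$value"; then
    echo "$value: looks like a valid size"
  else
    echo "$value: not a valid size"
  fi
done
```

A value such as `32gb` fails this check because of the two-letter suffix; `32g` passes.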
### Simple (no auth, all cache)
```bash
docker run --rm --name docker_registry_proxy -it \
-p 0.0.0.0:3128:3128 \
-v $(pwd)/docker_mirror_cache:/docker_mirror_cache \
-v $(pwd)/docker_mirror_certs:/ca \
rpardini/docker-registry-proxy:0.4.2
```
### DockerHub auth
For Docker Hub authentication:
- `hostname` should be `auth.docker.io`
- `username` should NOT be an email, use the regular username
```bash
docker run --rm --name docker_registry_proxy -it \
-p 0.0.0.0:3128:3128 \
-v $(pwd)/docker_mirror_cache:/docker_mirror_cache \
-v $(pwd)/docker_mirror_certs:/ca \
-e REGISTRIES="k8s.gcr.io gcr.io quay.io your.own.registry another.public.registry" \
-e AUTH_REGISTRIES="auth.docker.io:dockerhub_username:dockerhub_password your.own.registry:username:password" \
rpardini/docker-registry-proxy:0.4.2
```
### Simple registries auth (HTTP Basic auth)
For regular registry auth (HTTP Basic), the `hostname` should be the registry itself... unless your registry uses a different auth server.
See the example above for DockerHub, and adapt the `your.own.registry` parts (in both ENVs).
This should work for quay.io also, but I have no way to test.
### GitLab auth
GitLab may use a different/separate domain to handle the authentication procedure.
Just like DockerHub uses `auth.docker.io`, GitLab uses its primary (git) domain for the authentication.
If you run GitLab on `git.example.com` and its registry on `reg.example.com`, you need to include both in `REGISTRIES` and use the primary domain for `AUTH_REGISTRIES`.
For GitLab.com itself the authentication domain should be `gitlab.com`.
```bash
docker run --rm --name docker_registry_proxy -it \
-p 0.0.0.0:3128:3128 \
-v $(pwd)/docker_mirror_cache:/docker_mirror_cache \
-v $(pwd)/docker_mirror_certs:/ca \
-e REGISTRIES="reg.example.com git.example.com" \
-e AUTH_REGISTRIES="git.example.com:USER:PASSWORD" \
rpardini/docker-registry-proxy:0.4.2
```
### Google Container Registry (GCR) auth
For Google Container Registry (GCR), the username should be `_json_key` and the password should be the contents of the service account JSON key file.
Check out the [GCR docs](https://cloud.google.com/container-registry/docs/advanced-authentication#json_key_file).
The service account key is in JSON format, so it contains spaces ("` `") and colons ("`:`").
To be able to use GCR, set `AUTH_REGISTRIES_DELIMITER` to something other than a space (e.g. `AUTH_REGISTRIES_DELIMITER=";;;"`) and `AUTH_REGISTRY_DELIMITER` to something other than a single colon (e.g. `AUTH_REGISTRY_DELIMITER=":::"`).

Example with GCR using credentials from a service account key file `servicekey.json`:
```bash
docker run --rm --name docker_registry_proxy -it \
-p 0.0.0.0:3128:3128 \
-v $(pwd)/docker_mirror_cache:/docker_mirror_cache \
-v $(pwd)/docker_mirror_certs:/ca \
-e REGISTRIES="k8s.gcr.io gcr.io quay.io your.own.registry another.public.registry" \
-e AUTH_REGISTRIES_DELIMITER=";;;" \
-e AUTH_REGISTRY_DELIMITER=":::" \
-e AUTH_REGISTRIES="gcr.io:::_json_key:::$(cat servicekey.json);;;auth.docker.io:::dockerhub_username:::dockerhub_password" \
rpardini/docker-registry-proxy:0.4.2
```
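To see how the two delimiters interact, here is a minimal shell sketch that splits an `AUTH_REGISTRIES`-style string first into entries and then into `hostname`, `username`, and `password`. It only illustrates the format described above and is not the proxy's actual parsing code; the function name `split_auth_registries` and the sample credentials are made up, and both delimiters are assumed to be non-empty.

```bash
# Illustration only: split an AUTH_REGISTRIES-style string using the two delimiters.
# split_auth_registries is a hypothetical helper, not part of the proxy image.
split_auth_registries() {
  local entries="$1" entries_delim="$2" parts_delim="$3"
  local entry host user pass rest
  while [ -n "$entries" ]; do
    # Everything up to the first entries delimiter is one entry.
    entry=${entries%%"$entries_delim"*}
    if [ "$entry" = "$entries" ]; then
      entries=""
    else
      entries=${entries#*"$entries_delim"}
    fi
    # First part is the hostname, second the username; in this sketch
    # whatever remains is treated as the password.
    host=${entry%%"$parts_delim"*}
    rest=${entry#*"$parts_delim"}
    user=${rest%%"$parts_delim"*}
    pass=${rest#*"$parts_delim"}
    printf 'host=%s user=%s pass=%s\n' "$host" "$user" "$pass"
  done
}

# A JSON-like password with spaces and colons survives because neither
# ";;;" nor ":::" appears inside it.
split_auth_registries 'gcr.io:::_json_key:::{"type": "service_account"};;;auth.docker.io:::hub_user:::hub_pass' ';;;' ':::'
```

With the default delimiters (a space and a single colon), the same JSON value would be cut apart at its first space, which is why both variables need changing for GCR.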

## Configuring the Docker clients / Kubernetes nodes

Let's say you set up the proxy on host `192.168.66.72`; you can then `curl http://192.168.66.72:3128/ca.crt` and get the proxy CA certificate.

On each Docker host that is to use the cache:

- [Configure Docker proxy](https://docs.docker.com/config/daemon/systemd/#httphttps-proxy) pointing to the caching server
- Add the caching server CA certificate to the list of system trusted roots.
- Restart `dockerd`

Do it all at once, tested on Ubuntu Xenial, Bionic, and Focal, all systemd based:
```bash
# Add environment vars pointing Docker to use the proxy
mkdir -p /etc/systemd/system/docker.service.d
cat << EOD > /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://192.168.66.72:3128/"
Environment="HTTPS_PROXY=http://192.168.66.72:3128/"
EOD

# Get the CA certificate from the proxy and make it a trusted root.
curl http://192.168.66.72:3128/ca.crt > /usr/share/ca-certificates/docker_registry_proxy.crt
echo "docker_registry_proxy.crt" >> /etc/ca-certificates.conf
update-ca-certificates --fresh

# Reload systemd
systemctl daemon-reload

# Restart dockerd
systemctl restart docker.service
```
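The proxy address is hard-coded in the script above. As a sketch, the drop-in contents can be generated from a host and port so the same snippet can be reused across nodes. `render_proxy_dropin` is a hypothetical helper name, and the default port `3128` simply follows the Usage section.

```bash
# Hypothetical helper: print a systemd drop-in that points dockerd at the proxy.
# Usage: render_proxy_dropin <host> [port]
render_proxy_dropin() {
  local host="$1" port="${2:-3128}"
  printf '[Service]\n'
  printf 'Environment="HTTP_PROXY=http://%s:%s/"\n' "$host" "$port"
  printf 'Environment="HTTPS_PROXY=http://%s:%s/"\n' "$host" "$port"
}

# Print to stdout for inspection; redirect the output to
# /etc/systemd/system/docker.service.d/http-proxy.conf (as root) to apply it.
render_proxy_dropin 192.168.66.72
```

Printing first and redirecting second makes it easy to review the generated file before touching the systemd configuration.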

## Testing

Clear `dockerd` of everything not currently running: `docker system prune -a -f` (*beware*: this removes all stopped containers and all unused images).

Then do, for example, `docker pull k8s.gcr.io/kube-proxy-amd64:v1.10.4` and watch the logs on the caching proxy; it should list a lot of MISSes.

Then, clean again, and pull again. You should see HITs! Success.

Do the same for `docker pull ubuntu` and rejoice.

Test your own registry caching and authentication the same way; you don't need `docker login` or `.docker/config.json` anymore.
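
To make the HIT/MISS comparison less eyeball-driven, the proxy log output can be tallied. The sketch below assumes only that each relevant log line contains the word `HIT` or `MISS` somewhere, as described above; the exact log format is not specified here, and `count_cache_status` and the sample lines are made up for illustration.

```bash
# Hypothetical helper: tally lines from stdin that contain the word HIT or MISS.
count_cache_status() {
  local line hits=0 misses=0
  while IFS= read -r line; do
    # Pad with spaces so a word at the start or end of the line still matches.
    case " $line " in
      *[!A-Za-z]MISS[!A-Za-z]*) misses=$((misses + 1)) ;;
      *[!A-Za-z]HIT[!A-Za-z]*)  hits=$((hits + 1)) ;;
    esac
  done
  printf 'HIT=%d MISS=%d\n' "$hits" "$misses"
}

# Made-up sample lines; with a running proxy you would pipe in
# `docker logs docker_registry_proxy 2>&1` instead.
printf '%s\n' 'GET /v2/a MISS 200' 'GET /v2/a HIT 200' 'GET /v2/b HIT 200' | count_cache_status
```

After the first pull the MISS count should dominate; after cleaning and pulling again, the HIT count should grow.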

## Developing/Debugging

Since `0.4` there is a separate `-debug` version of the image, which includes `nginx-debug`, and (since 0.5.x) has a `mitmproxy` (actually `mitmweb`) inserted after the CONNECT proxy but before the caching logic, and a second `mitmweb` between the caching layer and DockerHub.
This allows very in-depth debugging. Use sparingly, and definitely not in production.
```bash
docker run --rm --name docker_registry_proxy -it \
-e DEBUG_NGINX=true -e DEBUG=true -e DEBUG_HUB=true -p 0.0.0.0:8081:8081 -p 0.0.0.0:8082:8082 \
-p 0.0.0.0:3128:3128 \
-v $(pwd)/docker_mirror_cache:/docker_mirror_cache \
-v $(pwd)/docker_mirror_certs:/ca \
rpardini/docker-registry-proxy:0.4.2-debug
```
- `DEBUG=true` enables the mitmweb proxy between Docker clients and the caching layer, accessible on port 8081
- `DEBUG_HUB=true` enables the mitmweb proxy between the caching layer and DockerHub, accessible on port 8082 (since 0.5.x)
- `DEBUG_NGINX=true` enables nginx-debug and debug logging, which probably is too much. Seriously.

## Gotchas

- If you authenticate to a private registry and pull through the proxy, those images will be served to any client that can reach the proxy, even without authentication. *beware*
- Repeat, **this will make your private images very public if you're not careful**.
- **Currently you cannot push images while using the proxy**, which is a shame. PRs welcome.
- Setting this up on Linux is relatively easy.
- On Mac and Windows the CA-certificate part will be very different but should work in principle.
- Please send PRs with instructions for Windows and Mac if you succeed!

### Why not use Docker's own registry, which has a mirror feature?
Yes, Docker offers [Registry as a pull through cache](https://docs.docker.com/registry/recipes/mirror/); *unfortunately*,
it only covers the DockerHub case. It won't cache images from `quay.io`, `k8s.gcr.io`, `gcr.io`, or any such, including any private registries.
That means that your shiny new Kubernetes cluster is now a bandwidth hog, since every image will be pulled from the
Internet on every Node it runs on, with no reuse.
This is due to the way the Docker "client" implements `--registry-mirror`: it only ever contacts mirrors for images
with no registry in their reference (e.g., images from DockerHub).
When a registry is specified, `dockerd` goes directly there, via HTTPS (and also via HTTP if included in a
`--insecure-registry` list), thus completely ignoring the configured mirror.
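
The distinction can be approximated in a few lines of shell. This sketch follows the commonly documented convention that the part before the first `/` is treated as a registry host only if it contains a `.` or a `:`, or is `localhost`; it is a simplification and not Docker's actual reference parser. `has_explicit_registry` is a made-up name and `quay.io/some/image` is a placeholder reference.

```bash
# Hypothetical helper: succeed if an image reference names an explicit registry host.
has_explicit_registry() {
  local ref="$1" first
  # No slash at all (e.g. "ubuntu" or "ubuntu:20.04"): implicit DockerHub.
  case "$ref" in
    */*) ;;
    *) return 1 ;;
  esac
  # Inspect only the component before the first slash.
  first=${ref%%/*}
  case "$first" in
    localhost|*.*|*:*) return 0 ;;
    *) return 1 ;;
  esac
}

for ref in ubuntu library/ubuntu k8s.gcr.io/kube-proxy-amd64:v1.10.4 quay.io/some/image; do
  if has_explicit_registry "$ref"; then
    echo "$ref: explicit registry, --registry-mirror is bypassed"
  else
    echo "$ref: DockerHub, --registry-mirror applies"
  fi
done
```

Note that the sketch does not special-case an explicit `docker.io/` prefix, which still refers to DockerHub.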

### Docker itself should provide this.
Yeah. Docker Inc should do it. So should NPM, Inc. Wonder why they don't. 😼

### TODO:
- [ ] Test and make auth work with quay.io, unfortunately I don't have access to it (_hint, hint, quay_)
- [x] Hide the mitmproxy building code under a Docker build ARG.
- [ ] "Developer Office" proxy scenario, where many developers on a fast LAN share a proxy for bandwidth and speed savings (already works for pulls, but messes up pushes, which developers tend to use a lot)