Question
Since this question was asked in 2014, a lot has happened and many things have changed. I'm revisiting the topic today and editing this question for the 12th time to reflect the latest changes. The question may seem long, but it is arranged in reverse chronological order, so the latest changes are at the top; feel free to stop reading at any point.
The question I wanted to solve was: how to mount host volumes into docker containers in a Dockerfile during build, i.e., having the docker run -v /export:/export capability during docker build.
One reason behind it, for me, is that when building things in Docker, I don't want those (apt-get install) caches locked inside a single Docker image; I want to share and reuse them.
That was the main reason I asked this question. One more reason I'm facing today is wanting to use a huge private repo that already exists on the host; otherwise I would have to git clone it from the private remote within docker using my private ssh key, which I don't know how to do and haven't looked into yet.
Latest Update:
The Buildkit approach in @BMitch's answer:
With that RUN --mount syntax, you can also bind mount read-only directories from the build-context...
Buildkit is now built into docker (I had thought it was a third-party tool), as long as your version is 18.09 or newer. Mine is 20.10.7 now -- https://docs.docker.com/develop/develop-images/build_enhancements/
To enable BuildKit builds
Easiest way from a fresh install of docker is to set the DOCKER_BUILDKIT=1 environment variable when invoking the docker build command, such as:
$ DOCKER_BUILDKIT=1 docker build .
Otherwise, you'll get:
the --mount option requires BuildKit. Refer to https://docs.docker.com/go/buildkit/ to learn how to build images with BuildKit enabled
So it'll be the perfect solution to my second use-case as explained above.
Update as of May 7, 2019:
Before docker v18.09, the correct answer should be the one that starts with:
There is a way to mount a volume during a build, but it doesn't involve Dockerfiles.
However, that was a poorly stated, poorly organized and poorly supported answer. When I was reinstalling my docker containers, I happened to stumble upon the following article:
Dockerize an apt-cacher-ng service
https://docs.docker.com/engine/examples/apt-cacher-ng/
That's docker's solution to my question, not directly but indirectly. It's the orthodox way docker suggests we do it, and I admit it is better than what I was trying to ask for here.
Another way is the newly accepted answer, i.e., the Buildkit in v18.09.
Pick whichever suits you.
Was: There had been a solution -- rocker, which was not from Docker, but now that rocker is discontinued, I've reverted the answer back to "Not possible" again.
Old Update: So the answer is "Not possible". I can accept it as an answer as I know the issue has been extensively discussed at https://github.com/docker/docker/issues/3156. I can understand that portability is a paramount issue for the docker developers; but as a docker user, I have to say I'm very disappointed about this missing feature. Let me close my argument with a quote from the aforementioned discussion: " I would like to use Gentoo as a base image but definitely don't want > 1GB of Portage tree data to be in any of the layers once the image has been built. You could have some nice a compact containers if it wasn't for the gigantic portage tree having to appear in the image during the install." Yes, I can use wget or curl to download whatever I need, but the fact that a mere portability consideration is now forcing me to download > 1GB of Portage tree each time I build a Gentoo base image is neither efficient nor user friendly. Furthermore, the package repository WILL ALWAYS be under /usr/portage, thus ALWAYS PORTABLE under Gentoo. Again, I respect the decision, but please allow me to express my disappointment as well in the meantime. Thanks.
Original question in detail:
From Share Directories via Volumes
http://docker.readthedocs.org/en/v0.7.3/use/working_with_volumes/
it says that data volumes "have been available since version 1 of the Docker Remote API". My docker is version 1.2.0, but I found that the example given in the above article does not work:
# BUILD-USING: docker build -t data .
# RUN-USING: docker run -name DATA data
FROM busybox
VOLUME ["/var/volume1", "/var/volume2"]
CMD ["/usr/bin/true"]
What's the proper way in Dockerfile to mount host-mounted volumes into docker containers, via the VOLUME command?
$ apt-cache policy lxc-docker
lxc-docker:
Installed: 1.2.0
Candidate: 1.2.0
Version table:
*** 1.2.0 0
500 https://get.docker.io/ubuntu/ docker/main amd64 Packages
100 /var/lib/dpkg/status
$ cat Dockerfile
FROM debian:sid
VOLUME ["/export"]
RUN ls -l /export
CMD ls -l /export
$ docker build -t data .
Sending build context to Docker daemon 2.56 kB
Sending build context to Docker daemon
Step 0 : FROM debian:sid
---> 77e97a48ce6a
Step 1 : VOLUME ["/export"]
---> Using cache
---> 59b69b65a074
Step 2 : RUN ls -l /export
---> Running in df43c78d74be
total 0
---> 9d29a6eb263f
Removing intermediate container df43c78d74be
Step 3 : CMD ls -l /export
---> Running in 8e4916d3e390
---> d6e7e1c52551
Removing intermediate container 8e4916d3e390
Successfully built d6e7e1c52551
$ docker run data
total 0
$ ls -l /export | wc
20 162 1131
$ docker -v
Docker version 1.2.0, build fa7b24f
Answer
First, to answer "why doesn't VOLUME work?" When you define a VOLUME in the Dockerfile, you can only define the target, not the source of the volume. During the build, you will only get an anonymous volume from this. That anonymous volume will be mounted at every RUN command, prepopulated with the contents of the image, and then discarded at the end of the RUN command. Only changes to the container are saved, not changes to the volume.
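As a minimal sketch of that behavior, assuming the classic (non-Buildkit) builder this question was originally asked against: the /data path and the file name below are made up for illustration, and Buildkit may treat writes under a declared volume differently, so the comments describe the classic builder only.

```dockerfile
FROM busybox
# VOLUME names only the target path inside the container;
# there is no syntax here for naming a host directory as the source.
VOLUME /data
# This write lands in the anonymous volume mounted for this one step.
RUN echo hello > /data/greeting
# With the classic builder the previous step's volume was discarded,
# so this listing is expected to show an empty directory.
RUN ls -l /data
```

Moving the write above the VOLUME line would make it part of the image layer instead, which is why the order of these instructions matters.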
Since this question was asked, a few features have been released that may help. The first is multi-stage builds, which let you build a disk-space-inefficient first stage and copy just the needed output to the final stage that you ship. The second is Buildkit, which dramatically changes how images are built and adds new capabilities to the build.
For a multi-stage build, you would have multiple FROM lines, each one starting the creation of a separate image. Only the last image is tagged by default, but you can copy files from previous stages. The standard use is to have a compiler environment to build a binary or other application artifact, and a runtime environment as the second stage that copies over that artifact. You could have:
FROM debian:sid as builder
COPY export /export
RUN compile command here >/result.bin
FROM debian:sid
COPY --from=builder /result.bin /result.bin
CMD ["/result.bin"]
That would result in a build that only contains the resulting binary, and not the full /export directory.
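For a more concrete picture, here is a hedged sketch of the same pattern with a real compile step. It assumes a hypothetical export/main.c in the build context; the compiler packages and file names are illustrative choices, not part of the original example.

```dockerfile
# Stage 1: compiler environment (never shipped).
FROM debian:sid AS builder
RUN apt-get update \
 && apt-get install -y --no-install-recommends gcc libc6-dev
# export/ is assumed to exist in the build context and contain main.c.
COPY export /export
RUN gcc -o /result.bin /export/main.c

# Stage 2: runtime image that receives only the compiled artifact.
FROM debian:sid
COPY --from=builder /result.bin /result.bin
CMD ["/result.bin"]
```

The compiler, its package lists, and the copied /export sources all stay in the builder stage and never appear in the layers of the final image.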
Buildkit is coming out of experimental in 18.09. It's a complete redesign of the build process, including the ability to change the frontend parser. One of those parser changes has implemented the RUN --mount option, which lets you mount a cache directory for your run commands. E.g. here's one that mounts some of the debian directories (with a reconfigure of the debian image, this could speed up reinstalls of packages):
# syntax = docker/dockerfile:experimental
FROM debian:latest
RUN --mount=target=/var/lib/apt/lists,type=cache \
    --mount=target=/var/cache/apt,type=cache \
    apt-get update \
 && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
      git
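The "reconfigure of the debian image" mentioned above could look roughly like the sketch below. It assumes the official debian image, which ships an apt hook at /etc/apt/apt.conf.d/docker-clean that deletes downloaded packages, and it assumes a Docker release recent enough that RUN --mount is available in the stable docker/dockerfile:1 frontend; the keep-cache file name and the sharing=locked option are choices made for this illustration.

```dockerfile
# syntax=docker/dockerfile:1
FROM debian:latest
# Remove the hook that empties the package cache after each install,
# and tell apt to keep downloaded .deb files so the cache mount is reused.
RUN rm -f /etc/apt/apt.conf.d/docker-clean \
 && echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' \
      > /etc/apt/apt.conf.d/keep-cache
# sharing=locked makes concurrent builds take turns on the cache,
# since apt expects exclusive access to these directories.
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt/lists,sharing=locked \
    apt-get update \
 && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
      git
```

The cache directories live in the builder's own storage, not in the image layers, so the final image does not grow because of them.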
You would adjust the cache directory for whatever application cache you have, e.g. $HOME/.m2 for maven, or /root/.cache for golang.
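As one hedged illustration for Go, the sketch below assumes a Go module at the root of the build context and the official golang image, where the build cache for root is /root/.cache/go-build and the module cache is /go/pkg/mod; the image tag and the output path are arbitrary.

```dockerfile
# syntax=docker/dockerfile:1
FROM golang:1.22 AS builder
WORKDIR /src
COPY . .
# Both caches persist between builds on the same builder,
# but neither ends up in an image layer.
RUN --mount=type=cache,target=/root/.cache/go-build \
    --mount=type=cache,target=/go/pkg/mod \
    go build -o /out/app .
```

A Maven build would follow the same shape with the cache target pointed at the local repository directory instead.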
TL;DR: Answer is here: With that RUN --mount syntax, you can also bind mount read-only directories from the build-context. The folder must exist in the build context, and it is not mapped back to the host or the build client:
# syntax = docker/dockerfile:experimental
FROM debian:latest
RUN --mount=target=/export,type=bind,source=export \
    process export directory here...
Note that because the directory is mounted from the context, it's also mounted read-only, and you cannot push changes back to the host or client. When you build, you'll want an 18.09 or newer install and enable buildkit with export DOCKER_BUILDKIT=1.
If you get an error that the mount flag isn't supported, that indicates that you either didn't enable buildkit with the above variable, or that you didn't enable the experimental syntax with the syntax line at the top of the Dockerfile before any other lines, including comments. Note that the variable to toggle buildkit will only work if your docker install has buildkit support built in, which requires version 18.09 or newer from Docker, both on the client and server.
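Putting the pieces together for the Dockerfile in the question, a minimal end-to-end sketch might look like this. It assumes an export directory sitting next to the Dockerfile in the build context and a Docker release where RUN --mount is part of the stable docker/dockerfile:1 frontend; the /export-listing.txt file name is made up for illustration.

```dockerfile
# syntax=docker/dockerfile:1
# BUILD-USING: DOCKER_BUILDKIT=1 docker build -t data .
# RUN-USING:   docker run --rm data
FROM debian:sid
# ./export from the build context is visible at /export for this step only.
# Anything worth keeping must be written somewhere outside the mount,
# because the mounted directory itself is not stored in the image.
RUN --mount=type=bind,source=export,target=/export \
    ls -l /export > /export-listing.txt
CMD ["cat", "/export-listing.txt"]
```

Note that the source path is resolved inside the build context, so a directory such as /export on the host has to be placed in, or used as, the context passed to docker build.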