Why is there volume for data in the first place? #255
Volumes will also be faster than using the container's internal storage, but I agree it tends to very quickly clutter up the disk with anonymous volumes, since starting the container without any volumes specified is something I at least mostly do for quick testing. |
I just use a series of aliases and periodically clean up stopped containers, unused volumes, and dangling images.
alias dclean='docker ps -aq | xargs --no-run-if-empty docker rm'
alias dcleanvol="docker volume ls | awk '/^local/ { print \$2 }' | xargs --no-run-if-empty docker volume rm"
alias ddangling='docker images --filter dangling=true -q | sort -u | xargs --no-run-if-empty docker rmi'
With upcoming docker Edit: If you do |
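A possible variant of the volume cleanup above, as a rough sketch: instead of trying to remove every local volume and relying on Docker to refuse the ones still in use, it asks only for volumes that no container references, using the `dangling=true` filter of `docker volume ls`. The function name `dcleanvol_dangling` is made up for illustration and is not part of the aliases in the comment.

```shell
# Sketch only: remove volumes that no container references any more.
# `dcleanvol_dangling` is a hypothetical name, not from the original aliases.
dcleanvol_dangling() {
  # -q prints only volume names; the filter keeps unreferenced volumes.
  docker volume ls -q --filter dangling=true | while IFS= read -r vol; do
    docker volume rm "$vol"
  done
}
```

Note that this removes any unreferenced volume, including named ones, so it is only suitable on hosts where such volumes are disposable.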
@yosifkit this is a fine workaround for local (and probably swarm, though I haven't worked with it) containers, but as soon as you hit nomad / kubernetes / another orchestrator, things go bad, and sometimes you don't have easy automation for the node itself at all. |
Agree with etki. The biggest problem for us is that we can't save data in the image itself. Why not leave it to the users to decide whether they want to use volumes or not? They can always add it later, but as etki mentioned, there's no way to remove it. |
We have the same problem as shitalm: we would like to add data to the image itself so we can, e.g., prepare test/demo data or just deliver static/read-only data. It's tedious to have to copy Dockerfiles, remove the volume instruction, and build it ourselves... I could also live with a separate image with a tag such as 5.7-no-volume. |
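For anyone doing the copy-and-edit workaround described above, the "remove the volume instruction" step can be scripted. The following is a minimal sketch, not anything the image maintainers provide: `strip_volume` is a hypothetical helper that reads a Dockerfile on stdin and drops VOLUME lines, assuming each VOLUME instruction fits on a single line (no backslash continuation).

```shell
# Hypothetical helper: print a Dockerfile from stdin without VOLUME instructions.
# Assumption: every VOLUME instruction is on one line (no "\" continuation).
strip_volume() {
  # Dockerfile instructions are case-insensitive, so match any casing.
  sed -E '/^[[:space:]]*[Vv][Oo][Ll][Uu][Mm][Ee]([[:space:]]|$)/d'
}

# Example usage (file and tag names are placeholders):
#   strip_volume < Dockerfile > Dockerfile.no-volume
#   docker build -f Dockerfile.no-volume -t local/mysql:5.7-no-volume .
```

The build context still needs the other files the original Dockerfile copies in, so this only automates the edit, not the whole rebuild.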
If you run your container with "--datadir=...", you should be able to
adjust where MySQL stores data (or put "datadir" into a configuration file
in your image), which will then free you from the volume.
|
@tianon it won't. The data will be stored in another place, but an untracked directory will still be created on the host, consuming an inode. This is not as bad, but it still has zero positive effect. |
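The reason moving the data directory does not help is that the volume is declared in the image metadata, independently of where MySQL actually writes. As a rough sketch, the declared volumes of an image can be read with `docker image inspect`; the helper names `image_volumes` and `declares_volumes` below are made up for illustration.

```shell
# Sketch: show which paths an image declares as volumes.
# `image_volumes` and `declares_volumes` are hypothetical helper names.
image_volumes() {
  # Prints the image's Config.Volumes field as JSON.
  docker image inspect --format '{{json .Config.Volumes}}' "$1"
}

# Succeeds (exit status 0) if the image declares at least one volume.
declares_volumes() {
  case "$(image_volumes "$1")" in
    null|'{}'|'') return 1 ;;
    *) return 0 ;;
  esac
}

# Example usage:
#   declares_volumes mysql:5.7 && echo "image declares volumes"
```

Any path listed there gets an anonymous volume on `docker run` unless a volume or bind mount is given for that path explicitly.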
As far as I know, the Docker team now recommends not declaring volumes in base images (or, in this case, official images). In my opinion it would be better to document the usage of the volume but not declare it in the image, since how the data is stored depends on the end user's case (local vs. production). |
Actually, there is one other effect: https://docs.docker.com/engine/reference/builder/#notes-about-specifying-volumes If we just drop the VOLUME statement, the database will still be there, and no database initialization will be performed for basic testing of the image. I don't think this would require anything more than clearing out the directory after installing, though. |
I feel like removing the volume would break many users that rely on the volume when using docker-compose. Compose tries hard to keep the volume between restarts of the container to persist the data and these users would suddenly see new deployments unable to survive a re-creation.
My opinion is that if it is such a problem to have a volume defined, then docker needs to provide an
I would think that most users would rather their database data preserved by default rather than discovering that their data has been automatically deleted when they did a
There is not a good alternative for telling the user where persistent data lives. Labels are not standardized and many users skip over the Docker Hub documentation.
Being able to build an image that ships with a database already initialized is still possible and the automatic volume would be left empty.
FROM mysql:5.7
CMD ["--datadir=/sql"]
# assuming ./sql-datadir contains an already initialized database
COPY ./sql-datadir/* /sql/
# on startup the entrypoint script will detect the already initialized database and start right up
# leaving /var/lib/mysql empty
or.... without having to use a different data directory:
FROM mysql:5.7
# ./sql-datadir contains a database dump of *.sql files
COPY ./sql-datadir/* /docker-entrypoint-initdb.d/
# initdb logic will restore the database via the sql files in alphanumeric order on first container start
# users will have to `docker rm -vf sql-container` when a new image is pulled with a new database dump
@ltangvald, as for the automatic population of |
Would it be simple to tag the image twice for both use cases? e.g. do everything the same sans
Image behavior stays the same for existing tags while we allow the other use case for those who want it. |
@yosifkit I hadn't considered the compose use case.
@bflad In general I don't think we want more files to maintain (though it's simple enough), but when/if we get a template system in place (discussed in issue #289) this might be an option. |
@yosifkit I agree with you regarding an "UNVOLUME" command, however I don't see Docker implementing that anytime in the near future. Until that occurs we're basically stuck telling educated Docker users that they need to go copy the Dockerfile from the MySQL image that they want and create their own image with the VOLUME line commented out or deleted. That either prevents the user from automatically receiving potential security updates or means writing a script to automate the process (which makes me uneasy, but I have seriously considered it...).
I'm a heavy user of Compose (doing a lot of local dev with Docker) and would have been perfectly fine seeing the documentation on Docker Hub stating that I need to define a volume in my run command or docker-compose service. I know you stated that many users skip over the Docker Hub documentation, but the image is already relatively useless if you don't scroll down to read the section regarding environment variables. The Compose/Stack documentation appears before that section, and it could certainly include a sample volume definition with a comment above it, something like:
# Use root/example as user/password credentials
version: '3.1'
services:
  db:
    image: mysql
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: example
    # Use a volume to support persistent storage on container restart.
    volumes:
      - data-volume:/var/lib/mysql
  adminer:
    image: adminer
    restart: always
    ports:
      - 8080:8080
volumes:
  data-volume:
I'd be happy to write a suggestion for the "Where to Store Data" section as well, if that's a hangup. |
If someone wants to work on that, it may be implemented; see moby/moby#3465 (comment) and moby/moby#3465 (comment). Nobody has offered to work on it so far, though. |
The request for docker to support an "UNSET" feature is there only to help people to cope with bad images. About this sentence:
it is completely wrong. In every company and project I have ever worked, when it is desired to persist data between docker restarts, either you don't delete the container or you explicitly mount a volume. I have never seen someone relying on (or being in love with) anonymous volumes in the real world. If you don't want to break the (frustrating) behavior of this image, you should really adopt another tag and offer both the alternatives. |
I would second the request to remove the volume definition from the Dockerfile. As a user of docker-compose I see the build-time and runtime configuration nicely separated: the docker-compose.yml refers to the build environment (including the Dockerfile) for the underlying image, and it allows defining the runtime configuration of the actual container (including volumes and ports). |
Just hit this issue as well. It took several hours of a junior dev's time before we found the underlying cause, as we didn't think this would be included by default, and it was a big surprise. Having a |
Addresses issues like docker-library#255 and docker-library#214
I'm confused about this part:
How is docker-compose relevant to the problem? Could someone give me an example use case that would be affected by the removal of the VOLUME instruction? |
$ cat Dockerfile
FROM bash
VOLUME /foo
$ cat docker-compose.yml
version: '3.8'
services:
  bash:
    build: .
    tty: true
$ docker-compose build
...
$ docker-compose up -d
Starting tmp_bash_1 ... done
$ docker-compose exec bash touch /foo/bar
$ docker-compose exec bash ls /foo
bar
$ docker-compose up -d --force-recreate
Recreating tmp_bash_1 ...
$ docker-compose exec bash ls /foo
bar
(Docker Compose works extra hard to keep even anonymous unspecified volumes around and attached to the appropriate container.) |
@tianon Couldn't you just add an explicit anonymous volume to the docker command or docker-compose file, like
volumes:
  - /foo
The above has the benefit of defining the volume explicitly, so as not to catch people by surprise with lots of anonymous volumes that shouldn't have been created, and of not reusing persisted data that should not have been persisted in the first place (and also of not needing hacks like defining the mysql data directory in another place, which doesn't stop the volume creation anyway). Furthermore, as @ufoscout said:
Last year Oracle removed the volume from their official image, and I'm not aware of it impacting people negatively (although the mysql image is probably more widely used): |
Hi.
My inquiry may seem strange, but I really don't get it.
Why does the MySQL Dockerfile contain a VOLUME directive? From my perspective, there are more cons than pros:
I know I'll cause a huge 'watch it, kid!' next second, but shouldn't it be dropped? I really don't see any huge benefits over the drawbacks it brings in.