
Docker and the storage of its image data long term RFC #288

Open
tatarsky opened this issue Jul 14, 2015 · 25 comments

@tatarsky
Contributor

(RFC=Request for Comment)

I am trying to get a clear understanding of how Docker should best store images long term in what it calls its "root area", which is currently set to /scratch/docker rather than the default of /var/lib/docker (a smaller area on each node).

I would also like to pin down Docker's space and usage requirements. We currently run it in a sort of "open season test mode", and have for a while.

That root area consists of more than just images: I believe it is also where much of the container state is kept, so I don't think you can simply point the entire region at a shared GPFS directory. At the least I would have to experiment with that, and I can already see there would be collisions if I did.

My brief reading points to running a local registry when a large collection of images and their local modifications is required, with the registry's backend storage on some shared media. This requires running a daemon, and there appear to be two projects with implementations.

All of this would be considered an enhancement, but I said I would look into it, so here is the start of my attempt to understand the requirements to move forward.
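For reference, a minimal sketch of what a local registry with shared backend storage might look like. The GPFS path, container name, and registry image tag below are illustrative assumptions, not an existing setup:

```shell
# Hedged sketch: run a local registry container whose image store lives on
# shared GPFS instead of each node's /scratch/docker.
# /gpfs/shared/docker-registry is a made-up path for illustration.
command -v docker >/dev/null || exit 0   # skip on hosts without docker
docker run -d --name local-registry -p 5000:5000 \
    -v /gpfs/shared/docker-registry:/tmp/registry \
    registry

# Nodes could then push and pull against the shared registry
# instead of Docker Hub:
docker tag ubuntu:latest localhost:5000/ubuntu
docker push localhost:5000/ubuntu
```

The idea would be that the registry becomes the shared image store, while each node's /scratch/docker holds only the working copies and container state.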

@tatarsky
Contributor Author

I am going to add, for reference, a few of the items I look at on these topics as I go:
http://odewahn.github.io/docker-jumpstart/
https://github.com/docker/docker-registry

@jchodera
Member

jchodera commented Aug 9, 2015

These are all good questions. I don't have any good answers yet. For now, I've been maintaining scripts to sweep the nodes of old images and containers corresponding to our Docker image (jchodera/fah-client), but it would be much easier if there were a concept of user ownership of images and containers.

@lzamparo

lzamparo commented Oct 7, 2015

@jchodera: do your scripts allow the jchodera/fah-client image to be deployed in the image cache on all nodes? If so, would you mind sharing?

Currently, my own Docker jobs often involve a first (very annoying) step of pulling an image from Docker Hub. On some nodes where I've done interactive work, the image is already in the root area and visible via docker images. This inconsistency adds another layer of complexity when scripting, and is most undesirable.

@tatarsky maybe collisions could be avoided by having users own their own Docker images, committing them under their own HAL IDs? Or do I not understand the entirety of creating a common root area?

@tatarsky
Contributor Author

tatarsky commented Oct 7, 2015

This is, as I note, why I am requesting comments. I do not have Docker set up in a way that uses their own HAL IDs, as I believe that requires sudo (as opposed to membership in the docker group). If you know otherwise, please advise.

@tatarsky
Contributor Author

tatarsky commented Oct 7, 2015

BTW, I believe what @jchodera is doing is registering his image as a variant on Docker Hub, thus preventing collisions because he has made it a unique image... unless I misunderstand.

@lzamparo

lzamparo commented Oct 7, 2015

I think as long as users commit their changes and tag the revision, there should be no collision problem. FYI, I'm also registering my images as variants on Docker Hub; this prevents collisions with other users.

There does seem to be detritus building up in /scratch/docker on different nodes. For instance, I'm working now on gpu-2-4 and see:

[zamparol@gpu-2-4 fit_hic]$ docker images
REPOSITORY                                     TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
jchodera/docker-core-linux-build-environment   latest              8239a3e4f625        5 weeks ago         3.867 GB
ubuntu                                         fbcunn              212c4aafa747        7 weeks ago         9.808 GB
ubuntu                                         fbcunn_final        ab910b556361        7 weeks ago         6.293 GB
jchodera/docker-fah-client                     latest              a2f867fc1c4b        8 weeks ago         2.079 GB
<none>                                         <none>              6b06d5cf3d45        8 weeks ago         2.079 GB
ubuntu                                         latest              13b176913597        10 weeks ago        197.8 MB
kaixhin/cuda-torch                             latest              713c712e4a87        3 months ago        3.154 GB
centos                                         latest              ae0c2d0bdc10        11 months ago       224 MB

On other nodes where I've done previous docker work, I've got docker images available locally to be spun up.

@tatarsky
Contributor Author

tatarsky commented Oct 7, 2015

Yep. Part of this RFC (which has been pretty "C", or comment, light) is also "so, when should those images be deleted?" Weeks? Months? I am seldom going to make such decisions without user input and consensus, so feel free to propose a retention period.

Currently I am only looking for stray Docker containers still running due to not being scheduled with the proper exit conditions. I use docker ps, and I check carefully before deciding to docker kill.
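That manual check can be sketched as a small script. Listing containers with their start times (via docker inspect's State.StartedAt field) is my assumption about how one might flag candidates for human review, not an existing tool:

```shell
#!/bin/sh
# Hedged sketch: list running containers with their start times so a human
# can decide what is a stray before any `docker kill`. Nothing is killed here.
command -v docker >/dev/null || exit 0   # skip on hosts without docker
for c in $(docker ps -q); do
    started=$(docker inspect --format '{{.State.StartedAt}} {{.Name}}' "$c")
    echo "$c  $started"
done
```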

@lzamparo

lzamparo commented Oct 7, 2015

I just ran a test to see if I could make a change to the ubuntu:fbcunn_final image:

[zamparol@gpu-2-4 fit_hic]$ docker images
REPOSITORY                                     TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
jchodera/docker-core-linux-build-environment   latest              8239a3e4f625        5 weeks ago         3.867 GB
ubuntu                                         fbcunn              212c4aafa747        7 weeks ago         9.808 GB
ubuntu                                         fbcunn_final        ab910b556361        7 weeks ago         6.293 GB
jchodera/docker-fah-client                     latest              a2f867fc1c4b        8 weeks ago         2.079 GB
<none>                                         <none>              6b06d5cf3d45        8 weeks ago         2.079 GB
ubuntu                                         latest              13b176913597        10 weeks ago        197.8 MB
kaixhin/cuda-torch                             latest              713c712e4a87        3 months ago        3.154 GB
centos                                         latest              ae0c2d0bdc10        11 months ago       224 MB
[zamparol@gpu-2-4 fit_hic]$ docker run -t -i ubuntu:fbcunn_final /bin/bash
root@9e2967938dd7:/# ls
bin  boot  dev  etc  home  initrd.img  lib  lib32  lib64  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var  vmlinuz
root@9e2967938dd7:/# cd home
root@9e2967938dd7:/home# mkdir test_lee_woo
root@9e2967938dd7:/home# cd test_lee_woo/
root@9e2967938dd7:/home/test_lee_woo# vim test.txt
root@9e2967938dd7:/home/test_lee_woo# exit
[zamparol@gpu-2-4 fit_hic]$ docker commit -m "test alteration ubuntu fbcunn_final minor alteration, changing the version tag." -a "Lee Zamparo" 9e2967938dd7 ubuntu:fbcunn_lee
b2e8fcaf0656d489f8274c1a93154895f37a430664b42c736f67bb69dacc4dd5
[zamparol@gpu-2-4 fit_hic]$ docker images
REPOSITORY                                     TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
ubuntu                                         fbcunn_lee          b2e8fcaf0656        2 minutes ago       6.293 GB
jchodera/docker-core-linux-build-environment   latest              8239a3e4f625        5 weeks ago         3.867 GB
ubuntu                                         fbcunn              212c4aafa747        7 weeks ago         9.808 GB
ubuntu                                         fbcunn_final        ab910b556361        7 weeks ago         6.293 GB
jchodera/docker-fah-client                     latest              a2f867fc1c4b        8 weeks ago         2.079 GB
<none>                                         <none>              6b06d5cf3d45        8 weeks ago         2.079 GB
ubuntu                                         latest              13b176913597        10 weeks ago        197.8 MB
kaixhin/cuda-torch                             latest              713c712e4a87        3 months ago        3.154 GB
centos                                         latest              ae0c2d0bdc10        11 months ago       224 MB

So my new Docker image ubuntu:fbcunn_lee is now available in the root area, ready to be spun up. Maybe the Docker policy could be to either:

  1. Use tags to version your own images, or
  2. Explicitly rename and commit them under a Docker Hub ID

I think (2) would be safer in preventing people from clobbering each other's images.
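Option (2) might look like this in practice. The zamparol/fbcunn repository name below is illustrative, not a real Docker Hub account:

```shell
# Hedged sketch: re-tag a locally committed image under a Docker Hub ID,
# then push it, so the image name itself prevents clobbering.
# zamparol/fbcunn:lee is a made-up example repository and tag.
docker tag ubuntu:fbcunn_lee zamparol/fbcunn:lee
docker login                    # authenticate with Docker Hub credentials
docker push zamparol/fbcunn:lee
```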

@tatarsky
Contributor Author

tatarsky commented Oct 7, 2015

I'm fine with that. I am only aware of a small number of people using Docker on a regular basis; if they wish to disagree, this is the place! Enforcing it may require some additional steps, but I think this makes the most sense.

@lzamparo

lzamparo commented Oct 7, 2015

Well, how much space in /scratch are idle or discarded images using?

If it's a significant fraction, I'd vote for deleting from the cache on the order of weeks, especially if people are committing their images to Docker Hub, where they can easily be pulled again. This should be pretty simple to set up, given that you can use GitHub credentials to get a Docker Hub account, IIRC.
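Measuring that usage is straightforward with the same per-node ssh pattern used elsewhere in this thread. A sketch, assuming passwordless ssh and a nodelist file of node names:

```shell
#!/bin/sh
# Hedged sketch: report per-node disk usage of docker's root area.
# Assumes a "nodelist" file (one node name per line) and passwordless ssh.
[ -f nodelist ] || exit 0
for node in $(cat nodelist); do
    printf '%s: ' "$node"
    ssh "$node" "du -sh /scratch/docker 2>/dev/null" </dev/null
done
```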

@lzamparo

lzamparo commented Oct 7, 2015

But we should probably have Chodera lab and Raetsch lab people weigh in.

@tatarsky
Contributor Author

tatarsky commented Oct 7, 2015

Concur. I'll take a look at the space consumed; in general the /scratch drives are quite empty.

@lzamparo

lzamparo commented Oct 7, 2015

@akahles, @karalets, @jchodera : any preference?

@jchodera
Member

jchodera commented Oct 9, 2015

@jchodera: do your scripts allow the jchodera/fah-client image to be deployed in the image cache on all nodes? If so, would you mind sharing?

I used a very simple set of scripts to wipe and deploy docker images to all nodes:

#!/bin/tcsh
foreach node ( `cat nodelist` )
  echo $node
  ssh $node "docker rmi jchodera/docker-fah-client"
end

and

#!/bin/tcsh
foreach node ( `cat nodelist` )
  echo $node
  ssh $node "docker pull jchodera/docker-fah-client"
end

where nodelist is this file:

gpu-1-4
gpu-1-5
gpu-1-6
gpu-1-7
gpu-1-8
gpu-1-9
gpu-1-10
gpu-1-11
gpu-1-12
gpu-1-13
gpu-1-14
gpu-1-15
gpu-1-16
gpu-1-17
gpu-2-4
gpu-2-5
gpu-2-6
gpu-2-7
gpu-2-8
gpu-2-9
gpu-2-10
gpu-2-11
gpu-2-12
gpu-2-13
gpu-2-14
gpu-2-15
gpu-2-16
gpu-2-17
gpu-3-8
gpu-3-9

Not very pretty, but this got the job done.

@jchodera
Member

jchodera commented Oct 9, 2015

BTW, I believe what @jchodera is doing is registering his image as a variant at the docker hub. Thus preventing collision because he's made it a unique image...unless I misunderstand.

I am using Docker Hub to automatically build and host Docker images. It's great. You just point it at a GitHub repo containing your Dockerfile, as with this project:
https://hub.docker.com/r/jchodera/docker-fah-client/

@jchodera
Member

jchodera commented Oct 9, 2015

Well, how much space in /scratch are idle or discarded images using?

If it's a significant fraction, I'd vote for deleting from the cache on the order of weeks, especially if people are committing their images to docker hub, where they can be pulled easily. This should be pretty simple to set up, given that you can use github credentials to get a docker hub account IIRC.

This sounds great, though we want to make sure deleting a Docker image that is in use doesn't cause problems for currently running containers. Our management plan should exploit the fact that Docker containers are meant to be "short-running" jobs, run only through the queue, with an execution time limit of a few days before destruction.
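One way to guard against that, sketched under the assumption that comparing the image ID against docker inspect's {{.Image}} field for each running container is a sufficient in-use check (the image name is the one from this thread):

```shell
#!/bin/sh
# Hedged sketch: refuse to remove an image while any running container uses it.
command -v docker >/dev/null || exit 0   # skip on hosts without docker
IMAGE=jchodera/docker-fah-client
IMAGE_ID=$(docker images -q "$IMAGE" | head -n1)
[ -n "$IMAGE_ID" ] || exit 0             # image not present on this node
for c in $(docker ps -q); do
    # {{.Image}} is the full image ID; match the short ID as a prefix
    if docker inspect --format '{{.Image}}' "$c" | grep -q "$IMAGE_ID"; then
        echo "$IMAGE is in use by container $c; not removing"
        exit 0
    fi
done
docker rmi "$IMAGE"
```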

@tatarsky
Contributor Author

tatarsky commented Oct 9, 2015

Also note, from a disk-space point of view, the space currently consumed by /scratch/docker is noise. None of those directories exceeds 100 GB locally; one node has 93 GB. So discussion of cleaning methods is not time-critical. Most node-local drives are very lightly loaded.

@akahles

akahles commented Oct 9, 2015

@akahles, @karalets, @jchodera : any preference?

No preference from my side. If space does not get tight on /scratch, I am fine with widely spaced cleanup intervals (e.g., every two months). Ideally only unused images that are also available on Docker Hub would be removed, but that might be difficult to check.

@tatarsky
Contributor Author

So, basically, having monitored this, we could proceed with:

  1. Warn Docker users to push specific versions of their images to Docker Hub before some date
  2. Delete Docker images older than, say, four weeks, after validating that no running containers use them
  3. Possibly warn folks that an image without a HAL or Git handle in its name will be deleted more aggressively (not 100% sure how I would enforce this)
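Step 2 could be driven off the CREATED column of `docker images`. A sketch of the age filter, demonstrated here on canned rows like the gpu-2-4 listing above (on a live node you would pipe `docker images` in and feed the IDs to `docker rmi` after the in-use check):

```shell
#!/bin/sh
# Hedged sketch: print IDs of images older than four weeks, judged from the
# CREATED column of `docker images` output. Demonstrated on sample rows;
# fields: 1=repo 2=tag 3=image-id 4=age-count 5=age-unit 6="ago" 7+=size.
sample='REPOSITORY TAG IMAGE_ID CREATED SIZE
jchodera/docker-core-linux-build-environment latest 8239a3e4f625 5 weeks ago 3.867 GB
jchodera/docker-fah-client latest a2f867fc1c4b 8 weeks ago 2.079 GB
ubuntu latest 13b176913597 3 weeks ago 197.8 MB
kaixhin/cuda-torch latest 713c712e4a87 3 months ago 3.154 GB'

echo "$sample" | awk 'NR>1 && (($5=="weeks" && $4+0>4) || $5=="months" || $5=="years") {print $3}'
# Live use: docker images | awk '...' | xargs -r docker rmi
```

The 3-week-old ubuntu image is kept; the others (older than four weeks) are printed for removal.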

@lzamparo

I'm fine with this. Maybe, to help the check-in process along, we could link to instructions for how to sign up for Docker Hub and upload images (or Dockerfiles)?

@tatarsky
Contributor Author

My suggestion is a section in the wiki with said instructions, and then point people at the wiki. It's more persistent.

@tatarsky
Contributor Author

@lzamparo misc item to assist with another cleaning-script concept: can you double-confirm that this Docker container on gpu-1-11 is an orphan? My script believes it is, but I do not auto-kill yet.

docker ps
CONTAINER ID
c8a46359b580       (uid removed, but it's related to you)

@lzamparo

@tatarsky I didn't think I had any running jobs, so this job is an orphan. I attached to it and killed it myself.

@tatarsky
Contributor Author

Thanks. The script agreed, but I like to check.

@tatarsky
Contributor Author

This remains a desired feature for the Docker cleaning script, which I may get some cycles to work on shortly.
