Besides how great it is to be able to just pull down a Docker image, there are actually some more advanced things you can do when it comes to manipulating an image. The following points will give you a better understanding of how to work with, create, and modify images for your own projects :)
The two ways to get an image…
1. A registry. A Docker registry (e.g. registry.hub.docker.com) lets you easily pull an entire image locally, either to use as the basis for other images or just to start a container. This is the simplest way to get up and running quickly. If you are going to be doing a lot with images, especially creating your own, it’s a good idea to consider running an internal registry (just search for “docker-registry” containers to help you get started). There are also hosted services like Quay.io that host private repositories for you :)
2. Build files/bundle. A build bundle is merely a tarball or repo of everything needed to build an image. It can contain just a Dockerfile, or also extra source code to be built, run scripts, or anything else the image needs for its particular purpose. When you pull down a build repo, you simply enter it and run:
docker build -t myimage .
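To make the internal registry idea from point 1 a bit more concrete, here is a minimal sketch of pushing the freshly built myimage to a registry you run yourself. The public "registry" image, the container name my-registry, and the localhost:5000 address are assumptions for illustration, not something set up in this post. The run helper is also just a convenience of this sketch: it prints each command and only executes it when DRY_RUN=0.

```shell
#!/bin/sh
# Hypothetical address of an internal registry; adjust to your setup.
REGISTRY="${REGISTRY:-localhost:5000}"

# Print each command; execute it only when DRY_RUN=0 (default is a dry run).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

push_to_internal_registry() {
    image="$1"
    # Start a registry container (assumes the public "registry" image).
    run docker run -d -p 5000:5000 --name my-registry registry
    # Tag the local image with the registry address so docker knows where to push it.
    run docker tag "$image" "$REGISTRY/$image"
    run docker push "$REGISTRY/$image"
    # Anyone who can reach the registry can now pull the image from it.
    run docker pull "$REGISTRY/$image"
}

push_to_internal_registry myimage
```

Run as-is, this only prints the four docker commands, which makes it easy to check them before running for real with DRY_RUN=0.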
Creating an image from an existing image
Most people, at some point in time, will want to take an existing image and build their own based on it. There are a couple of ways to do this:
1. Create your own build directory and start its Dockerfile with a “FROM user/imagerepo” statement.
2. Start a container, make the desired changes, and commit those changes to an image.
#1 is fairly straightforward. When you build your image, it will inherit the parent image’s layers and then run through the rest of your Dockerfile. #2, however, is a bit more tricky. Basically, you start a container based off an image, let’s say ubuntu:latest, make your desired modifications in that container, exit, and then commit those changes, creating a new image!
ubuntu -> Container (make changes) -> save changes to new image, myimage
What that looks like is this:
$ sudo docker run -t -i tatum/gentoo-stage3 /bin/bash
root@0b2616b0e5a8:/# echo "iptables-restore < /etc/iptables.conf" > /etc/rc.local
root@0b2616b0e5a8:/# exit
$ sudo docker commit -m="load iptables at boot" -a="Jon Doe" 0b2616b0e5a8 jdoe/gentoo-stage3:v2
See, it’s actually really simple. Notice that the ID we use for the commit is just the ID of the container we made changes in :)
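For comparison, the same change could also be made with the first approach (a Dockerfile that starts with FROM) instead of a commit. This is only a sketch: the temporary build directory is arbitrary, the image names are simply the ones from the example above, and the run helper is my own convenience that prints the build command unless DRY_RUN=0.

```shell
#!/bin/sh
# Print each command; execute it only when DRY_RUN=0 (default is a dry run).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Write a Dockerfile that layers one change on top of the parent image.
write_dockerfile() {
    dir="$1"
    cat > "$dir/Dockerfile" <<'EOF'
FROM tatum/gentoo-stage3
RUN echo "iptables-restore < /etc/iptables.conf" > /etc/rc.local
EOF
}

# Throwaway build directory holding nothing but the Dockerfile.
build_dir="$(mktemp -d)"
write_dockerfile "$build_dir"
run docker build -t jdoe/gentoo-stage3:v2 "$build_dir"
```

The upside over a commit is that the Dockerfile records how the image was made, so the image can be rebuilt later.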
Whenever you see something like “ubuntu:14.04”, the part after the colon is merely a tag marking a particular commit of that image, ubuntu. In our example above, we use :v2 to indicate that the image is not the same as the original one.
That’s really all it is! Just like tags in repositories, an image tag merely marks a certain commit to make it stand out for whatever reason (version, special feature, etc.).
Tags in no way define how images are built off of one another, or anything along those lines.
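Since a tag is just a label on an image, one image can carry several tags at once. Here is a small sketch of that, reusing the jdoe/gentoo-stage3:v2 image from above. The extra tag name "stable" is made up for illustration, and the run helper only prints the commands unless DRY_RUN=0.

```shell
#!/bin/sh
# Print each command; execute it only when DRY_RUN=0 (default is a dry run).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

add_tag() {
    image="$1"    # existing image reference, assumed to include a tag (name:tag)
    new_tag="$2"  # additional label for the same image
    repo="${image%:*}"  # strip the trailing ":tag" to get the repository name
    # Point a second tag at the same image; nothing is copied or rebuilt.
    run docker tag "$image" "$repo:$new_tag"
    # Both tags should now be listed with the same IMAGE ID.
    run docker images "$repo"
}

add_tag jdoe/gentoo-stage3:v2 stable
```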
Building images off of images: Understanding “base”
I recently got into a discussion with co-workers about the progression of our containers and how to keep things fresh.
Let’s say we have an image, “docker-gentoo”, which is just a basic Gentoo install. The following lists images, each built from the one above it, that we create to help users start from a certain point, for whatever purpose:
docker-gentoo-ssh (FROM docker-gentoo)
docker-gentoo-cron (FROM docker-gentoo-ssh)
docker-gentoo-haproxy (FROM docker-gentoo-cron)
So this is all well and good. Now let’s ask some questions:
Q1. What happens if we rebuild the docker-gentoo image with a fresh stage3 tarball and push it up to the registry? When docker-gentoo-cron is pulled down from the registry, does it automatically inherit those changes?
Q2. What if I remove docker-gentoo-cron from the registry? Will docker-gentoo-haproxy be broken when a user goes to pull it down?
Here are the answers, thanks to some cool dudes over @ #docker on freenode:
A1. When you push that original docker-gentoo image to the registry, the ID of the topmost layer is saved. So all subsequent images using docker-gentoo (e.g. docker-gentoo-ssh) start from that ID (say, 0x9) and continue on when they are pushed to the registry. That means that when you “docker push docker-gentoo-ssh” to the registry, a network request is made for each layer of the image to compare it with what already exists in the registry. That way the registry figures out that it only needs to store the new layers, since it already has the original ones (docker-gentoo).
Now you make your changes to docker-gentoo, push them (new ID 0x12), and then another user pulls down docker-gentoo-ssh shortly thereafter. This other user will not get an “updated” image, since the registry remembers that original ID (0x9, from when the image was initially pushed) and serves the original layers that make up docker-gentoo-ssh. The only way to “update” docker-gentoo-ssh so that it inherits the new docker-gentoo is to rebuild it and then re-push it to the registry. This also means that you would work your way up the tree (-ssh, -cron, -haproxy), rebuilding each image in turn and re-pushing it to the registry, in order to update all of them.
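Working your way up the tree like that can be scripted. The following is a rough sketch only: it assumes each image has a build directory named after it under the current directory (a hypothetical layout, not something described above), and the run helper prints the commands instead of executing them unless DRY_RUN=0.

```shell
#!/bin/sh
# Print each command; execute it only when DRY_RUN=0 (default is a dry run).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Images listed parent first, so every FROM finds a freshly built parent locally.
CHAIN="docker-gentoo docker-gentoo-ssh docker-gentoo-cron docker-gentoo-haproxy"

rebuild_chain() {
    for image in $CHAIN; do
        # Assumed layout: ./docker-gentoo/Dockerfile, ./docker-gentoo-ssh/Dockerfile, ...
        # Stop at the first failure so children are never built on a stale parent.
        run docker build -t "$image" "./$image" || return 1
        run docker push "$image" || return 1
    done
}

rebuild_chain
```

The order of CHAIN is the whole point here: building a child before its parent would just bake the old parent layers in again.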
A2. No, when you remove docker-gentoo-cron from the registry, it will not remove its layers since docker-gentoo-haproxy still relies on them.
Hopefully you understand a bit better how images can be tamed :) Below is just an excerpt from my IRC convo:
12:51 Sinjek : if ubuntu updates its /bin/bash due to shellshock, every single thing that said FROM ubuntu has to be rebuild. They're all vulnerable.
12:51 InAnimaT : but how does it know if you're just doing FROM ubuntu?
12:52 Sinjek : Because people don't download dockerfiles.
12:52 InAnimaT : does it keep track of the ID from which your image derived from
12:52 Sinjek : Yup
12:52 Sinjek : since each layer is a filesystem diff it has to do it that way; it's not smart enough to merge your layer onto a new layer
12:53 Sinjek : Docker automated builds / trusted builds fix that a little
12:53 Sinjek : there, you give dockerhub your Dockerfile and it rebuilds it when your FROM changes for you.