The relationship between Docker images and layers

Docker supports a variety of graph drivers (storage drivers), including vfs, devicemapper, overlay, overlay2, and aufs. aufs used to be the most common choice, but since overlay was merged into Linux kernel 3.18, the overlay drivers have become increasingly important. Currently, Docker's default storage driver is overlay2; the Docker version used here is 1.8.

Docker's default storage directory is /var/lib/docker. Let's list this directory:

[root@docker2 ~]# ll /var/lib/docker
total 24
drwx------.   2 root root    24 May 15  2019 builder
drwx------.   4 root root    92 May 15  2019 buildkit
drwx------.   3 root root    78 Mar  8 11:14 containers
drwx------.   3 root root    22 May 15  2019 image
drwxr-x---.   3 root root    19 May 15  2019 network
drwx------. 165 root root 16384 Mar  8 11:14 overlay2
drwx------.   4 root root    32 May 15  2019 plugins
drwx------    2 root root     6 Mar  8 11:10 runtimes
drwx------.   4 root root    83 Mar  8 11:10 swarm
drwx------    2 root root     6 Mar  8 11:10 tmp
drwx------.   2 root root     6 May 15  2019 trust
drwx------.  21 root root  4096 Aug 11  2019 volumes

Only image and overlay2 concern us here. The image directory mainly stores metadata: the metadata of the layers and the details of each layer in an image.
Before starting the experiment we need a running container; nginx is used here:

[root@docker2 ~]# docker ps 
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                NAMES
88984d1d86a9        nginx               "nginx -g 'daemon of..."   45 hours ago        Up 5 hours>80/tcp   nginx              "nginx -g 'daemon of..."   4 seconds ago       Up 3 seconds        80/tcp              practical_vaughan

The ID of the newly launched nginx container is 88984d1d86a9. Let's continue.

As mentioned above, we only need to care about /var/lib/docker/image and /var/lib/docker/overlay2. Let's look at /var/lib/docker/image first:

[root@docker2 ~]# cd /var/lib/docker/image/
[root@docker2 image]# ll
total 0
drwx------. 5 root root 81 Mar  6 19:29 overlay2

Only the overlay2 directory is there. Under /var/lib/docker/image, Docker creates one directory for each storage driver in use, here overlay2.
Next, browse that directory with the tree command:

[root@docker2 image]# tree -L 2 overlay2/
├── distribution
│   ├── diffid-by-digest
│   └── v2metadata-by-diffid
├── imagedb
│   ├── content
│   └── metadata
├── layerdb
│   ├── mounts
│   ├── sha256
│   └── tmp
└── repositories.json

The key entries here are the imagedb and layerdb directories. As the names suggest, they store the metadata of images and of layers respectively. Why distinguish between image and layer? Because in Docker an image is composed of multiple layers, and a layer can be shared: several images may point to the same layer.
How do we find out which layers an image contains? The answer is in the imagedb directory. For the nginx container launched above, we first look up its image:

[root@docker2 image]# docker images nginx
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
nginx               latest              6678c7c2e56c        3 days ago          127MB
nginx               1.13.7-alpine       22f5726c6dc0        2 years ago         15.5MB

As you can see, the image ID is 6678c7c2e56c. Keep this ID in mind. Now list the directory /var/lib/docker/image/overlay2/imagedb/content/sha256:

[root@docker2 sha256]# ll  |grep 6678c7c2e56c
-rw-------  1 root root  6666 Mar  6 19:29 6678c7c2e56c970388f8d5a398aa30f2ab60e85f20165e101053c3d3a11e6663

The file in the first line, 6678c7c2e56c970388f8d5a398aa30f2ab60e85f20165e101053c3d3a11e6663, records the metadata of our nginx image. Printing it with cat yields a long JSON document:

[root@docker2 sha256]# cat 6678c7c2e56c970388f8d5a398aa30f2ab60e85f20165e101053c3d3a11e6663  |python  -mjson.tool
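. . . . . . 
    "rootfs": {
        "diff_ids": [
            "sha256:f2cb0ecef392f2a630fa1205b874ab2e2aedf96de04d0b8838e4e728e28142da",
            "sha256:71f2244bc14dacf7f73128b4b89b1318f41a9421dffc008c2ba91bb6dc2716f1",
            . . . . . . 
        ],
        "type": "layers"
    }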

Because of its length, only the most important part, rootfs, is shown. The diff_ids field of rootfs is an array of three elements, and these are the IDs of the three layers that make up the nginx image. They are listed from the lowest layer upward, so f2cb0ecef392f2a630fa1205b874ab2e2aedf96de04d0b8838e4e728e28142da is the lowest layer of the image. Now that we have all the layer IDs that make up this image, we can use them to find the corresponding layers.
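The lookup just described can be sketched in Python. This is only a minimal sketch: the function name layer_diff_ids is made up for illustration, and the sample config is a hypothetical, shortened stand-in that holds just the two diff IDs quoted in this article (the real file lists three).

```python
import json


def layer_diff_ids(config_text):
    """Return the diff IDs listed in an image config, lowest layer first."""
    config = json.loads(config_text)
    rootfs = config["rootfs"]
    # The config shown in this article uses the "layers" rootfs type.
    if rootfs.get("type") != "layers":
        raise ValueError("unexpected rootfs type: %r" % rootfs.get("type"))
    return list(rootfs["diff_ids"])


# Hypothetical, shortened config: only the two diff IDs quoted in the article.
sample = json.dumps({
    "rootfs": {
        "type": "layers",
        "diff_ids": [
            "sha256:f2cb0ecef392f2a630fa1205b874ab2e2aedf96de04d0b8838e4e728e28142da",
            "sha256:71f2244bc14dacf7f73128b4b89b1318f41a9421dffc008c2ba91bb6dc2716f1",
        ],
    }
})

for index, diff_id in enumerate(layer_diff_ids(sample)):
    print(index, diff_id)
```

On a real host, the text would come from the file under /var/lib/docker/image/overlay2/imagedb/content/sha256, which is readable only by root.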
Next, go back up to the layerdb directory and list it first:

[root@docker2 layerdb]# ll
total 20
drwxr-xr-x.   3 root root    78 Mar  8 11:14 mounts
drwxr-xr-x. 162 root root 16384 Mar  6 19:29 sha256
drwxr-xr-x.   2 root root     6 Mar  6 19:29 tmp

Here we only care about two directories, mounts and sha256. First, search the sha256 directory:

[root@docker2 layerdb]# ll sha256/ |grep f2cb0ecef392f2a630fa1205b874ab2e2aedf96de04d0b8838e4e728e28142da
drwx------  2 root root 71 Mar  6 19:27 f2cb0ecef392f2a630fa1205b874ab2e2aedf96de04d0b8838e4e728e28142da

Here we only find f2cb0ecef392f2a630fa1205b874ab2e2aedf96de04d0b8838e4e728e28142da, which is the lowest layer. Why are the remaining two layers absent? Because Docker names these directories by chainID rather than by diff ID. For the lowest layer the chainID equals its diff ID; for every other layer, chainID = sha256sum(parent chainID + " " + diff ID). For the layer directly above f2cb0ecef392f2a630fa1205b874ab2e2aedf96de04d0b8838e4e728e28142da, that gives:

[root@docker2 sha256]#  echo -n "sha256:f2cb0ecef392f2a630fa1205b874ab2e2aedf96de04d0b8838e4e728e28142da sha256:71f2244bc14dacf7f73128b4b89b1318f41a9421dffc008c2ba91bb6dc2716f1" |sha256sum -
1541955a517830d061b79f2b52b1aed297d81c009ce7766a15527979b6e719c4  -

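The result, 1541955a517830d061b79f2b52b1aed297d81c009ce7766a15527979b6e719c4, is the chainID of the layer directly above f2cb0ecef392f2a630fa1205b874ab2e2aedf96de04d0b8838e4e728e28142da, and layerdb/sha256 contains a directory with that name. Repeating the calculation, each time combining the previous chainID with the next diff ID, yields the directory of every remaining layer.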
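The same chaining can be written as a loop. The following is a minimal Python sketch, assuming the diff IDs are supplied lowest layer first, as in diff_ids; the function name chain_ids is made up for illustration. For the two diff IDs used here, the second printed value should match the 1541955a... digest from the sha256sum transcript above, since the hashed string is identical.

```python
import hashlib


def chain_ids(diff_ids):
    """Compute the chain ID of every layer from diff IDs ordered lowest first."""
    result = []
    for diff_id in diff_ids:
        if not result:
            # The lowest layer's chain ID is simply its diff ID.
            result.append(diff_id)
        else:
            # Hash "<parent chain ID> <diff ID>", both with the "sha256:" prefix.
            data = (result[-1] + " " + diff_id).encode("ascii")
            result.append("sha256:" + hashlib.sha256(data).hexdigest())
    return result


diff_ids = [
    "sha256:f2cb0ecef392f2a630fa1205b874ab2e2aedf96de04d0b8838e4e728e28142da",
    "sha256:71f2244bc14dacf7f73128b4b89b1318f41a9421dffc008c2ba91bb6dc2716f1",
]
for chain_id in chain_ids(diff_ids):
    print(chain_id)
```

Passing all three diff IDs of the image would give the names of all three directories under layerdb/sha256.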
But as mentioned above, /var/lib/docker/image/overlay2/layerdb only stores metadata, so where does the real rootfs live? The cache-id file is the key. Let's print /var/lib/docker/image/overlay2/layerdb/sha256/1541955a517830d061b79f2b52b1aed297d81c009ce7766a15527979b6e719c4/cache-id:

[root@docker2 layerdb]# cat sha256/1541955a517830d061b79f2b52b1aed297d81c009ce7766a15527979b6e719c4/cache-id
f77d281af55651a70e5fc8f31de840d5b5461f36d930545db39f01bc839e4097

The cache ID, f77d281af55651a70e5fc8f31de840d5b5461f36d930545db39f01bc839e4097, corresponds to the directory /var/lib/docker/overlay2/f77d281af55651a70e5fc8f31de840d5b5461f36d930545db39f01bc839e4097. In the same way, the rootfs of each higher layer can be found through its cache-id. When the diff directories of all these layers are union-mounted onto a single directory, they form the complete rootfs the container needs.
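The step from chainID to on-disk directory can be sketched as a small Python helper. This is a sketch under the layout described in this article; the function name layer_diff_dir is made up for illustration, and the demonstration builds a throwaway directory tree that mimics the layout, so it runs without Docker or root access.

```python
import tempfile
from pathlib import Path


def layer_diff_dir(docker_root, chain_id):
    """Map a layer chain ID to the overlay2 diff directory holding its files."""
    root = Path(docker_root)
    hex_id = chain_id.split(":", 1)[-1]  # drop an optional "sha256:" prefix
    cache_id_file = (
        root / "image" / "overlay2" / "layerdb" / "sha256" / hex_id / "cache-id"
    )
    cache_id = cache_id_file.read_text().strip()
    return root / "overlay2" / cache_id / "diff"


# Throwaway tree that imitates /var/lib/docker for the layer discussed above.
with tempfile.TemporaryDirectory() as tmp:
    chain = "1541955a517830d061b79f2b52b1aed297d81c009ce7766a15527979b6e719c4"
    meta = Path(tmp) / "image" / "overlay2" / "layerdb" / "sha256" / chain
    meta.mkdir(parents=True)
    (meta / "cache-id").write_text(
        "f77d281af55651a70e5fc8f31de840d5b5461f36d930545db39f01bc839e4097"
    )
    print(layer_diff_dir(tmp, "sha256:" + chain))
```

Against a real installation, the first argument would be /var/lib/docker and the script would need root privileges to read the metadata files.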

[root@docker2 overlay2]# ll f77d281af55651a70e5fc8f31de840d5b5461f36d930545db39f01bc839e4097
total 8
drwxr-xr-x 7 root root 61 Mar  6 19:29 diff
-rw-r--r-- 1 root root 26 Mar  6 19:29 link
-rw-r--r-- 1 root root 28 Mar  6 19:29 lower
drwx------ 2 root root  6 Mar  6 19:29 work


Posted on Sun, 08 Mar 2020 01:17:53 -0800 by ehutchison