Multi-stage builds in a Dockerfile

When developing software with container technology, keeping container images small takes real effort. If the same image serves as both the environment that compiles the software and the environment that finally runs it, its size is hard to control. A common approach is therefore to provide separate images for the build environment and the runtime environment. For example, a Dockerfile.build describes the build environment, and the image built from it contains everything needed to compile the software: source code, SDK, tools, and so on. A separate Dockerfile describes the runtime environment; it takes the compiled software from the image built with Dockerfile.build, and the resulting image contains only what is needed to run the software. This arrangement is called the builder pattern. This article shows how multi-stage builds in a Dockerfile solve the problems that the builder pattern brings.

Common container image building process

For example, we create a Go project and write a program, app.go, that counts the hyperlinks in a page (the code for this article is available from sparkdev):

package main

import (
    "encoding/json"
    "fmt"
    "log"
    "net/http"
    "net/url"
    "os"
    "strings"

    "golang.org/x/net/html"
)

type scrapeDataStore struct {
    Internal int `json:"internal"`
    External int `json:"external"`
}

func isInternal(parsedLink *url.URL, siteUrl *url.URL, link string) bool {
    return parsedLink.Host == siteUrl.Host || strings.Index(link, "#") == 0 || len(parsedLink.Host) == 0
}

func main() {
    urlIn := os.Getenv("url")
    if len(urlIn) == 0 {
        // The URL must be supplied through the "url" environment variable.
        log.Fatalln("Need a valid url as an env-var.")
    }

    siteUrl, parseErr := url.Parse(urlIn)
    if parseErr != nil {
        log.Fatalln(parseErr)
    }
    resp, err := http.Get(urlIn)
    if err != nil {
        log.Fatalln(err)
    }
    // Release the connection once the body has been tokenized.
    defer resp.Body.Close()

    scrapeData := &scrapeDataStore{}
    tokenizer := html.NewTokenizer(resp.Body)
    end := false
    for {
        tt := tokenizer.Next()
        switch {
        case tt == html.StartTagToken:
            // fmt.Println(tt)
            token := tokenizer.Token()
            switch token.Data {
            case "a":

                for _, attr := range token.Attr {

                    if attr.Key == "href" {
                        link := attr.Val

                        parsedLink, parseLinkErr := url.Parse(link)
                        if parseLinkErr != nil {
                            // Report the link that failed to parse, not the tag name.
                            fmt.Println("Can't parse: " + link)
                        } else if isInternal(parsedLink, siteUrl, link) {
                            scrapeData.Internal++
                        } else {
                            scrapeData.External++
                        }
                    }
                }
                break
            }
        case tt == html.ErrorToken:
            end = true
            break
        }
        if end {
            break
        }
    }
    data, _ := json.Marshal(&scrapeData)
    fmt.Println(string(data))
}
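
The classification rule in isInternal can be tried on its own, without fetching a page. The following is a minimal sketch using only the standard library; the classify helper and the sample links are assumptions made for illustration and are not part of app.go, and strings.HasPrefix stands in for the equivalent strings.Index(link, "#") == 0 check.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isInternal mirrors the rule used in app.go: a link is internal when it
// points at the same host as the site, is a fragment ("#..."), or has no
// host at all (a relative link).
func isInternal(parsedLink *url.URL, siteURL *url.URL, link string) bool {
	return parsedLink.Host == siteURL.Host || strings.HasPrefix(link, "#") || len(parsedLink.Host) == 0
}

// classify is a hypothetical helper for this illustration: it parses both
// URLs and reports "internal", "external", or "unparsable".
func classify(site, link string) string {
	siteURL, err := url.Parse(site)
	if err != nil {
		return "unparsable"
	}
	parsedLink, err := url.Parse(link)
	if err != nil {
		return "unparsable"
	}
	if isInternal(parsedLink, siteURL, link) {
		return "internal"
	}
	return "external"
}

func main() {
	site := "https://www.cnblogs.com/"
	links := []string{"/sparkdev/", "#top", "https://www.cnblogs.com/news", "https://github.com/"}
	for _, link := range links {
		fmt.Printf("%-32s %s\n", link, classify(site, link))
	}
}
```

Relative links and fragments have an empty Host after parsing, which is why they are counted as internal.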

Let's build it in a container and deploy it to an image for production.
First, the image that compiles the application:

FROM golang:1.7.3
WORKDIR /go/src/github.com/sparkdevo/href-counter/
RUN go get -d -v golang.org/x/net/html
COPY app.go .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

Save the above content to the Dockerfile.build file.

Then deploy the built application to the image for the production environment:

FROM alpine:latest  
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY app .
CMD ["./app"]  

Save the above content to the Dockerfile file.

Finally, you need to use a script to integrate the entire build process:

#!/bin/sh
echo Building sparkdevo/href-counter:build
# Build the image that compiles the application
docker build --no-cache -t sparkdevo/href-counter:build . -f Dockerfile.build
# Create a container from the build image
docker create --name extract sparkdevo/href-counter:build
# Copy the compiled application out of the container
docker cp extract:/go/src/github.com/sparkdevo/href-counter/app ./app
docker rm -f extract

echo Building sparkdevo/href-counter:latest
# Build an image of the running application
docker build --no-cache -t sparkdevo/href-counter:latest .

Save the above content to the build.sh file. The script builds the compile image, creates a container from it to copy out the compiled application, and then builds the final image that runs the application.
Put app.go, Dockerfile.build, Dockerfile and build.sh in the same directory, then run build.sh from that directory. After the build, check the sizes of the two images (for example with docker images).

The image used to compile the application is close to 700 MB, while the image for the production environment is only 10.3 MB, which makes it much cheaper to transfer over the network.

Run the following command to check whether the container we built works properly:

$ docker run -e url=https://www.cnblogs.com/ sparkdevo/href-counter:latest
$ docker run -e url=http://www.cnblogs.com/sparkdev/ sparkdevo/href-counter:latest

OK, the program correctly counts the hyperlinks on the cnblogs (Blog Park) home page and on the author's home page.

With the build process above, we have to maintain two Dockerfiles and a script file, build.sh. Can this be simplified? Let's look at the solution Docker provides for this situation: multi-stage builds.

Using multi-stage builds in a Dockerfile

Multi-stage builds let us do the work of the earlier build.sh script inside the Dockerfile itself. Each stage can be thought of as building a separate container image, and a later stage can reference the image produced by an earlier stage. So the following single Dockerfile covers the earlier requirements:

FROM golang:1.7.3
WORKDIR /go/src/github.com/sparkdevo/href-counter/
RUN go get -d -v golang.org/x/net/html
COPY app.go    .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=0 /go/src/github.com/sparkdevo/href-counter/app .
CMD ["./app"]

Save the above content to the file Dockerfile.multi. What distinguishes this Dockerfile is that it contains multiple FROM instructions. Each FROM instruction marks the beginning of a stage, and we can COPY the output of one stage into another. The first stage in this example builds the application; its content is the same as the earlier Dockerfile.build. The COPY instruction in the second stage references the first stage through --from=0 (stages are numbered from 0) and copies the application into the current stage. Next, let's build the new image:

$ docker build --no-cache -t sparkdevo/href-counter:multi . -f Dockerfile.multi

This time, run the application using the href-counter:multi image:

$ docker run -e url=https://www.cnblogs.com/ sparkdevo/href-counter:multi
$ docker run -e url=http://www.cnblogs.com/sparkdev/ sparkdevo/href-counter:multi

The result is the same as before. Is there anything special about the newly generated image?

Listing the local images shows that, in addition to the sparkdevo/href-counter:multi image, an anonymous image was also generated. So multi-stage is really just syntactic sugar over multiple Dockerfiles. But it is very attractive syntactic sugar: now we only need to maintain one Dockerfile with a simple structure!

Using named stages

In the example above, we refer to the first stage through --from=0, which makes the Dockerfile harder to read. We can instead give a stage a name and reference it by that name. The following is the modified Dockerfile.multi file:

FROM golang:1.7.3 as builder
WORKDIR /go/src/github.com/sparkdevo/href-counter/
RUN go get -d -v golang.org/x/net/html
COPY app.go    .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /go/src/github.com/sparkdevo/href-counter/app .
CMD ["./app"]

We name the first stage builder using the as syntax, and then reference it with --from=builder in the later stage. Named stages make the Dockerfile easier to read.
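
To make the relationship between stage numbers and stage names concrete, here is a small sketch that scans Dockerfile text and lists one stage per FROM line, with its zero-based index and optional name. The stage type and the listStages function are hypothetical names invented for this illustration; the parsing is deliberately simplified and is not how Docker itself reads a Dockerfile (it ignores flags such as --platform, ARG substitution, and line continuations).

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// stage describes one build stage: its zero-based index, its base image,
// and its optional name (empty when the FROM line has no "as" clause).
type stage struct {
	Index int
	Image string
	Name  string
}

// listStages returns one stage per FROM line found in the Dockerfile text.
// Simplified on purpose: flags, ARG substitution, and line continuations
// are not handled.
func listStages(dockerfile string) []stage {
	var stages []stage
	scanner := bufio.NewScanner(strings.NewReader(dockerfile))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) < 2 || !strings.EqualFold(fields[0], "FROM") {
			continue
		}
		s := stage{Index: len(stages), Image: fields[1]}
		// "FROM image as name": the keyword is matched case-insensitively.
		if len(fields) >= 4 && strings.EqualFold(fields[2], "AS") {
			s.Name = fields[3]
		}
		stages = append(stages, s)
	}
	return stages
}

func main() {
	dockerfile := `FROM golang:1.7.3 as builder
WORKDIR /go/src/github.com/sparkdevo/href-counter/
FROM alpine:latest
COPY --from=builder /go/src/github.com/sparkdevo/href-counter/app .
`
	for _, s := range listStages(dockerfile) {
		fmt.Printf("stage %d: image=%s name=%q\n", s.Index, s.Image, s.Name)
	}
}
```

In this listing the first stage has index 0 and the name builder, so --from=0 and --from=builder both point at it; the second stage has no name and can only be referenced by its index.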

Summary

Although multi-stage builds are just syntactic sugar, they bring a lot of convenience. In particular, they reduce the burden on whoever maintains the Dockerfile (keep in mind that Dockerfiles in real production are not as simple as the one in this demo). Note that older versions of Docker do not support multi-stage builds; only version 17.05 and later do. So, is it time to upgrade your Docker version?
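
If a build script needs to guard against an older Docker, the 17.05 requirement can be checked programmatically. The sketch below assumes the version string has the form major.minor[.patch...], such as the value printed by docker version --format '{{.Server.Version}}'; supportsMultiStage is a hypothetical helper name, and the sample version strings are only illustrative.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// supportsMultiStage reports whether a Docker version string such as
// "17.05.0-ce" or "19.03.5" is at least 17.05. It expects the form
// major.minor[.patch...] and returns an error for anything else.
func supportsMultiStage(version string) (bool, error) {
	parts := strings.SplitN(version, ".", 3)
	if len(parts) < 2 {
		return false, fmt.Errorf("unexpected version format: %q", version)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return false, fmt.Errorf("bad major version in %q: %w", version, err)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return false, fmt.Errorf("bad minor version in %q: %w", version, err)
	}
	return major > 17 || (major == 17 && minor >= 5), nil
}

func main() {
	for _, v := range []string{"1.13.1", "17.03.2-ce", "17.05.0-ce", "19.03.5"} {
		ok, err := supportsMultiStage(v)
		if err != nil {
			fmt.Println(v, "error:", err)
			continue
		}
		fmt.Printf("%s supports multi-stage: %t\n", v, ok)
	}
}
```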


Tags: Docker github JSON Linux

Posted on Tue, 25 Feb 2020 22:22:32 -0800 by Nicksta