Multi-Stage Builds: Minimal Production Docker Images
Learn how multi-stage builds dramatically reduce image sizes by separating build-time and runtime dependencies, resulting in faster deployments and smaller attack surfaces.
Multi-stage builds solve a real problem: before them, you had to choose between build convenience and production image cleanliness. Now you get both.
This tutorial walks through the problem multi-stage builds solve, the mechanics of how they work, and practical examples for common languages and frameworks.
Introduction
Multi-stage builds are not always the right choice. Understanding when they help—and when they add unnecessary complexity—prevents over-engineering.
Use multi-stage builds when:
- Your application requires compilation or transformation steps (build tools, compilers, dependency resolution)
- You want to keep production images minimal and free of build artifacts
- Security and attack surface reduction are priorities
- You deploy frequently and image size affects deployment speed
- You need to run as non-root but build as root
- Your final image runs in constrained environments (Kubernetes with resource limits, edge devices)
Skip multi-stage builds when:
- Your application is a simple script or single file that runs directly with an interpreter
- You use a pre-built image with all dependencies already included
- Build time is not a concern and image size does not matter (internal tools, one-off scripts)
- You are prototyping and simplicity matters more than optimization
For example, a simple Python script that only uses the standard library can run directly from python:alpine. A Node.js API with TypeScript compilation, bundling, and multiple npm dependencies benefits greatly from multi-stage builds.
The Problem with Monolithic Images
Traditional Dockerfiles include everything needed to build and run your application in a single image:
FROM node:20
WORKDIR /app
COPY . .
RUN npm install
RUN npm run build
CMD ["node", "dist/server.js"]
This Dockerfile works. But the resulting image contains the entire Node.js build toolchain, your source code, all npm packages including devDependencies, and the build output. The image might be 1.2GB when your production runtime only needs 150MB.
The problems compound as you iterate:
- Slow deployments: Pushing and pulling large images slows down every release
- Large attack surface: Your production image contains compilers, shell access, build tools
- Security vulnerabilities: The build tools and dependencies may have CVEs that do not affect runtime
- Cache inefficiency: Changing a line of code invalidates the build cache for everything
Multi-Stage Build Anatomy
Multi-stage builds use multiple FROM statements. Each FROM starts a fresh build stage. You copy only what you need from each stage into the final image.
# Stage 1: Build
FROM node:20 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Stage 2: Production
FROM node:20-alpine
WORKDIR /app
# Copy the build output and dependencies (node_modules from the builder still includes devDependencies)
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package*.json ./
USER node
CMD ["node", "dist/server.js"]
The AS builder names the first stage so you can reference it later. The --from=builder flag tells Docker to copy from that stage’s filesystem, discarding everything else.
Two-Stage Build Flow
flowchart TB
subgraph Builder["Builder Stage (node:20)"]
B1[Copy Source Code]
B2[Install Dependencies]
B3[Compile / Build]
B4[Build Output Ready]
B1 --> B2 --> B3 --> B4
end
subgraph Runtime["Runtime Stage (node:20-alpine)"]
R1[Fresh Minimal Base]
R2[Copy Build Artifacts]
R3[Install Production Dependencies]
R4[Set Non-Root User]
R1 --> R2 --> R3 --> R4
end
Builder --> |COPY --from=builder| Runtime
Runtime --> FinalImage["Final Image<br/>~130MB"]
What Gets Discarded
The final image contains only:
- The Alpine-based Node.js runtime (about 130MB, versus about 1.2GB for the full Node image)
- The built application code (dist/)
- The node_modules directory copied from the builder
- No compiler, no source code, no build tools
Choosing Base Images
The base image you choose for your runtime stage sets the foundation for your production image size.
Base Image Options
| Image | Size | What You Get |
|---|---|---|
| node:20 | ~1.2GB | Full Node.js with npm, shell, build tools |
| node:20-slim | ~150MB | Node.js with npm, minimal packages |
| node:20-alpine | ~130MB | Node.js with apk, musl libc |
| distroless (nodejs20) | ~80MB | Node.js only, no shell |
| scratch | 0MB | Nothing, you provide everything |
For most applications, node:20-alpine hits the sweet spot: small size and a package manager for emergencies. The trade-off is musl libc instead of glibc, which matters if you depend on native modules.
When to Use Distroless
Distroless images contain only the runtime and application. No shell, no package manager, no ability to exec into the container.
FROM node:20 AS builder
# ... build steps ...
FROM gcr.io/distroless/nodejs20-debian11
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package*.json ./
USER nonroot
CMD ["dist/server.js"]
The tradeoff: you cannot debug by exec’ing into the container. Build comprehensive logging and monitoring into your application before going this route.
Copying Artifacts Between Stages
The COPY --from= instruction supports several ways to reference source content:
Copy from a Named Stage
COPY --from=builder /app/dist ./dist
Copy from a Numbered Stage
Stages are numbered starting at 0:
# Stage 0
FROM node:20 AS builder
# ...
# Stage 1
FROM node:20-alpine AS production
# ...
# Stage 2
FROM nginx:alpine
COPY --from=1 /app/dist /usr/share/nginx/html
Copy from an External Image
You do not need to build an image in the same Dockerfile to copy from it:
# Copy a single tool (gofmt) out of the official Go image
COPY --from=golang:1.22 /usr/local/go/bin/gofmt /usr/local/bin/
This is useful when you need one file or tool from a published image without installing its whole toolchain in your own image.
Copy with Ownership Change
When copying from a builder stage running as root to a production stage using a non-root user:
COPY --from=builder --chown=node:node /app/dist ./dist
This makes the production user the owner of the copied files, which matters when the application needs to write to them.
Real-World Examples
Node.js Application
# Build stage
FROM node:20-alpine AS builder
WORKDIR /app
# Install all dependencies (including devDependencies for build tools)
COPY package*.json ./
RUN npm ci
# Copy source and build
COPY src ./src
COPY public ./public
RUN npm run build
# Production stage
FROM node:20-alpine AS production
WORKDIR /app
# Create the non-root user first so --chown can reference it
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nodejs -u 1001 -G nodejs
# Copy package files first for better cache
COPY package*.json ./
# Install production dependencies only
RUN npm ci --omit=dev && npm cache clean --force
# Copy built application from builder stage, owned by the runtime user
COPY --from=builder --chown=nodejs:nodejs /app/dist ./dist
COPY --from=builder --chown=nodejs:nodejs /app/public ./public
USER nodejs
EXPOSE 3000
# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
CMD node -e "require('http').get('http://localhost:3000/health', (r) => process.exit(r.statusCode === 200 ? 0 : 1))"
CMD ["node", "dist/server.js"]
Key optimizations in this Dockerfile:
- All dependencies installed in builder stage (including devDependencies for build)
- Production stage installs only production dependencies
- npm cache is cleared after install
- Built files are copied with correct ownership
- Non-root user for security
- Health check for orchestrator integration
Go Application
Go compiles to a static binary, which makes it ideal for minimal base images such as Alpine or scratch:
# Build stage
FROM golang:1.22-alpine AS builder
# Install git for go modules
RUN apk add --no-cache git
WORKDIR /app
# Copy go mod files first for dependency caching
COPY go.mod go.sum ./
RUN go mod download
# Copy source and build
COPY . .
# CGO_ENABLED=0 builds a static binary that needs no C libraries; -ldflags "-w -s" strips debug information
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -ldflags "-w -s" -o myapp .
# Production stage - just the binary
FROM alpine:3.19
WORKDIR /app
# Add CA certificates for HTTPS, create user
RUN apk add --no-cache ca-certificates && \
addgroup -S appgroup && \
adduser -S appuser -u 1001 -G appgroup
# Copy binary from builder
COPY --from=builder /app/myapp .
# Switch to non-root user
USER appuser
# Start the binary with an exec-form ENTRYPOINT (no shell wrapper)
EXPOSE 8080
ENTRYPOINT ["./myapp"]
The final image is around 15MB: Alpine base plus the static Go binary. No Go toolchain, no git, no source.
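For a scratch-based variant of the same build, the following is a minimal sketch rather than a drop-in replacement. It assumes the binary is fully static (CGO disabled) and only needs CA certificates for outbound HTTPS; the binary name myapp and port 8080 follow the example above.

```dockerfile
FROM golang:1.22-alpine AS builder
# Make sure a CA bundle exists in the builder so it can be copied out later
RUN apk add --no-cache ca-certificates
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags "-w -s" -o /myapp .

FROM scratch
# scratch ships no CA bundle, so copy it from the builder
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /myapp /myapp
# A numeric UID works even though scratch has no /etc/passwd
USER 65534:65534
EXPOSE 8080
ENTRYPOINT ["/myapp"]
```

The resulting image contains only the certificate bundle and the binary, so there is no shell to exec into.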
Python Application
Python applications typically need more runtime dependencies than Go, but multi-stage builds still help:
# Build stage
FROM python:3.12-slim AS builder
WORKDIR /app
# Install build dependencies
RUN apt-get update && \
apt-get install -y --no-install-recommends \
gcc \
libpq-dev \
&& rm -rf /var/lib/apt/lists/*
# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt
# Production stage
FROM python:3.12-slim
WORKDIR /app
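# Copy installed packages from builder
# (packages compiled against a shared library such as libpq also need that runtime library installed in this stage)
COPY --from=builder /install /usr/local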
# Copy application code
COPY app ./app
COPY gunicorn.conf.py .
# Create non-root user
RUN useradd --create-home appuser && \
chown -R appuser:appuser /app
USER appuser
EXPOSE 8000
# Use gunicorn as the application server
CMD ["gunicorn", "--config", "gunicorn.conf.py", "app:app"]
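A common alternative is to build a virtual environment in the builder and copy it as a single directory. The sketch below assumes requirements.txt contains a package compiled against libpq (for example psycopg2 built from source), which is why the runtime stage installs libpq5; the /opt/venv path is an arbitrary choice.

```dockerfile
FROM python:3.12-slim AS builder
RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc libpq-dev && \
    rm -rf /var/lib/apt/lists/*
# Install everything into one self-contained directory
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

FROM python:3.12-slim
# Runtime shared library only: no compiler, no headers
RUN apt-get update && \
    apt-get install -y --no-install-recommends libpq5 && \
    rm -rf /var/lib/apt/lists/*
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
WORKDIR /app
COPY app ./app
COPY gunicorn.conf.py .
RUN useradd --create-home appuser
USER appuser
EXPOSE 8000
CMD ["gunicorn", "--config", "gunicorn.conf.py", "app:app"]
```

This works because both stages use the same base image, so the interpreter path recorded inside the virtual environment is still valid in the runtime stage.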
Rust Application
Rust produces static binaries with some caveats around musl libc and OpenSSL:
# Build stage
FROM rust:1.77-alpine AS builder
# Install build dependencies
RUN apk add --no-cache \
musl-dev \
pkgconfig \
openssl-dev \
openssl-libs-static
WORKDIR /app
# Copy manifests first for dependency caching
COPY Cargo.toml Cargo.lock ./
RUN mkdir src && echo "fn main() {}" > src/main.rs
RUN cargo build --release && rm -rf src
# Copy the actual source; touch main.rs so cargo rebuilds it instead of reusing the placeholder build
COPY . .
RUN touch src/main.rs && cargo build --release
# Production stage
FROM alpine:3.19
WORKDIR /app
# Install runtime dependencies
RUN apk add --no-cache ca-certificates
# Copy binary
COPY --from=builder /app/target/release/myapp /usr/local/bin/
# Create user
RUN addgroup -S appgroup && \
adduser -S appuser -u 1001 -G appgroup
USER appuser
EXPOSE 8080
CMD ["myapp"]
Common Pitfalls / Anti-Patterns
Forgetting to Install Production Dependencies
If you copy the entire node_modules from builder, you include devDependencies. Your production image is larger and may have security vulnerabilities that do not affect production:
# Wrong: copies everything from builder, including devDependencies
COPY --from=builder /app .
# Right: copy the build output, then install production dependencies in the final stage
COPY --from=builder /app/dist ./dist
COPY package*.json ./
RUN npm ci --omit=dev
The correct approach for Node.js is to run npm ci --omit=dev in the production stage (older npm versions spell this --only=production).
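Another way to keep devDependencies out of the final image is a dedicated dependency stage. This is a sketch, assuming the same dist/server.js layout as the earlier examples; the stage name prod-deps is arbitrary.

```dockerfile
# Production dependencies only
FROM node:20-alpine AS prod-deps
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev

# Full dependency set for the build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Runtime: production node_modules plus build output
FROM node:20-alpine
WORKDIR /app
COPY --from=prod-deps /app/node_modules ./node_modules
COPY --from=builder /app/dist ./dist
COPY package*.json ./
USER node
CMD ["node", "dist/server.js"]
```

The final stage runs no install step of its own, so it needs no network access and no npm cache cleanup.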
Copying Unnecessary Files
Be explicit about what you copy. Do not copy the entire working directory:
# Wrong
COPY --from=builder /app .
# Right - copy specific paths
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package.json .
# node_modules is installed in the final stage with production dependencies only
Not Using the Correct Architecture for Build
If you build on an amd64 host but deploy to arm64 (or vice versa), use Docker buildx for cross-platform builds:
# Enable buildx
docker buildx create --use
# Build for multiple platforms
docker buildx build \
--platform linux/amd64,linux/arm64 \
--tag myapp:1.0.0 \
--push \
.
Buildx creates manifest lists so Docker automatically pulls the right image for each platform.
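For compiled languages, the builder can run natively and cross-compile instead of being emulated. The following sketch assumes a Go project with CGO disabled; BUILDPLATFORM, TARGETOS, and TARGETARCH are build arguments that BuildKit sets automatically for each requested platform.

```dockerfile
# Run the builder on the build host's native platform (no emulation)
FROM --platform=$BUILDPLATFORM golang:1.22-alpine AS builder
# Declared here so the automatic values are visible inside this stage
ARG TARGETOS
ARG TARGETARCH
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# Cross-compile for the platform currently being built
RUN CGO_ENABLED=0 GOOS=$TARGETOS GOARCH=$TARGETARCH go build -o /myapp .

# The runtime stage is pulled for the target platform
FROM alpine:3.19
COPY --from=builder /myapp /usr/local/bin/myapp
ENTRYPOINT ["myapp"]
```

With this layout, only the small runtime stage differs per platform, and the compile step avoids slow emulation.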
Performance Impact
Multi-stage builds affect both build time and deployment time.
Build Time
First build takes longer because you run all build steps. Subsequent builds use cache efficiently if you order instructions correctly.
Cached layers also speed up iterative development: if only your source code changes (not dependencies), the dependency installation layer is reused from cache.
Deployment Time
Image size directly affects:
- Push time: Network transfer to registry
- Pull time: Network transfer from registry to host
- Startup time: Image layers must be downloaded and extracted
At roughly 10MB/s, a 1.2GB image takes about 2 minutes to pull. A 150MB image takes about 15 seconds at the same rate.
For frequent deployments or auto-scaling scenarios, this difference is substantial.
Benchmark: Image Sizes for a Typical Node.js Application
| Base Image | Layer Size | Final Image Size | Notes |
|---|---|---|---|
| node:20 (monolithic) | ~1.2GB | 1.2GB | Full build tools included |
| node:20-slim | ~150MB | 150MB | No compiler, minimal Debian packages (glibc) |
| node:20-alpine | ~130MB | 130MB | musl libc, compact package manager |
| distroless (nodejs20) | ~80MB | 80MB | No shell, minimal attack surface |
| Multi-stage (node:20 + alpine) | builder ~1GB, runtime ~15MB layer | ~145MB | ~90% size reduction |
Typical Build Time Differences
| Scenario | First Build | Cached Build (code only) | Cached Build (deps changed) |
|---|---|---|---|
| Monolithic node:20 | 3m 20s | 2m 45s | 3m 15s |
| Multi-stage | 3m 40s | 25s | 2m 50s |
Multi-stage builds add slight overhead on first build but dramatically speed up iterative development. When only source code changes, the cached dependency layer is reused and only the build step runs.
Production Failure Scenarios
Multi-stage builds introduce their own failure modes. Understanding these helps you debug issues when they arise.
Build Cache Invalidation Causing Full Rebuild
Docker caches each layer. When a layer changes, all subsequent layers are invalidated. When dependency versions in package.json change, the entire dependency installation layer is invalidated even if your code did not change.
Base image updates also invalidate the cache. If node:20-alpine is updated on Docker Hub, every layer built on top of it needs rebuilding.
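Mitigation: Pin base images to a specific version so upstream updates do not invalidate the cache unexpectedly, and order instructions so rarely changing steps come first. Use docker build --no-cache periodically to force a fresh build on your own schedule, and consider BuildKit, which offers more precise caching.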
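When the dependency layer does get invalidated, a BuildKit cache mount can soften the cost. This is a sketch, assuming the build runs as root so that npm's download cache lives in /root/.npm.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
# The cache mount persists npm's download cache between builds.
# It is not part of the image layer, so the layer stays small.
RUN --mount=type=cache,target=/root/.npm npm ci
COPY . .
RUN npm run build
```

The install step still reruns when package.json changes, but packages already in the cache are not downloaded again.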
Cross-Architecture Build Failures
Building on one architecture and running on another causes binary incompatibility. The error appears when the container starts, typically as an exec format error.
Mitigation: Use Docker buildx for cross-platform builds:
docker buildx create --use
docker buildx build --platform linux/amd64,linux/arm64 -t myapp:multiarch --push .
Binary Incompatibility Between Builder and Runtime
Alpine Linux uses musl libc instead of glibc. Native Node modules with C++ addons compiled against glibc fail at runtime on Alpine. When using Alpine as your runtime, either use Alpine for the builder too, or use node:<version>-slim (Debian-based).
| Builder Image | Runtime Image | Issue |
|---|---|---|
| node:20 (glibc) | node:20-alpine (musl) | Native modules may fail at runtime |
| golang:alpine | scratch | Static binaries work, but scratch ships no CA certificates, so copy them from the builder |
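Keeping both stages on the same libc avoids the first row of the table. The sketch below assumes a project with native addons compiled by node-gyp and a hypothetical entry point server.js.

```dockerfile
FROM node:20-alpine AS builder
WORKDIR /app
# Toolchain that node-gyp needs to compile native addons on Alpine
RUN apk add --no-cache python3 make g++
COPY package*.json ./
# Addons are compiled against musl here, matching the runtime stage
RUN npm ci --omit=dev

FROM node:20-alpine
WORKDIR /app
COPY --from=builder /app/node_modules ./node_modules
# server.js is a hypothetical entry point
COPY server.js ./
USER node
CMD ["node", "server.js"]
```

The compiler packages stay in the builder, while the compiled addons travel with node_modules into the runtime stage.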
Security Implications
Smaller images have smaller attack surfaces. An Alpine base image has far fewer installed packages than a full Ubuntu image. A distroless image has no shell at all.
But multi-stage builds also help even when using larger base images, because build tools and compilers are not in the final image:
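# docker scan has been removed from recent Docker releases; docker scout cves is its replacement
# Scan the monolithic image (build tools included)
docker scout cves myapp:monolithic
# Scan the multi-stage image (only runtime)
docker scout cves myapp:multistage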
The multi-stage image will show far fewer vulnerabilities because it simply does not include the vulnerable packages.
Security and Compliance Checklist
Use this checklist when deploying multi-stage builds to production environments:
- Do not include secrets in build args: Build arguments (ARG) are recorded in the image and visible in image history. Never pass passwords, API keys, or tokens through ARG. Use Docker secrets for Swarm services or external secret injection at runtime.
# Wrong: secret visible in image history
ARG API_KEY=my-secret-key
# Right: keep the value out of the Dockerfile and supply API_KEY when the container starts
- Use specific image tags, not latest: Floating tags like node:20-alpine can change. Pin to specific versions for reproducible builds.
- Scan final image only, not builder: Security scans should target the production runtime image. The builder stage contains build tools that may have vulnerabilities that do not affect runtime.
- Run as non-root: Always create and switch to a non-root user in the final stage.
- Read-only filesystem where possible: Combine with tmpfs mounts for writable space.
- No shell access in production: Use distroless or scratch base images when exec access is not needed.
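For secrets that are needed during the build itself, BuildKit secret mounts avoid ARG entirely. This is a sketch: the secret id npm_token, the NPM_TOKEN variable, and the file npm_token.txt are hypothetical names chosen for illustration.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
# The secret is mounted at /run/secrets/npm_token for this RUN step only
# and is not written to any layer or to the image history.
RUN --mount=type=secret,id=npm_token \
    NPM_TOKEN="$(cat /run/secrets/npm_token)" npm ci
# Build with (npm_token.txt is a hypothetical local file):
#   docker build --secret id=npm_token,src=npm_token.txt .
```

The token exists only for the duration of that one command, unlike an ARG value, which stays readable in the image history.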
Trade-off Analysis
Monolithic vs Multi-Stage Build Trade-offs
| Factor | Monolithic Image | Multi-Stage Build |
|---|---|---|
| Image size | Large (full build toolchain) | Minimal (runtime only) |
| Build time | Faster first build | Slightly longer first build |
| Iterative build time | Slower (cache invalidation) | Faster (dependency cache hit) |
| Security surface | Larger (compilers, shell) | Smaller (runtime only) |
| Debug capability | Full shell access | Limited or none |
| Cross-compilation | Native only | buildx enables multi-platform |
| Complexity | Simple Dockerfile | More complex |
| Cache efficiency | Poor (all layers coupled) | Good (stage isolation) |
When monolithic wins: Local development where build speed matters more than image size. One-off tools where you need shell access. Situations where complexity is not justified by scale.
When multi-stage wins: Production deployments where image size, security, and deployment speed matter. Frequent deployments where cache efficiency compounds. Constrained environments (Kubernetes with resource limits, edge devices).
Runtime Base Image Trade-offs
| Image Type | Size | Security | Debug Access | Compatibility |
|---|---|---|---|---|
| alpine | ~5MB | Good | Via apk | musl libc |
| slim (Debian) | ~50MB | Moderate | Via apt | glibc |
| distroless | ~20MB | Excellent | None | glibc |
| scratch | 0MB | Maximum | None | Static binaries |
Alpine is the most popular choice for general use. Distroless or scratch are better when security is paramount and you do not need shell access.
Interview Questions
Multi-stage builds solve the problem of keeping production images minimal while using full build toolchains during compilation. Before multi-stage builds, you had to choose between two bad options: either ship a large monolithic image with build tools included, or do complex build scripts to copy artifacts between images manually. Multi-stage builds provide a clean mechanism to discard build-time dependencies while retaining only runtime artifacts in the final image.
Use the `--from=stage-name` flag with the COPY instruction. Name stages with `AS stage-name` and reference them later with `COPY --from=builder /app/dist ./dist`. You can also reference stages by number (0-indexed): `COPY --from=0 /app/dist ./dist`. For copying from external images not built in the same Dockerfile, use the image name directly: `COPY --from=golang:1.22 /usr/local/bin/myapp /usr/local/bin/`.
The `latest` tag is floating and changes over time. A Dockerfile that builds successfully today might produce a different image tomorrow if the `latest` tag is updated. This breaks reproducibility — the same Dockerfile no longer produces the same result. For production, always pin to specific version tags (like `node:20-alpine3.19` or `golang:1.22-alpine`) so builds are deterministic and you can audit exactly which version is deployed.
Copying everything (`COPY --from=builder /app .`) includes source code, build tools, devDependencies, and all artifacts. Copying specific paths (`COPY --from=builder /app/dist ./dist` and `COPY --from=builder /app/node_modules ./node_modules`) only includes what the runtime needs. The difference is image size (devDependencies can be huge), security (source code and build tools are removed), and cache efficiency (artifact changes do not invalidate unrelated layers).
Alpine uses musl libc while Debian-based images use glibc. Native Node modules with C++ addons compiled against glibc fail at runtime on Alpine with errors such as a missing shared library or unresolved symbols. When using Alpine as a runtime base, either compile native modules in an Alpine builder stage, or use `-slim` (Debian-based) images for both builder and runtime. Go static binaries built with CGO disabled work across both because they do not link against libc at runtime.
Never pass secrets as build arguments (ARG) or bake them in with ENV instructions, because both are visible in the image metadata and history. Instead, use BuildKit's secret mounting feature: `--mount=type=secret,id=mysecret` makes the secret file available during build but does not persist it in layers. For runtime secrets, inject environment variables or mounted files from a secrets manager when the container starts, for example through Docker secrets (Swarm) or Kubernetes secrets.
Docker caches each layer. When a layer changes, all subsequent layers are invalidated. To maximize cache hits, order instructions from least to most volatile: base image, system dependencies, package manager dependencies (npm, pip, go mod), source code, build commands. When source code changes, only the final layers rebuild — dependency installation is cached. Also use specific version tags rather than `latest` so base image updates do not invalidate caches unexpectedly.
Use distroless or scratch when security is the highest priority and you do not need shell access for debugging. Distroless images have no shell, no package manager, and no ability to exec into the container — this minimizes the attack surface to near zero. Scratch is even more minimal: just your static binary with nothing else. These are appropriate for production workloads where you have comprehensive logging and monitoring built into the application, and where the security benefit outweighs the loss of emergency debugging capability.
Smaller images deploy faster because less data transfers over the network. At roughly 10MB/s, a 1.2GB image takes about 2 minutes to pull and a 150MB image about 15 seconds. For frequent deployments or auto-scaling scenarios, this difference is substantial: faster pull times mean faster scale-out. Registry storage costs can also decrease. A team pushing 10 monolithic images per day could store up to about 10GB more per day than with multi-stage images, although registries deduplicate shared layers, so the real saving depends on how many layers change between pushes.
Create the non-root user in the final stage with appropriate group and UID. For Alpine: `RUN addgroup -g 1001 -S nodejs && adduser -S nodejs -u 1001 -G nodejs`. For Debian-based: `RUN useradd --create-home appuser`. Then switch to that user with `USER nodejs` or `USER appuser`. When copying artifacts from the builder stage, use `--chown=user:group` to set correct ownership so the non-root user can read the files: `COPY --from=builder --chown=nodejs:nodejs /app/dist ./dist`.
BuildKit is Docker's build backend, and the default builder in current Docker releases. It provides parallel stage execution and more precise caching. When you have multiple build stages that do not depend on each other, BuildKit executes them in parallel rather than sequentially, reducing total build time. It also skips stages that the requested target does not need, and it decides cache hits for COPY and ADD from file content checksums. On older Docker versions, enable BuildKit by setting `DOCKER_BUILDKIT=1` or by running `docker buildx` commands.
Native Node.js modules with C++ addons compiled against glibc fail at runtime on Alpine (musl libc). The solution is to ensure your builder stage uses the same libc as your runtime stage. If you use Alpine in the runtime stage, compile native modules in an Alpine builder stage too. Alternatively, use Debian-based slim images for both stages. Statically linked binaries, such as Go built with CGO disabled or Rust built for a musl target, work across both because they do not link against libc at runtime. For Python packages with C extensions, install build dependencies in the builder stage and copy the compiled packages to the runtime stage.
`COPY` is the preferred instruction for copying files from the build context into the image—it does exactly what you expect with no magic. `ADD` has the same basic functionality but also supports extracting compressed files (tar extraction) and downloading from URLs. For most use cases, `COPY` is correct and clearer. Use `ADD` only when you specifically need the tar extraction or URL download behavior. Using `ADD` for simple file copying is considered an anti-pattern because it adds functionality you may not intend.
Order Dockerfile instructions from least to most frequently changed. Base image layers change rarely, dependency installation changes less often than source code, and build commands change most frequently. Put dependency installation before source code copying so that when only source code changes, the dependency layer remains cached. Use `COPY package*.json ./` instead of `COPY . .` to separate dependency installation from source code copying. In multi-stage builds, dependency layers in the builder stage are cached independently from runtime stage layers.
Running as root means if an attacker exploits a vulnerability in your application, they have root access inside the container. From there, they could potentially escape the container (depending on the runtime configuration), access secrets mounted into the container, or modify files. Use a non-root user in your Dockerfile: create the user with `adduser -S nodejs -u 1001` and switch with `USER nodejs`. Note that being non-root inside the container does not by itself remove the Linux capabilities granted to the container. In Docker Compose, use `cap_drop: ALL` to drop all capabilities and `read_only: true` for a read-only root filesystem.
Build arguments are persisted in image layers and visible in image history—never pass secrets through ARG. Use ARG for build-time configuration that does not need to be secret: base image version tags, compilation flags, feature toggles. For secrets, use BuildKit's secret mounting: `--mount=type=secret,id=mysecret` makes the secret available during build without persisting it. Alternatively, inject secrets at runtime via environment variables from a secrets manager, not during the build stage. Any ARG used during build that contains a secret should be documented as a security risk.
Use `scratch` when you have a truly static binary that needs nothing from an OS—typically compiled languages like Go, Rust, or C with static linking. The scratch image is 0 bytes (just your binary). This gives maximum security and minimum image size but provides no shell, no package manager, and no debugging capability. If your application needs any runtime libraries, CA certificates, or shell access, use Alpine or distroless instead. Scratch works for Go applications that do not need TLS certificates from the system store or any OS-level features.
Docker caches each layer. When a layer changes, all subsequent layers are invalidated. In multi-stage builds, each stage has its own layer cache. When source code changes in the builder stage, only builder layers after the source copy are invalidated—the dependency installation layer (cached) is reused. This makes iterative builds fast. However, if your runtime stage copies from the builder stage and the builder stage has invalidated layers, the runtime stage may need to rebuild. Order instructions in both stages to maximize cache hits: install dependencies before copying source code.
Run `docker image ls <image>` to see the image size. Use `docker history <image>` to see the size contribution of each layer and instruction. Check the total size of the final image: anything over 200MB for a typical web application suggests optimization opportunity. For the builder stage specifically, `docker build --target builder -t myapp:builder .` lets you inspect just the builder stage. Compare builder stage size to runtime stage size: if builder is 1GB and runtime is 150MB, the multi-stage build is working correctly. Set up CI to fail if image size exceeds a threshold to prevent regression.
The USER directive switches to a non-root user in the final stage. This is a security measure—if your application is compromised, the attacker has limited permissions inside the container. Create the user before USER with appropriate group and UID: `addgroup -g 1001 -S nodejs && adduser -S nodejs -u 1001 -G nodejs`. When copying files from builder to runtime, use `--chown=user:group` to ensure the non-root user can read the copied files. USER is also important because some images default to root, and running as root in a container with host mounting can create files on the host owned by root.
Further Reading
- Docker Multi-Stage Build Documentation - Official multi-stage build reference
- Best Practices for Writing Dockerfiles - Docker’s official Dockerfile guidelines
- Distroless Images - Minimal base images from Google
- Docker Buildx - Multi-platform builds and builders
- Snyk Container Vulnerability Database - Check base image vulnerabilities
- Moby BuildKit - Next-generation build backend
Multi-Stage Build Performance Optimization
Beyond basic image size reduction, multi-stage builds enable several performance optimizations:
Parallel build stages: BuildKit can execute independent build stages in parallel, reducing total build time for Dockerfiles with multiple stages that do not depend on each other.
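As an illustration of stages BuildKit can run concurrently, here is a sketch assuming a hypothetical repository with a web/ frontend and an api/ Go backend. The directory names, the dist output path, and the binary name are assumptions.

```dockerfile
# Neither of the first two stages copies from the other,
# so BuildKit is free to build them at the same time.
FROM node:20-alpine AS frontend
WORKDIR /web
COPY web/package*.json ./
RUN npm ci
COPY web/ ./
RUN npm run build

FROM golang:1.22-alpine AS backend
WORKDIR /src
COPY api/go.mod api/go.sum ./
RUN go mod download
COPY api/ ./
RUN CGO_ENABLED=0 go build -o /out/server .

# The final stage depends on both, so it waits for both to finish.
FROM alpine:3.19
COPY --from=backend /out/server /usr/local/bin/server
COPY --from=frontend /web/dist /srv/www
USER nobody
ENTRYPOINT ["server"]
```

Total build time then approaches the duration of the slower stage rather than the sum of both.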
Layer caching optimization: Ordering Dockerfile instructions from least to most volatile maximizes cache hits. Dependency installation changes less frequently than source code, so installing dependencies before copying source code keeps builds fast.
# Optimized: dependencies change less often than source
COPY package*.json ./
RUN npm ci
COPY src ./src
RUN npm run build
Build arguments for dynamic optimization: Pass build arguments at build time rather than hardcoding values to enable different optimization levels per environment.
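A small sketch of build arguments in a multi-stage Dockerfile follows. NODE_VERSION and BUILD_MODE are illustrative names, and it assumes the project's build script reads BUILD_MODE from the environment.

```dockerfile
# An ARG declared before FROM can parameterize the base image tag
ARG NODE_VERSION=20
FROM node:${NODE_VERSION}-alpine AS builder
# An ARG used by instructions inside a stage is declared inside that stage
ARG BUILD_MODE=production
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
# Pass the value to the (hypothetical) build script as an environment variable
RUN BUILD_MODE="${BUILD_MODE}" npm run build
# Override the defaults at build time:
#   docker build --build-arg NODE_VERSION=22 --build-arg BUILD_MODE=staging .
```

Because argument values end up in the image history, this suits version tags and build modes, not secrets.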
Build secrets: Use --mount=type=secret to pass secrets to build stages without persisting them in image layers, avoiding the security risk of secrets visible in image history.
Conclusion
Multi-stage builds separate your build environment from your runtime environment. This lets you use full build toolchains during compilation while shipping minimal images to production.
The pattern is consistent across languages:
- Use a full build image as the first stage
- Build your application
- Copy only what you need to run into a minimal runtime image
- Run as non-root in the production stage
The build complexity overhead is minimal compared to the size and security wins. Every deployment gets faster as a bonus.
For deeper understanding of image optimization, see Container Images: Building, Optimizing, and Distributing. For orchestrating multi-container applications with multi-stage builds, continue to Docker Compose.