Your Node.js Image Is Probably 1.2 GB. It Should Be 120 MB.
I recently audited the Docker images for a client running 15 microservices on Kubernetes. Their average image size was 1.4 GB. After applying multi-stage builds and a few other optimizations, the average dropped to 89 MB — a 94% reduction. Their deployment times went from 4 minutes to 45 seconds. Their container registry bill dropped by 70%. Cold starts on new nodes went from 30 seconds to under 5.
Large Docker images are not just a storage problem. They are a deployment speed problem, a security surface problem, and a cost problem. Every layer you ship contains potential CVEs that scanners will flag and your security team will ask about. Every megabyte you push takes bandwidth during deployments and pulls during scaling events. The cumulative impact is enormous, and the fix is almost always straightforward.
Why Images Get Fat
Most oversized Docker images come from one of four causes:
1. Using the Wrong Base Image
The most common mistake is starting from a full OS image when you do not need one:
# Bad: generic OS base; apt-installed Node, npm, and their dependencies drag in hundreds of MB
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y nodejs npm
COPY . /app
RUN npm install
CMD ["node", "server.js"]
# Final image: ~1.2 GB
# Better: 340 MB base image
FROM node:20
COPY . /app
RUN npm install
CMD ["node", "server.js"]
# Final image: ~400 MB
# Best: 50 MB base image
FROM node:20-alpine
COPY . /app
RUN npm install --omit=dev  # --production is a deprecated alias for --omit=dev
CMD ["node", "server.js"]
# Final image: ~80 MB
The Alpine-based images use musl libc instead of glibc, which occasionally causes compatibility issues with native Node.js addons (bcrypt, sharp, etc.). For those cases, use the slim variant instead:
FROM node:20-slim
# ~200 MB base, glibc-compatible, still much smaller than full node:20
2. Including Build Dependencies in the Runtime Image
This is the problem multi-stage builds solve directly. When you compile a Go binary, you need the Go toolchain. When you build a React app, you need Node.js, npm, and webpack. When you compile a Python C extension, you need gcc and python-dev. None of these belong in your production image.
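As a sketch of that separation (assuming a frontend project whose `npm run build` writes static output to `dist/`; adjust paths to your setup), the toolchain lives only in the build stage and never reaches production:

```dockerfile
# Build stage: Node, npm, and the bundler live here and are thrown away
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# Runtime stage: a static file server, no Node.js at all
FROM nginx:alpine
COPY --from=builder /app/dist /usr/share/nginx/html
```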
3. Not Leveraging Layer Caching
Docker builds each instruction as a layer, and layers are cached. If you copy your entire source tree before installing dependencies, changing a single line of code invalidates the dependency installation cache:
# Bad: any code change rebuilds node_modules
COPY . /app
RUN npm install
# Good: dependencies are cached unless package.json changes
COPY package.json package-lock.json /app/
RUN npm install
COPY . /app
4. Forgetting .dockerignore
Without a .dockerignore file, Docker sends everything in your build context to the daemon, including node_modules, .git, test files, documentation, and IDE configuration:
# .dockerignore
node_modules
.git
.gitignore
*.md
docs/
tests/
.env*
.vscode/
.idea/
coverage/
dist/
Dockerfile
docker-compose*.yml
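One quick way to verify the .dockerignore is working: with BuildKit, the build output reports how much context is transferred to the daemon (the size below is illustrative):

```shell
docker build -t myapp .
# => [internal] load build context
# => transferring context: 1.18MB
# A node_modules-sized context here means the ignore file is not being picked up
```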
Multi-Stage Builds: The Complete Guide
Multi-stage builds let you use multiple FROM statements in a single Dockerfile. Each FROM starts a new stage, and you can copy artifacts from one stage to another. The final image only contains the layers from the last stage.
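A useful side effect: any named stage can be built on its own with --target, for example to run tests inside the build environment without producing the runtime image (the stage name `builder` is assumed from the patterns below):

```shell
docker build --target builder -t myapp:build .
docker run --rm myapp:build go test ./...
```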
Pattern 1: Go Application
Go is the poster child for multi-stage builds because Go compiles to a static binary that needs no runtime:
# Stage 1: Build
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-w -s" -o /app/server ./cmd/server
# Stage 2: Runtime
FROM scratch
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /app/server /server
EXPOSE 8080
ENTRYPOINT ["/server"]
# Build image: ~800 MB (Go toolchain + dependencies)
# Final image: ~8 MB (just the binary + CA certs)
In -ldflags="-w -s", the -s flag omits the symbol table and -w omits DWARF debug information, together reducing the binary size by roughly 20-30%. Building with CGO_ENABLED=0 produces a fully static binary that can run on scratch (an empty base image).
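If you want to confirm the binary really is static and stripped before shipping it on scratch, file and ldd will tell you (run inside the builder stage, or on any Linux host):

```shell
file /app/server
# ELF 64-bit LSB executable, x86-64, statically linked, stripped
ldd /app/server
# not a dynamic executable
```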
Pattern 2: Node.js Application
# Stage 1: Install dependencies
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
# Stage 2: Build (if you have a build step)
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build
# Stage 3: Runtime
FROM node:20-alpine
RUN addgroup -g 1001 -S nodejs && \
adduser -S nextjs -u 1001
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY --from=builder /app/dist ./dist
COPY package.json ./
USER nextjs
EXPOSE 3000
CMD ["node", "dist/server.js"]
# Final image: ~120 MB (Alpine + Node runtime + production deps + built code)
Pattern 3: Python Application
# Stage 1: Build wheel files
FROM python:3.12-slim AS builder
WORKDIR /app
RUN pip install --no-cache-dir build wheel
COPY requirements.txt .
RUN pip wheel --no-cache-dir --wheel-dir /wheels -r requirements.txt
# Stage 2: Runtime
FROM python:3.12-slim
RUN groupadd -r appuser && useradd -r -g appuser appuser
WORKDIR /app
COPY --from=builder /wheels /wheels
RUN pip install --no-cache-dir --no-index --find-links=/wheels /wheels/* && \
rm -rf /wheels
COPY . .
USER appuser
EXPOSE 8000
CMD ["gunicorn", "app:app", "-b", "0.0.0.0:8000", "-w", "4"]
# Final image: ~180 MB (vs ~900 MB with full python:3.12 + build tools)
Pattern 4: Rust Application
# Stage 1: Build
FROM rust:1.77-slim AS builder
WORKDIR /app
# Cache dependency compilation
COPY Cargo.toml Cargo.lock ./
RUN mkdir src && echo "fn main() {}" > src/main.rs && \
cargo build --release && \
rm -rf src
COPY src ./src
# COPY preserves source file mtimes, which may predate the dummy build above,
# so cargo could skip recompiling; touch forces a rebuild with the real sources
RUN touch src/main.rs && cargo build --release
# Stage 2: Runtime
FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y --no-install-recommends \
ca-certificates && \
rm -rf /var/lib/apt/lists/*
COPY --from=builder /app/target/release/myapp /usr/local/bin/
EXPOSE 8080
CMD ["myapp"]
# Final image: ~80 MB (slim Debian + static binary + CA certs)
Advanced Optimization Techniques
Using Distroless Images
Google’s distroless images contain only your application and its runtime dependencies — no shell, no package manager, no utilities. This dramatically reduces the attack surface:
# For Java applications
FROM eclipse-temurin:21-jdk AS builder
WORKDIR /app
COPY . .
RUN ./gradlew bootJar
FROM gcr.io/distroless/java21-debian12
COPY --from=builder /app/build/libs/app.jar /app.jar
# The distroless Java images set the entrypoint to run `java -jar`; CMD supplies the jar path
CMD ["app.jar"]
# No shell access = no shell exploits
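The same property cuts against debuggability: there is no shell to docker exec into. For local troubleshooting, the distroless project publishes :debug image variants that add a busybox shell; never promote those tags to production:

```shell
docker run --rm -it --entrypoint=sh gcr.io/distroless/java21-debian12:debug
```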
Buildkit Cache Mounts
Docker BuildKit supports cache mounts that persist across builds without being included in the final image:
# syntax=docker/dockerfile:1
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
# Cache Go modules across builds
RUN --mount=type=cache,target=/go/pkg/mod \
go mod download
COPY . .
RUN --mount=type=cache,target=/go/pkg/mod \
--mount=type=cache,target=/root/.cache/go-build \
CGO_ENABLED=0 go build -o /server ./cmd/server
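The same cache-mount technique applies to other package managers; a sketch for npm and pip, assuming their default cache locations:

```dockerfile
# npm keeps its download cache in /root/.npm by default
RUN --mount=type=cache,target=/root/.npm \
    npm ci

# pip keeps its cache in /root/.cache/pip by default
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
```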
Minimizing Layer Count
Each RUN instruction creates a new layer. Combining related commands reduces layers and avoids orphaned files:
# Bad: 3 layers, apt cache persists in layer 1
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*
# Good: 1 layer, apt cache is cleaned in the same layer
RUN apt-get update && \
apt-get install -y --no-install-recommends curl && \
rm -rf /var/lib/apt/lists/*
Measuring Your Progress
Use docker images and docker history to understand where your image size comes from:
# See image sizes
docker images myapp
# REPOSITORY TAG SIZE
# myapp latest 89MB
# See layer breakdown
docker history myapp:latest
# IMAGE SIZE CREATED BY
# abc123 4.2MB COPY --from=builder /app/dist ./dist
# def456 32MB COPY --from=deps /app/node_modules ...
# ghi789 52MB /bin/sh -c #(nop) ADD file:... (base)
# Use dive for interactive layer exploration
# https://github.com/wagoodman/dive
dive myapp:latest
The tool dive is particularly useful — it shows you exactly what files each layer adds and highlights wasted space from files that were added in one layer and removed in a later one.
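dive also has a non-interactive CI mode that fails the build when image efficiency drops below a threshold, so bloat cannot quietly creep back in (threshold flag as documented in the dive README; tune to taste):

```shell
dive --ci --lowestEfficiency=0.95 myapp:latest
```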
A Checklist for Every Dockerfile
- Start from the smallest appropriate base image (alpine, slim, distroless, or scratch)
- Use multi-stage builds to separate build and runtime
- Copy dependency manifests before source code (leverage layer caching)
- Install only production dependencies in the runtime stage
- Create a .dockerignore file
- Run as a non-root user
- Combine RUN commands to minimize layers
- Strip debug symbols from compiled binaries
- Use BuildKit cache mounts for package managers
- Scan the final image with docker scout or trivy
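Both scanners work as a one-liner against a locally built image:

```shell
# Docker Scout (bundled with recent Docker Desktop, or installed as a CLI plugin)
docker scout cves myapp:latest

# Trivy
trivy image myapp:latest
```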
Smaller images are not just a nice-to-have optimization. They are faster to deploy, cheaper to store, more secure by default, and easier to debug when things go wrong. The 30 minutes you spend optimizing your Dockerfile will pay for itself every single time that image is pulled, pushed, or scanned.
