Your Node.js Image Is Probably 1.2 GB. It Should Be 120 MB.
I recently audited the Docker images for a client running 15 microservices on Kubernetes. Their average image size was 1.4 GB. After applying multi-stage builds and a few other optimizations, the average dropped to 89 MB — a 94% reduction. Their deployment times went from 4 minutes to 45 seconds. Their container registry bill dropped by 70%. Cold starts on new nodes went from 30 seconds to under 5.
Large Docker images are not just a storage problem. They are a deployment speed problem, a security surface problem, and a cost problem. Every layer you ship contains potential CVEs that scanners will flag and your security team will ask about. Every megabyte you push takes bandwidth during deployments and pulls during scaling events. The cumulative impact is enormous, and the fix is almost always straightforward.
Why Images Get Fat
Most oversized Docker images come from one of four causes:
1. Using the Wrong Base Image
The most common mistake is starting from a full OS image when you do not need one:
# Bad: generic OS base; apt-installed Node, npm, and their dependencies drag in hundreds of MB
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y nodejs npm
COPY . /app
RUN npm install
CMD ["node", "server.js"]
# Final image: ~1.2 GB
# Better: 340 MB base image
FROM node:20
COPY . /app
RUN npm install
CMD ["node", "server.js"]
# Final image: ~400 MB
# Best: 50 MB base image
FROM node:20-alpine
COPY . /app
RUN npm install --omit=dev  # --production is a deprecated alias for --omit=dev
CMD ["node", "server.js"]
# Final image: ~80 MB
The Alpine-based images use musl libc instead of glibc, which occasionally causes compatibility issues with native Node.js addons (bcrypt, sharp, etc.). For those cases, use the slim variant instead:
FROM node:20-slim
# ~200 MB base, glibc-compatible, still much smaller than full node:20
2. Including Build Dependencies in the Runtime Image
This is the problem multi-stage builds solve directly. When you compile a Go binary, you need the Go toolchain. When you build a React app, you need Node.js, npm, and webpack. When you compile a Python C extension, you need gcc and python-dev. None of these belong in your production image.
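As a sketch of that separation (assuming a frontend project whose `npm run build` writes static output to `dist/`; adjust paths to your setup), the toolchain lives only in the build stage and never reaches production:

```dockerfile
# Build stage: Node, npm, and the bundler live here and are thrown away
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# Runtime stage: a static file server, no Node.js at all
FROM nginx:alpine
COPY --from=builder /app/dist /usr/share/nginx/html
```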
3. Not Leveraging Layer Caching
Docker builds each instruction as a layer, and layers are cached. If you copy your entire source tree before installing dependencies, changing a single line of code invalidates the dependency installation cache:
# Bad: any code change rebuilds node_modules
COPY . /app
RUN npm install
# Good: dependencies are cached unless package.json changes
COPY package.json package-lock.json /app/
RUN npm install
COPY . /app
4. Forgetting .dockerignore
Without a .dockerignore file, Docker sends everything in your build context to the daemon, including node_modules, .git, test files, documentation, and IDE configuration:
# .dockerignore
node_modules
.git
.gitignore
*.md
docs/
tests/
.env*
.vscode/
.idea/
coverage/
dist/
Dockerfile
docker-compose*.yml
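One quick way to verify the .dockerignore is working: with BuildKit, the build output reports how much context is transferred to the daemon (the size below is illustrative):

```shell
docker build -t myapp .
# => [internal] load build context
# => transferring context: 1.18MB
# A node_modules-sized context here means the ignore file is not being picked up
```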
Multi-Stage Builds: The Complete Guide
Multi-stage builds let you use multiple FROM statements in a single Dockerfile. Each FROM starts a new stage, and you can copy artifacts from one stage to another. The final image only contains the layers from the last stage.
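A useful side effect: any named stage can be built on its own with --target, for example to run tests inside the build environment without producing the runtime image (the stage name `builder` is assumed from the patterns below):

```shell
docker build --target builder -t myapp:build .
docker run --rm myapp:build go test ./...
```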
Pattern 1: Go Application
Go is the poster child for multi-stage builds because Go compiles to a static binary that needs no runtime:
# Stage 1: Build
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-w -s" -o /app/server ./cmd/server
# Stage 2: Runtime
FROM scratch
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /app/server /server
EXPOSE 8080
ENTRYPOINT ["/server"]
# Build image: ~800 MB (Go toolchain + dependencies)
# Final image: ~8 MB (just the binary + CA certs)
In -ldflags="-w -s", the -s flag omits the symbol table and -w omits DWARF debug information, together reducing the binary size by roughly 20-30%. Building with CGO_ENABLED=0 produces a fully static binary that can run on scratch (an empty base image).
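If you want to confirm the binary really is static and stripped before shipping it on scratch, file and ldd will tell you (run inside the builder stage, or on any Linux host):

```shell
file /app/server
# ELF 64-bit LSB executable, x86-64, statically linked, stripped
ldd /app/server
# not a dynamic executable
```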
Pattern 2: Node.js Application
# Stage 1: Install dependencies
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
# Stage 2: Build (if you have a build step)
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build
# Stage 3: Runtime
FROM node:20-alpine
RUN addgroup -g 1001 -S nodejs && \
adduser -S nextjs -u 1001
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY --from=builder /app/dist ./dist
COPY package.json ./
USER nextjs
EXPOSE 3000
CMD ["node", "dist/server.js"]
# Final image: ~120 MB (Alpine + Node runtime + production deps + built code)
Pattern 3: Python Application
# Stage 1: Build wheel files
FROM python:3.12-slim AS builder
WORKDIR /app
RUN pip install --no-cache-dir build wheel
COPY requirements.txt .
RUN pip wheel --no-cache-dir --wheel-dir /wheels -r requirements.txt
# Stage 2: Runtime
FROM python:3.12-slim
RUN groupadd -r appuser && useradd -r -g appuser appuser
WORKDIR /app
COPY --from=builder /wheels /wheels
RUN pip install --no-cache-dir --no-index --find-links=/wheels /wheels/* && \
rm -rf /wheels
COPY . .
USER appuser
EXPOSE 8000
CMD ["gunicorn", "app:app", "-b", "0.0.0.0:8000", "-w", "4"]
# Final image: ~180 MB (vs ~900 MB with full python:3.12 + build tools)
Pattern 4: Rust Application
# Stage 1: Build
FROM rust:1.77-slim AS builder
WORKDIR /app
# Cache dependency compilation
COPY Cargo.toml Cargo.lock ./
RUN mkdir src && echo "fn main() {}" > src/main.rs && \
cargo build --release && \
rm -rf src
COPY src ./src
# COPY preserves source file mtimes, which may predate the dummy build above,
# so cargo could skip recompiling; touch forces a rebuild with the real sources
RUN touch src/main.rs && cargo build --release
# Stage 2: Runtime
FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y --no-install-recommends \
ca-certificates && \
rm -rf /var/lib/apt/lists/*
COPY --from=builder /app/target/release/myapp /usr/local/bin/
EXPOSE 8080
CMD ["myapp"]
# Final image: ~80 MB (slim Debian + static binary + CA certs)
Advanced Optimization Techniques
Using Distroless Images
Google’s distroless images contain only your application and its runtime dependencies — no shell, no package manager, no utilities. This dramatically reduces the attack surface:
# For Java applications
FROM eclipse-temurin:21-jdk AS builder
WORKDIR /app
COPY . .
RUN ./gradlew bootJar
FROM gcr.io/distroless/java21-debian12
COPY --from=builder /app/build/libs/app.jar /app.jar
# The distroless Java images set the entrypoint to run `java -jar`; CMD supplies the jar path
CMD ["app.jar"]
# No shell access = no shell exploits
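The same property cuts against debuggability: there is no shell to docker exec into. For local troubleshooting, the distroless project publishes :debug image variants that add a busybox shell; never promote those tags to production:

```shell
docker run --rm -it --entrypoint=sh gcr.io/distroless/java21-debian12:debug
```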
Buildkit Cache Mounts
Docker BuildKit supports cache mounts that persist across builds without being included in the final image:
# syntax=docker/dockerfile:1
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
# Cache Go modules across builds
RUN --mount=type=cache,target=/go/pkg/mod \
go mod download
COPY . .
RUN --mount=type=cache,target=/go/pkg/mod \
--mount=type=cache,target=/root/.cache/go-build \
CGO_ENABLED=0 go build -o /server ./cmd/server
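The same cache-mount technique applies to other package managers; a sketch for npm and pip, assuming their default cache locations:

```dockerfile
# npm keeps its download cache in /root/.npm by default
RUN --mount=type=cache,target=/root/.npm \
    npm ci

# pip keeps its cache in /root/.cache/pip by default
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
```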
Minimizing Layer Count
Each RUN instruction creates a new layer. Combining related commands reduces layers and avoids orphaned files:
# Bad: 3 layers, apt cache persists in layer 1
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*
# Good: 1 layer, apt cache is cleaned in the same layer
RUN apt-get update && \
apt-get install -y --no-install-recommends curl && \
rm -rf /var/lib/apt/lists/*
Measuring Your Progress
Use docker images and docker history to understand where your image size comes from:
# See image sizes
docker images myapp
# REPOSITORY TAG SIZE
# myapp latest 89MB
# See layer breakdown
docker history myapp:latest
# IMAGE SIZE CREATED BY
# abc123 4.2MB COPY --from=builder /app/dist ./dist
# def456 32MB COPY --from=deps /app/node_modules ...
# ghi789 52MB /bin/sh -c #(nop) ADD file:... (base)
# Use dive for interactive layer exploration
# https://github.com/wagoodman/dive
dive myapp:latest
The tool dive is particularly useful — it shows you exactly what files each layer adds and highlights wasted space from files that were added in one layer and removed in a later one.
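dive also has a non-interactive CI mode that fails the build when image efficiency drops below a threshold, so bloat cannot quietly creep back in (threshold flag as documented in the dive README; tune to taste):

```shell
dive --ci --lowestEfficiency=0.95 myapp:latest
```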
A Checklist for Every Dockerfile
- Start from the smallest appropriate base image (alpine, slim, distroless, or scratch)
- Use multi-stage builds to separate build and runtime
- Copy dependency manifests before source code (leverage layer caching)
- Install only production dependencies in the runtime stage
- Create a .dockerignore file
- Run as a non-root user
- Combine RUN commands to minimize layers
- Strip debug symbols from compiled binaries
- Use BuildKit cache mounts for package managers
- Scan the final image with docker scout or trivy
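Both scanners work as a one-liner against a locally built image:

```shell
# Docker Scout (bundled with recent Docker Desktop, or installed as a CLI plugin)
docker scout cves myapp:latest

# Trivy
trivy image myapp:latest
```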
Smaller images are not just a nice-to-have optimization. They are faster to deploy, cheaper to store, more secure by default, and easier to debug when things go wrong. The 30 minutes you spend optimizing your Dockerfile will pay for itself every single time that image is pulled, pushed, or scanned.
