Skip to the content.
Docker best practices often missed in production

Docker best practices often missed in production

Docker Production Best Practices Most Teams Skip — Until a CVE Forces Them

Meta description: Avoid costly Docker mistakes in production. Learn image optimization, multi-stage builds, security hardening, and CI enforcement in one practical guide.

Primary keyword: Docker production best practices Secondary keywords: Docker image size, Docker security, multi-stage builds


Introduction

Most teams ship a Docker image that works and move on. Six months later they’re staring at 40 CVEs, a 1.2 GB image, and a two-minute pull time. Here’s what makes that scenario avoidable: none of it requires starting over. A handful of Docker production best practices — applied consistently — changes the entire trajectory of your containers. This guide covers the ones that actually matter: image size, layer hygiene, multi-stage builds, security, health checks, and CI enforcement. If you implement even one, you’ll be ahead of most production setups shipping today.


Treat Image Size Like It Affects Your Security Posture (Because It Does)

A smaller image pulls faster. That’s the least interesting part.

Fewer packages means fewer things to patch. If your OWASP scan shows 20 vulnerabilities today, shipping less is the simplest way to have fewer tomorrow. This is the argument for base image discipline that doesn’t get made enough: size is a proxy for attack surface.

Runtime images and build environments should be separate things. What your container actually needs to run is usually much smaller than what it took to compile. The mistake is treating them as the same.


Combine RUN Commands and Clean Up in the Same Layer

Docker creates a new layer for every RUN, COPY, and ADD instruction. The problem isn’t the layer count — cleanup commands in a separate RUN don’t remove content from earlier layers. That content sticks around in image history.

The rule: setup, install, and cleanup in one RUN. Less readable, but you stop shipping package manager caches to production by accident.

RUN apt-get update && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/* /tmp/*

Run docker history on your current images. If you see multiple partial-cleanup layers, that’s what to fix first.


Use Multi-Stage Builds — They Should Be Your Default

Compilers and package managers don’t belong in your runtime container. Multi-stage builds are the cleanest way to enforce that separation.

FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY . .
RUN go build -o /out/myapp ./...

FROM gcr.io/distroless/static
COPY --from=builder /out/myapp /usr/local/bin/myapp
USER 1002:1002
CMD ["/usr/local/bin/myapp"]

The final image has the binary and whatever it needs at runtime — no go, no apk, no cache. For Go services that’s usually a 50–80 MB difference. Some languages are even more dramatic. This single change often cuts image size in half.


Never Run Your Containers as Root

Take any exploit script. Run it as root, then run it as a non-privileged app user. Same vulnerability — much less damage. The argument makes itself.

In local dev, USER 1002:1002 is still worth setting. In stricter environments, an unprivileged user adds a real second layer of protection before anything reaches the kernel.

RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser:appgroup

This is the one practice that, if your team agrees on nothing else, immediately changes your security posture.


Add a Health Check — or You’re Routing Traffic to a Broken Container

A container in Up status can be completely stuck. If startup hangs at a DB migration, the orchestrator won’t know — it’ll keep routing traffic to a process that isn’t ready.

HEALTHCHECK --interval=20s --timeout=5s --retries=3 \
  CMD curl -f http://localhost:8080/health || exit 1

For Kubernetes, livenessProbe handles deadlock detection and readinessProbe handles traffic routing. For Docker Compose, restart: on-failure. Without any of this, you’re chasing an issue inside a container that looks perfectly healthy on your dashboard.


Keep Debug Images Separate From Production Images

Two categories worth knowing: sourceful and sourceless.

A sourceful image carries the build toolchain and possibly source files. You can exec in, poke around, and check the filesystem — useful when iterating in staging, but everything useful for debugging is also useful for an attacker.

A sourceless image has only what the process needs to run: binary, runtime dependencies, nothing else.

Sourceful in staging is sometimes fine. In production, build from logs and traces. When you genuinely need a shell in prod, pull a debug image built from the same base — don’t bake it in by default. Build it, version it, pull it when you need it.


Let CI Enforce This, Not Your Memory

A Dockerfile that’s correct today can still cause problems if the base image shifts under you. At minimum, your CI pipeline should:

  • Run docker build with --no-cache in at least one job (cold-build failures are real and silent otherwise)
  • Run grype or docker scan and block on critical/high findings
  • Track image digests in deploy manifests instead of latest

Digest pinning makes rollbacks mean something. Without it, “rolling back to the previous version” might pull an image built on a different base than what you tested.


The Small Things That Compound Over Time

Pin base images to specific patched versions — not just alpine:3. Rebuild nightly or on base image updates; dependency drift shows up quietly, then all at once. And if you have an unusual chmod or cksum step, leave a comment explaining why it’s there, not just what it does. The “what” is readable; the “why” is what you forget six months later.


Conclusion

These aren’t theoretical hardening tips — they’re the practices that separate a container that works from one that’s defensible six months later. I’ve watched teams skip them for months. Then a CVE lands, eight services need to be rebuilt at once, and a fix that takes 20 minutes during a quiet sprint turns into a two-day scramble.

Start with one: no root, or nothing extra in the runtime image. It already changes the picture. The rest becomes easier once the foundation is clean.


If this was useful, I write about Docker, platform engineering, and DevOps patterns regularly. Subscribe below and get new posts straight to your inbox — no spam, unsubscribe anytime.

Subscribe to the newsletter →

🎓

16-Week Bootcamp

AI-Augmented Platform Engineering Bootcamp

Go deeper — hands-on Terraform, Kubernetes, GitOps, and AI-assisted operations. Build a complete internal developer platform from scratch.

View Course →
Ashok Valakatla

Written by

Ashok Valakatla

10 years building production infrastructure on Azure and AWS. Azure Certified DevOps Expert, Solution Architect, and Administrator.

Follow on LinkedIn

Leave a Comment

← Back to all posts

More Posts

How DevOps CI/CD Practices Actually Protect Companies — Lessons From the Anthropic CLI Source Leak

How DevOps CI/CD Practices Actually Protect Companies — Lessons From the Anthropic CLI Source Leak

The Anthropic CLI source code leak wasn't a cyberattack — it was a CI/CD failure. Here's what went wrong and how proper pipeline security prevents it.

Devops Apr 04, 2026 16 min read
GitOps Principles: How Platform Teams Deploy Apps and Cloud Resources Faster — and Finally Build the Things That Matter

GitOps Principles: How Platform Teams Deploy Apps and Cloud Resources Faster — and Finally Build the Things That Matter

GitOps extends beyond app deployments — cloud resources, databases, and networks can all be Git-managed, giving platform teams back the time to build real improvements.

Gitops Apr 03, 2026 11 min read
Terraform enterprise strategy for multi-tenant customers

Terraform enterprise strategy for multi-tenant customers

Manage multi-tenant Terraform at scale with state isolation, layered provisioning, feature flags, and a CI/CD strategy built for parallel development.

Terraform Mar 24, 2026 5 min read
Platform Engineering vs DevOps — What's the Difference and Why It Matters for Your Career

Platform Engineering vs DevOps — What's the Difference and Why It Matters for Your Career

DevOps and Platform Engineering are not the same thing. Understanding the difference can shape which career path you choose — and how fast you grow.

Devops Mar 21, 2026 6 min read
The DevOps Roadmap: A Practical Guide for 2026

The DevOps Roadmap: A Practical Guide for 2026

Stop jumping straight into Kubernetes. Here's the structured, battle-tested roadmap to become a DevOps or Platform Engineer in 2026 — built on real fundamentals.

Devops Mar 20, 2026 11 min read
Terraform Best Practices for Team Environments

Terraform Best Practices for Team Environments

Remote state, module versioning, variable validation, and drift detection — what it actually takes to run Terraform in a team.

Terraform Mar 01, 2026 6 min read