Docker Fundamentals: From Images to Production Containers

Master Docker containers, images, Dockerfiles, docker-compose, volumes, and networking. A comprehensive guide for developers getting started with containerization.

published: reading time: 30 min read author: GeekWorkBench

Docker has reshaped how we build, ship, and run applications. If you are still installing dependencies by hand and fighting “works on my machine” problems, you are spending time on problems that containers already solve. Containerization is not a passing trend; it is the standard deployment model for much of modern software.

This guide walks you through everything you need to go from Docker beginner to someone who can containerize a real application and run it reliably.

Introduction

Docker is a platform for packaging applications into self-contained units called containers. A container bundles your code, runtime, system tools, libraries, and settings — everything the application needs to run — independent of the host system.

Containers share the host kernel and do not emulate hardware. This makes them lightweight and fast to start. A VM needs to boot an entire operating system; a container starts in seconds.

Docker uses a client-server architecture. The Docker client (the CLI) sends requests over a REST API to the Docker daemon, which does the actual work of building, running, and distributing containers.

Core Concepts

An image is a read-only template with instructions for creating a container. Think of it as a snapshot of a filesystem with some metadata about how to run the process.

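Images are built in layers. Each filesystem-changing instruction in a Dockerfile (RUN, COPY, ADD) creates a new layer, while instructions such as ENV or CMD only add metadata. When an instruction or its inputs change, that layer and every layer after it are rebuilt; everything before it comes from cache. This caching mechanism is what makes Docker builds fast after the first run.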
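To make the invalidation rule concrete, here is a small shell sketch. It does not call Docker: rebuilt_layers is a hypothetical helper that takes the position of the first changed instruction followed by the instruction list, and prints which steps would come from cache and which would rebuild.

```shell
# Toy model of Docker's layer cache (no Docker required).
# Usage: rebuilt_layers FIRST_CHANGED INSTRUCTION...
# FIRST_CHANGED is the 1-based position of the first instruction whose
# inputs changed; that step and every later one must rebuild.
rebuilt_layers() {
  first_changed=$1
  shift
  step=1
  for instruction in "$@"; do
    if [ "$step" -lt "$first_changed" ]; then
      printf 'CACHED  %s\n' "$instruction"
    else
      printf 'REBUILD %s\n' "$instruction"
    fi
    step=$((step + 1))
  done
}

# Only application source changed, so just the final COPY (step 4) rebuilds.
rebuilt_layers 4 \
  'FROM node:20-alpine' \
  'COPY package*.json ./' \
  'RUN npm ci' \
  'COPY . .'
```

Passing 2 instead of 4 (a changed package.json) marks the dependency install and everything after it as REBUILD, which is why dependency manifests are copied before the rest of the source.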

Here is a simple Dockerfile for a Node.js application:

FROM node:20-alpine

WORKDIR /app

COPY package*.json ./

RUN npm ci --omit=dev

COPY . .

USER node

EXPOSE 3000

CMD ["node", "server.js"]

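The FROM instruction sets the base image. Alpine variants keep images small: the Alpine base layer is only a few megabytes, while a stock Ubuntu base image is tens of megabytes and the default Debian-based node:20 image is several times larger than node:20-alpine.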

Docker images use a naming convention: registry/repository:tag. If you do not specify a tag, Docker defaults to latest.

docker pull nginx:1.25-alpine
docker pull nginx:1.25
docker pull nginx

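The three commands above pull different images. The first explicitly requests version 1.25 of the Alpine variant. The second pulls the same version built on the default Debian base. The third omits the tag, so Docker pulls nginx:latest.

For production, always pin exact versions. latest is a moving target that will bite you when it changes unexpectedly. Even a tag like 1.25 moves with each patch release, so pin a full version or an image digest when you need fully reproducible deployments.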
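The default-tag rule can be sketched in a few lines of shell. image_tag is a hypothetical helper written for this illustration, not a Docker command; it looks only at the last path component, so a registry port such as localhost:5000 is not mistaken for a tag, and it refuses digest references rather than guessing.

```shell
# image_tag REFERENCE
# Prints the tag part of an image reference, or "latest" when none is given.
image_tag() {
  ref=$1
  name=${ref##*/}   # last path component, e.g. "nginx:1.25-alpine"
  case $name in
    *@*) echo "digest reference, no tag to report" >&2; return 1 ;;
    *:*) printf '%s\n' "${name##*:}" ;;
    *)   printf '%s\n' latest ;;
  esac
}

image_tag nginx:1.25-alpine    # prints 1.25-alpine
image_tag nginx                # prints latest
image_tag localhost:5000/app   # prints latest (the colon belongs to the registry port)
```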

A container is a runnable instance of an image. You can create, start, stop, and delete containers. Each container is isolated from other containers and the host system, but they can communicate through defined networking channels.

docker run nginx:latest

This pulls the nginx image if not present locally, creates a container from it, and starts it. By default, nginx runs in the foreground and binds to port 80 inside the container.

To run it in detached mode with port mapping:

docker run -d -p 8080:80 --name my-nginx nginx:latest

The -d flag runs the container detached (in the background). -p 8080:80 maps host port 8080 to container port 80. The --name flag gives your container a memorable name instead of a random one.

docker ps                    # List running containers
docker ps -a                 # List all containers (including stopped)
docker stop my-nginx         # Stop a running container
docker start my-nginx        # Start a stopped container
docker restart my-nginx      # Stop then start
docker rm my-nginx           # Remove a container (must be stopped)
docker logs -f my-nginx      # Follow logs in real-time
docker exec -it my-nginx sh  # Get shell inside running container

The docker exec command is indispensable for debugging. Jump into a running container and inspect its filesystem, check environment variables, or figure out why something is not working.

Building Images with Dockerfiles

A Dockerfile is a script with instructions for building your custom image. Each instruction creates a new layer, and Docker caches layers when possible to speed up rebuilds.

Multi-stage builds let you use multiple FROM statements to separate build-time and runtime environments. This keeps production images lean by excluding build tools.

The final image only contains the production-ready artifact. The build dependencies never make it into the runtime image.

Docker Compose for Multi-Container Applications

Most real applications need multiple services: a web server, a database, a cache layer. Docker Compose manages these multi-container setups through a YAML configuration file.

docker-compose.yml Structure

version: "3.8"

services:
  web:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - DATABASE_URL=postgres://db:5432/app
    depends_on:
      - db
      - redis
    restart: unless-stopped

  db:
    image: postgres:15-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: app
      POSTGRES_USER: user
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
    secrets:
      - db_password

  redis:
    image: redis:7-alpine
    volumes:
      - redis_data:/data

volumes:
  postgres_data:
  redis_data:

secrets:
  db_password:
    file: ./secrets/db_password.txt

Compose Commands

# Compose V2 syntax; the older standalone binary is invoked as docker-compose
docker compose up -d                      # Start all services
docker compose down                       # Stop and remove containers
docker compose down -v                    # Also remove volumes
docker compose logs -f web                # Follow logs for web service
docker compose ps                         # List running services
docker compose exec db psql -U user app   # Run psql in db container
docker compose restart web                # Restart web service

The depends_on directive controls startup order. In its short form it only waits for the dependency's container to start, not for the application inside to be ready. For databases and similar services, add a healthcheck to the dependency and use the long form of depends_on with condition: service_healthy, or use a startup script that waits for the dependency.
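A startup script that waits for a dependency can be as small as a retry loop. The sketch below is one possible shape, assuming a POSIX shell in the image; wait_for is a hypothetical helper name, and the pg_isready call in the comment assumes the PostgreSQL client tools are installed and that the database service is named db as in the Compose file above.

```shell
# wait_for ATTEMPTS DELAY COMMAND...
# Runs COMMAND until it succeeds, sleeping DELAY seconds after each failed try.
# Returns 0 on the first success, 1 if every attempt fails.
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  try=1
  while [ "$try" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    try=$((try + 1))
    sleep "$delay"
  done
  return 1
}

# Possible use in an entrypoint script (assumptions noted above):
#   wait_for 30 1 pg_isready -h db -p 5432 || exit 1
#   exec node server.js
```

The loop gives up after a bounded number of attempts so that a permanently missing dependency surfaces as a failed container instead of a silent hang.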

Data Persistence with Volumes

Containers are ephemeral by default. Any data written inside a container disappears when the container is removed. Volumes solve this by providing persistent storage that exists independent of containers.

Volume Types

Named volumes are the simplest approach:

services:
  db:
    image: postgres:15-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:

Docker creates the volume if it does not exist. The data persists across container restarts and removals.

Bind mounts map a host directory into the container:

services:
  app:
    image: node:20-alpine
    volumes:
      - ./src:/app/src:ro

The :ro suffix makes the mount read-only. Bind mounts are useful for development, letting you edit code on your host and see changes immediately inside the container.

tmpfs mounts store data in memory only, useful for sensitive data you do not want persisted:

services:
  cache:
    image: redis:7-alpine
    tmpfs:
      - /data

Container Networking

Docker provides several networking modes. Understanding them helps you design proper communication between services.

Network Drivers

| Driver | Use Case |
| --- | --- |
| bridge | Default for standalone containers |
| host | Remove network isolation, use host network directly |
| overlay | Connect containers across multiple Docker hosts |
| macvlan | Assign MAC address to containers for legacy applications |
| none | Disable all networking |

Custom Bridge Networks

Creating a custom bridge network enables automatic DNS resolution between containers by name:

version: "3.8"

services:
  web:
    build: .
    networks:
      - frontend

  api:
    build: ./api
    networks:
      - frontend
      - backend

  db:
    image: postgres:15-alpine
    networks:
      - backend
    volumes:
      - db_data:/var/lib/postgresql/data

networks:
  frontend:
  backend:

The web service can reach api by its service name, but cannot reach db directly because they are on separate networks. This network segmentation adds security by limiting what services can communicate.
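The reachability rule is simply "two services can talk if they share at least one network". A minimal shell sketch of that rule, using the network memberships from the Compose file above (can_reach is a hypothetical helper, not a Docker command):

```shell
# can_reach "NETWORKS_A" "NETWORKS_B"
# Each argument is a space-separated list of the networks a service joins.
# Succeeds (exit 0) when the two lists have at least one network in common.
can_reach() {
  for a in $1; do
    for b in $2; do
      if [ "$a" = "$b" ]; then
        return 0
      fi
    done
  done
  return 1
}

web="frontend"
api="frontend backend"
db="backend"

can_reach "$web" "$api" && echo "web -> api: reachable"
can_reach "$api" "$db"  && echo "api -> db: reachable"
can_reach "$web" "$db"  || echo "web -> db: isolated"
```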

Service Discovery

Within a custom bridge network, containers discover each other by the service name defined in compose. If you have a service named postgres, other containers can reach it at postgres:5432.

Docker embeds a DNS resolver that handles this resolution automatically. You do not need to hardcode IP addresses; they can change as containers restart.

Environment Variables and Configuration

Environment variables are the primary way to configure containerized applications at runtime. Docker provides several mechanisms for setting them.

Setting Environment Variables

services:
  web:
    environment:
      - NODE_ENV=production
      - API_KEY=${API_KEY}
      - DEBUG=false

You can also use an .env file with Docker Compose:

# .env file
NODE_ENV=production
API_KEY=your-secret-key

# docker-compose.yml (inside the service definition)
environment:
  - NODE_ENV=${NODE_ENV}
  - API_KEY=${API_KEY}

For secrets in production, use Docker secrets or an external secrets manager. Never commit secrets to version control, even in private repositories.

Building for Production

A production Docker workflow differs from development in several ways.

Image Optimization Checklist

  • Use Alpine-based images to reduce attack surface and pull times
  • Pin exact versions for all images
  • Use multi-stage builds to exclude build artifacts from production
  • Run containers as non-root users
  • Remove unnecessary tools and shells from production images

A hardened production Dockerfile might look like:

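# Build stage (same Alpine base as the runtime stage so native modules stay compatible)
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies before node_modules is copied into the runtime image
RUN npm prune --omit=dev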

# Production stage
FROM node:20-alpine
# Add labels for metadata
LABEL maintainer="dev@example.com"
LABEL version="1.0.0"

WORKDIR /app
# Create non-root user
RUN addgroup -g 1001 -S appgroup && \
    adduser -S appuser -u 1001 -G appgroup

COPY --from=builder --chown=appuser:appgroup /app/dist ./dist
COPY --from=builder --chown=appuser:appgroup /app/node_modules ./node_modules
COPY package*.json ./

USER appuser
EXPOSE 3000

HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1

CMD ["node", "dist/server.js"]

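The HEALTHCHECK instruction tells Docker how to verify the container is healthy. Docker records the result as the container's health status, which orchestrators and load balancers can use to send traffic only to healthy instances. Plain Docker does not restart a container just because it is unhealthy.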
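The --retries=3 option in the Dockerfile above means that three consecutive failed probes mark the container unhealthy, and a single passing probe resets the count. The shell sketch below models only that rule; health_status is a hypothetical helper, not part of Docker, and it ignores the initial "starting" state and the start period.

```shell
# health_status RETRIES RESULT...
# RESULT is "pass" or "fail" for each probe, in order. Prints the final state:
# "unhealthy" once RETRIES consecutive probes have failed, "healthy" otherwise.
health_status() {
  retries=$1
  shift
  streak=0
  state=healthy
  for result in "$@"; do
    if [ "$result" = pass ]; then
      streak=0
      state=healthy
    else
      streak=$((streak + 1))
      if [ "$streak" -ge "$retries" ]; then
        state=unhealthy
      fi
    fi
  done
  echo "$state"
}

health_status 3 fail fail pass fail   # prints healthy (the streak was reset)
health_status 3 pass fail fail fail   # prints unhealthy
```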

Container Health and Monitoring

Containers can fail in several ways: the application can crash, hang, or run out of memory. Docker provides restart policies to handle these scenarios automatically.

Restart Policies

services:
  web:
    image: nginx:latest
    restart: unless-stopped

  worker:
    image: my-worker:latest
    restart: on-failure

| Policy | Behavior |
| --- | --- |
| no | Do not restart (default) |
| on-failure | Restart only if container exits with non-zero code |
| unless-stopped | Restart unless explicitly stopped |
| always | Always restart, including after Docker daemon restart |

For production services, unless-stopped or always is usually appropriate. Keep in mind that automatic restarts can mask a bug that makes the service crash repeatedly, so decide whether you want that behavior for each service.
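The policy table can be read as a small decision function. The sketch below is a simplified model for illustration only: should_restart is a hypothetical helper, and it deliberately ignores daemon restarts and Docker's retry delays.

```shell
# should_restart POLICY EXIT_CODE MANUALLY_STOPPED
# Prints "yes" or "no". MANUALLY_STOPPED is 1 if an operator stopped the
# container (for example with docker stop), else 0.
should_restart() {
  policy=$1
  exit_code=$2
  stopped=$3
  if [ "$stopped" -eq 1 ]; then
    echo no
    return 0
  fi
  case $policy in
    no) echo no ;;
    on-failure)
      if [ "$exit_code" -ne 0 ]; then echo yes; else echo no; fi ;;
    always|unless-stopped) echo yes ;;
    *) echo "unknown policy: $policy" >&2; return 2 ;;
  esac
}

should_restart on-failure 1 0        # prints yes
should_restart on-failure 0 0        # prints no
should_restart unless-stopped 137 1  # prints no
```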

Containerized applications need CI/CD pipelines that build, test, and push images to a registry. Build time often dominates those pipelines, so the pattern here focuses on caching during the build stage.

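BuildKit can build independent stages in parallel and supports cache mounts. You can cache npm downloads between builds:

# BuildKit is the default builder since Docker Engine 23.0.
# On older versions, enable it with: DOCKER_BUILDKIT=1 docker build .

# The cache mount keeps npm's download cache between builds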
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm npm ci
COPY . .
RUN npm run build

FROM node:20-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm npm ci --omit=dev
CMD ["node", "dist/server.js"]

Production deployments typically use orchestration platforms beyond Docker Compose. These tools handle container lifecycle, scaling across nodes, and service distribution.

Docker containers work well in many scenarios but are not always the right tool.

When to Use Docker

Use Docker when:

  • Packaging applications for consistent deployment across environments
  • Microservices architectures where services need isolation
  • CI/CD pipelines requiring reproducible build environments
  • Scaling applications horizontally with container orchestration
  • Running multiple versions of dependencies side by side
  • Development environments needing parity with production

Use Docker Compose when:

  • Local development with multiple coordinated services
  • Running integration tests in isolated containers
  • Small-scale deployments without Kubernetes
  • Demonstrating application stacks to stakeholders

When Not to Use Docker

Consider alternatives when:

  • Applications requiring real-time kernel access or hardware passthrough
  • Desktop applications with complex GUI requirements (native packaging may be better)
  • Very small scripts that have minimal dependencies
  • Applications with extreme performance requirements where container overhead matters
  • Windows-specific workloads (Docker on Windows has more limitations)

Containerization Decision Tree

graph TD
    A[Need to deploy application?] --> B{Multiple environments?}
    B -->|Yes| C[Use containers]
    B -->|No| D{Scalability needed?}
    D -->|Yes| C
    D -->|No| E{Single host only?}
    E -->|Yes| F[Consider Docker Compose]
    E -->|No| C
    C --> G[Use multi-stage builds]
    C --> H[Configure health checks]
    C --> I[Set resource limits]

Production Failure Scenarios

Containers fail in predictable ways. Understanding these helps you design resilient systems.

| Failure | Impact | Mitigation |
| --- | --- | --- |
| Application crash | Container exits with non-zero code | Implement restart policies, health checks, and logging |
| OOM kill | Container terminated, potential data loss | Set memory limits, monitor memory usage |
| Disk full | Container cannot write logs or data | Use log rotation, monitor disk usage, mount tmpfs for temp data |
| Network partition | Container cannot reach dependencies | Implement retry logic, circuit breakers, health checks |
| Image pull failure | Container cannot start, app unavailable | Use private registry, pre-pull images, pin exact versions |
| Port conflicts | Container fails to start | Configure port mapping carefully, use Docker Compose |
| Volume mount failure | Data inaccessible, potential crash | Verify volume paths exist, use named volumes |
| Dependency outage | Application cannot serve traffic | Implement graceful degradation, health checks |

Common Container Exit Codes

| Exit Code | Meaning | Resolution |
| --- | --- | --- |
| 0 | Application exited successfully | Normal termination |
| 1 | Application exited with general error | Check application logs |
| 137 | SIGKILL (OOM or manual kill) | Increase memory limit, check for memory leaks |
| 139 | SIGSEGV (segmentation fault) | Application bug, check core dump |
| 143 | SIGTERM (graceful shutdown) | Normal during restart or stop |
| 255 | Exit status out of range | Application error, check entrypoint |
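The signal-related codes in the table follow one rule: an exit code above 128 conventionally means the process was terminated by signal number (code - 128). A short shell sketch of that arithmetic, with describe_exit as a hypothetical helper name:

```shell
# describe_exit CODE
# Classifies a container exit code using the 128 + signal convention.
describe_exit() {
  code=$1
  if [ "$code" -eq 0 ]; then
    echo "success"
  elif [ "$code" -gt 128 ] && [ "$code" -lt 255 ]; then
    echo "killed by signal $((code - 128))"
  else
    echo "application error $code"
  fi
}

describe_exit 137   # prints: killed by signal 9  (SIGKILL)
describe_exit 143   # prints: killed by signal 15 (SIGTERM)
describe_exit 139   # prints: killed by signal 11 (SIGSEGV)
```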

Common Pitfalls / Anti-Patterns

Image Building Pitfalls

Not using multi-stage builds

# Anti-pattern: Build artifacts in production image
FROM node:20
WORKDIR /app
COPY . .
RUN npm install
RUN npm run build
CMD ["node", "dist/server.js"]

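# Better: Multi-stage build (both stages on Alpine so native modules stay compatible)
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build && npm prune --omit=dev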

FROM node:20-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
CMD ["node", "dist/server.js"]

Not cleaning up in the same layer

# Anti-pattern: Package lists are left behind in the layer
RUN apt-get update && apt-get install -y build-essential

# Better: Clean in same layer
RUN apt-get update && apt-get install -y --no-install-recommends build-essential \
    && rm -rf /var/lib/apt/lists/*

Copying too much

# Anti-pattern: Copies everything including .git, node_modules
COPY . .

# Better: Only copy necessary files (and list .git and node_modules in .dockerignore)
COPY package*.json ./
COPY src ./src

Container Execution Pitfalls

Running as root

# Anti-pattern: Running as root user
services:
  web:
    image: myapp:1.0.0
    user: root

# Better: Run as non-root
services:
  web:
    image: myapp:1.0.0
    user: "10001"

Not setting resource limits

# Anti-pattern: No limits means unbounded resource usage
services:
  web:
    image: myapp:1.0.0

# Better: Set appropriate limits
services:
  web:
    image: myapp:1.0.0
    deploy:
      resources:
        limits:
          memory: 512M
          cpus: '0.5'
        reservations:
          memory: 256M
          cpus: '0.25'

Missing health checks

# Anti-pattern: No health check, Docker does not know if app is healthy
services:
  web:
    image: myapp:1.0.0

# Better: Define health check
services:
  web:
    image: myapp:1.0.0
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:3000/health"]
      interval: 30s
      timeout: 3s
      retries: 3
      start_period: 10s

Networking Pitfalls

Exposing unnecessary ports

# Anti-pattern: Exposing debug ports
services:
  web:
    image: myapp:1.0.0
    ports:
      - "3000:3000"
      - "9229:9229"  # Debug port exposed

# Better: Only expose needed ports
services:
  web:
    image: myapp:1.0.0
    ports:
      - "3000:3000"

Not using custom networks

# Anti-pattern: Everything on one default network, database published to the host
services:
  web:
    image: myapp:1.0.0
  db:
    image: postgres:15
    ports:
      - "5432:5432"  # Unnecessary exposure

# Better: Custom network with proper isolation
services:
  web:
    image: myapp:1.0.0
    networks:
      - backend
  db:
    image: postgres:15
    networks:
      - backend

networks:
  backend:

Observability Checklist

Containerized applications need comprehensive monitoring to catch issues early.

Metrics to Collect

graph LR
    A[Container Metrics] --> B[CPU Usage]
    A --> C[Memory Usage]
    A --> D[Network I/O]
    A --> E[Block I/O]
    F[Application Metrics] --> G[Request Rate]
    F --> H[Error Rate]
    F --> I[Latency]
    F --> J[Active Connections]

Container-level metrics:

  • CPU usage percentage vs limit
  • Memory usage percentage vs limit
  • Network bytes sent and received
  • Block I/O read and write bytes
  • Container restart count

Application-level metrics:

  • Request throughput (requests per second)
  • Error rate (4xx, 5xx responses)
  • Response latency (p50, p95, p99)
  • Active connections (database, Redis, HTTP)
  • Queue depth for async processing

Logging Best Practices

graph TD
    A[Container STDOUT STDERR] --> B[Log Driver]
    B --> C[Centralized Logging]
    C --> D[ELK Stack]
    C --> E[Loki]
    C --> F[CloudWatch]
    G[Structured Logs] --> C
    G --> H[JSON Format]
    G --> I[Correlation ID]

  • Use structured logging: JSON format enables easier parsing and querying
  • Include correlation IDs: Trace requests across services
  • Log to STDOUT/STDERR: Let Docker handle log routing, not files
  • Implement log rotation: Prevent disk exhaustion
  • Ship logs centrally: Aggregate logs from all containers

Configure log rotation in daemon.json:

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
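With the settings above, each container keeps at most three log files of roughly 10 MB each, so the json-file logs are bounded at about 30 MB per container. The arithmetic for a whole host can be sketched as follows; max_log_mb is a hypothetical helper and the container count is an assumed input.

```shell
# max_log_mb MAX_SIZE_MB MAX_FILE CONTAINERS
# Approximate upper bound on disk used by json-file logs, in megabytes.
max_log_mb() {
  echo $(( $1 * $2 * $3 ))
}

max_log_mb 10 3 1    # prints 30  (one container)
max_log_mb 10 3 20   # prints 600 (twenty containers on one host)
```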

Alerts to Configure

Critical (immediate action):

  • Container restart count > 5 in 10 minutes
  • Memory usage > 90% of limit for > 5 minutes
  • Container exit (unexpected termination)
  • Health check failure for > 2 minutes

Warning (investigate soon):

  • CPU usage > 80% of limit for > 10 minutes
  • Disk usage > 80% on volume
  • Restart count > 2 in 30 minutes
  • Health check degradation
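The percentage thresholds above reduce to comparing usage against a limit. A minimal sketch in shell, assuming integer inputs in the same unit; usage_level is a hypothetical helper, the thresholds are passed in as parameters, and the "for more than N minutes" duration conditions are left to the monitoring system.

```shell
# usage_level USED LIMIT WARN_PCT CRIT_PCT
# Prints "critical", "warning", or "ok" based on USED as a percentage of LIMIT.
usage_level() {
  pct=$(( $1 * 100 / $2 ))
  if [ "$pct" -gt "$4" ]; then
    echo critical
  elif [ "$pct" -gt "$3" ]; then
    echo warning
  else
    echo ok
  fi
}

usage_level 480 512 80 90   # 93% of limit, prints critical
usage_level 430 512 80 90   # 83% of limit, prints warning
usage_level 256 512 80 90   # 50% of limit, prints ok
```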

Security Checklist

Container security requires defense in depth across multiple layers.

Image Security

Image selection:

  • Use official images from trusted registries when possible
  • Prefer Alpine or distroless images for smaller attack surface
  • Never use latest tag in production (pin exact versions)
  • Scan images for vulnerabilities before deployment

# Scan image locally
trivy image myapp:1.0.0

# In CI/CD pipeline
trivy image --exit-code 1 --severity HIGH,CRITICAL myapp:1.0.0

Dockerfile hardening:

# Use specific version, not latest
FROM node:20-alpine3.18

# Create non-root user
RUN addgroup -g 1001 -S appgroup && \
    adduser -S appuser -u 1001 -G appgroup

# Copy files with correct ownership
COPY --chown=appuser:appgroup . .

# Switch to non-root user
USER appuser

# Document the listening port (EXPOSE does not publish it)
EXPOSE 3000

# Use exec form for CMD (proper signal handling)
CMD ["node", "server.js"]

Runtime Security

graph LR
    A[Runtime Security] --> B[Resource Limits]
    A --> C[Capability Drop]
    A --> D[No Privileged]
    A --> E[Read-only FS]
    B --> F[Memory Limit]
    B --> G[CPU Limit]
    C --> H[DROP ALL]
    C --> I[Add specific]

Security options for docker run:

# Run with security hardening
docker run \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid,size=64m \
  --memory=512m \
  --memory-swap=512m \
  --cpus=1.0 \
  --user=10001 \
  --cap-drop=ALL \
  --security-opt=no-new-privileges \
  myapp:1.0.0

Security options for docker-compose.yml:

services:
  web:
    image: myapp:1.0.0
    read_only: true
    tmpfs:
      - /tmp:rw,noexec,nosuid,size=64m
    mem_limit: 512m
    memswap_limit: 512m
    cpus: 1.0
    user: "10001"
    cap_drop:
      - ALL
    security_opt:
      - no-new-privileges:true

Secret Management

Never:

  • Store secrets in environment variables
  • Commit secrets to Dockerfiles or docker-compose files
  • Use secrets in build arguments (they get baked into image layers)
  • Use Kubernetes ConfigMaps for sensitive data (they are not meant for secrets)

Always:

  • Use Docker secrets for sensitive data in Compose
  • Use external secrets managers (Vault, AWS Secrets Manager)
  • Mount secrets as files at runtime (for example under /run/secrets)
  • Rotate secrets regularly

# docker-compose.yml with secrets
services:
  web:
    image: myapp:1.0.0
    secrets:
      - db_password
    environment:
      - DATABASE_PASSWORD_FILE=/run/secrets/db_password

secrets:
  db_password:
    file: ./secrets/db_password.txt
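The application still has to read the mounted file. One common convention is a NAME_FILE variable that points at the secret, as DATABASE_PASSWORD_FILE does above. The sketch below shows how an entrypoint script might resolve it without exporting the secret itself; read_secret is a hypothetical helper written for this illustration.

```shell
# read_secret NAME
# Looks up the variable NAME_FILE and prints the contents of the file it
# points to. Fails if the variable is unset or the file is not readable.
read_secret() {
  eval "file=\${${1}_FILE:-}"
  if [ -z "$file" ] || [ ! -r "$file" ]; then
    echo "no readable secret file for $1" >&2
    return 1
  fi
  cat "$file"
}

# Possible use in an entrypoint script:
#   db_password=$(read_secret DATABASE_PASSWORD) || exit 1
```

Keeping the value in a shell variable that is not exported means it does not appear in the environment of child processes.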

Interview Questions

1. Explain the difference between a container and a virtual machine. What are the trade-offs?

Expected answer points:

  • Containers share the host kernel; VMs each include a full OS
  • Containers start in seconds; VMs take minutes to boot
  • VMs provide stronger isolation because each guest runs its own kernel; containers share the host kernel, so a kernel vulnerability can cross the container boundary
  • Performance: VMs have overhead from virtualizing hardware; containers run near bare-metal performance
  • Resource usage: VMs require dedicated RAM and CPU allocation; containers share resources dynamically
  • Portability: Containers are highly portable; VMs require hypervisor compatibility

2. How does Docker layer caching work, and why is it important for build performance?

Expected answer points:

  • Each Dockerfile instruction creates a new layer
  • Docker caches layers when instructions have not changed
  • When a layer changes, all subsequent layers must rebuild
  • Best practice: Order instructions from least to most frequently changing
  • COPY package*.json before COPY source code to leverage npm install caching
  • Use multi-stage builds to keep cache efficient and production images small

3. What is the difference between COPY and ADD instructions in a Dockerfile?

Expected answer points:

  • COPY is the preferred instruction for basic file copying
  • ADD can also extract local tar archives and fetch files from URLs
  • ADD can pull files from remote URLs; any credentials embedded in the URL end up in the image history, and downloaded archives are not extracted automatically
  • ADD auto-extraction can cause unexpected behavior with large archives
  • Recommendation: Use COPY unless you specifically need ADD's tar extraction feature

4. How do multi-stage builds improve container security and reduce image size?

Expected answer points:

  • Multi-stage builds use multiple FROM statements to separate build and runtime environments
  • The final production image only contains runtime artifacts, not build tools
  • Build dependencies like compilers and test frameworks never enter the production image
  • This reduces attack surface by excluding potential vulnerabilities from build tools
  • Smaller images mean faster pulls, less storage, and fewer packages to patch
  • The build stage can use larger base images with more tools; production stage uses minimal images

5. Describe the different Docker networking drivers and their use cases.

Expected answer points:

  • bridge: Default driver for standalone containers; the default bridge (docker0) has no name-based DNS, while user-defined bridge networks resolve containers by name
  • host: Removes network isolation; container uses host network directly for lower latency
  • overlay: Connects containers across multiple Docker hosts; used in Docker Swarm clusters
  • macvlan: Assigns a MAC address to each container; useful for legacy applications expecting physical network cards
  • none: Disables all networking; container is completely isolated
  • Network choice affects container-to-container communication, performance, and security isolation

6. What are the differences between named volumes, bind mounts, and tmpfs mounts?

Expected answer points:

  • Named volumes: Docker-managed persistent storage; survive container removal; best for database storage
  • Bind mounts: Map host directory into container; useful for development with live code reloading
  • tmpfs mounts: Store data in memory only; fastest option; data lost when container stops
  • A new, empty named volume is pre-populated with the image's content at the mount path; bind mounts reflect the host filesystem exactly
  • tmpfs is ideal for caches, session data, or any data that does not need to persist

7. How does Docker Compose depends_on work, and what are its limitations?

Expected answer points:

  • depends_on defines startup order: services listed wait for dependencies to start first
  • It does NOT wait for the application inside the container to be ready
  • A database container may be running but the database not yet accepting connections
  • For production readiness, implement health checks or startup scripts that wait for dependencies
  • Use wait-for scripts (dockerize, wait-for-it) or a healthcheck combined with depends_on and condition: service_healthy
  • Container orchestration platforms support proper dependency health tracking

8. What are restart policies, and which should you use in production?

Expected answer points:

  • no: Never restart (default behavior)
  • on-failure: Restart only when container exits with non-zero code
  • unless-stopped: Restart unless explicitly stopped; survives Docker daemon restart
  • always: Always restart; includes after Docker daemon restart
  • Production recommendation: usually unless-stopped or always for critical services
  • Consider whether you want restart after crashes that might mask underlying bugs
  • Restart policies react to the process exiting, not to a failing HEALTHCHECK; pair them with health checks so an orchestrator or load balancer can act on unhealthy containers

9. What are the key security best practices for production Docker containers?

Expected answer points:

  • Run containers as non-root user (USER instruction in Dockerfile)
  • Use minimal base images (Alpine, distroless) to reduce attack surface
  • Pin exact image versions; never use latest tag in production
  • Scan images for vulnerabilities with tools like Trivy before deployment
  • Use read-only filesystem (--read-only flag) and tmpfs for /tmp
  • Drop all capabilities (--cap-drop=ALL) and disable privilege escalation (--security-opt=no-new-privileges)
  • Set resource limits to prevent resource exhaustion attacks
  • Use Docker secrets or external secrets managers for sensitive data; never environment variables
  • Minimize exposed ports; only expose what is strictly necessary

10. How do you debug a container that exits immediately after starting?

Expected answer points:

  • Check exit code: docker ps -a shows exit code (0, 1, 137, 139, 143, 255)
  • 137 = SIGKILL usually from OOM kill; check memory limits and application memory usage
  • 139 = SIGSEGV (segmentation fault); application bug or bad memory access
  • 143 = SIGTERM (graceful shutdown); normal during stop or restart
  • 255 = exit status out of range; often entrypoint misconfiguration
  • View logs: docker logs container_name for application output
  • Run interactively: docker run -it --entrypoint sh image_name to bypass the entrypoint and debug it from a shell
  • Check application configuration: environment variables, dependencies, file permissions
  • Verify the CMD or ENTRYPOINT syntax; exec form vs shell form behaves differently

11. How does Docker handle container logging, and what are the best practices for log management?

Expected answer points:

  • Docker captures STDOUT and STDERR from container processes and writes them to json-file log driver by default
  • Configure log driver in daemon.json or per-container with --log-driver flag
  • Set log rotation with --log-opt max-size and --log-opt max-file to prevent disk exhaustion
  • For production, use centralized logging drivers (fluentd, gelf, awslogs) or ship logs to ELK stack, Loki, or CloudWatch
  • Application should write structured JSON logs to STDOUT for easier parsing and querying
  • Include correlation IDs in logs to trace requests across services
  • Never log sensitive data (secrets, tokens, passwords) even to stdout
  • Use log levels appropriately: ERROR for failures, WARN for degraded state, INFO for normal operations

12. What is the difference between Docker's entrypoint and cmd instructions, and when would you use each?

Expected answer points:

  • ENTRYPOINT defines the main executable that always runs when the container starts
  • CMD provides default arguments that can be overridden at runtime
  • Shell form vs exec form: shell form wraps the command in /bin/sh -c, so the shell runs as PID 1 and signals such as SIGTERM may not reach the application
  • Exec form (JSON array) is preferred as it runs directly without shell wrapper, enabling proper signal handling
  • Use CMD for default arguments: CMD ["--config", "/default.conf"]
  • Use ENTRYPOINT when the container should always run as a specific executable: ENTRYPOINT ["python", "app.py"]
  • Combine both when you need a fixed executable with default parameters
  • Entrypoint can be overridden with --entrypoint flag for debugging or special use cases
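How the two instructions combine can be shown with a small model. This is a sketch of the exec-form rule only, with final_command as a hypothetical helper: arguments given to docker run replace CMD, and the result is appended to ENTRYPOINT.

```shell
# final_command ENTRYPOINT CMD RUN_ARGS
# Each argument is one string; pass "" when that part is not set.
# Prints the command line the container would start with.
final_command() {
  entrypoint=$1
  cmd=$2
  run_args=$3
  args=${run_args:-$cmd}   # run-time arguments win over the CMD default
  printf '%s\n' "${entrypoint:+$entrypoint }$args"
}

final_command "python app.py" "--config /default.conf" ""
# prints: python app.py --config /default.conf
final_command "python app.py" "--config /default.conf" "--config /etc/app.conf"
# prints: python app.py --config /etc/app.conf
final_command "" "node server.js" ""
# prints: node server.js
```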

13. How do you implement health checks for containers, and why are they important for orchestration?

Expected answer points:

  • HEALTHCHECK instruction tells Docker how to verify container health by running a command inside the container
  • Docker marks container unhealthy after consecutive failures matching --retries threshold
  • Health checks enable load balancers to only send traffic to healthy containers
  • Orchestrators like Kubernetes use liveness probes to restart unhealthy containers and readiness probes to remove from load balancing
  • Health check should test the actual application, not just the process: wget to health endpoint, curl to API, or custom check script
  • Set appropriate intervals and timeouts: too aggressive wastes resources, too lenient delays failure detection
  • For databases, check actual connectivity, not just port open; the database may be starting up while port is listening

14. What are the trade-offs between using COPY and volume mounts for sharing code with containers?

Expected answer points:

  • COPY bakes code into image layer; changes require rebuild and push to registry
  • Volume mounts (bind mounts) let you edit code on host and see changes immediately in container
  • COPY is for production: image contains exact code that was tested, reproducible builds
  • Bind mounts are for development: fast iteration, no rebuild needed, code may differ from production
  • Named volumes are for persistent data: database storage, state that survives container restart
  • tmpfs mounts are for sensitive data you never want persisted: tokens, session data, temporary caches
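  • For development, use bind mounts for code; for production, use COPY; reserve tmpfs for data that must never reach disk
  • Performance: bind mounts have minimal overhead on Linux but can be noticeably slower on Docker Desktop for macOS and Windows; COPY adds image layers but has no runtime cost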
15. How does Docker's layer caching mechanism work, and how can you optimize it for faster builds?

Expected answer points:

  • Each Dockerfile instruction creates a new layer that is cached if the instruction and its inputs have not changed
  • When a layer changes, Docker invalidates the cache for that layer and all subsequent layers
  • Order instructions from least to most frequently changing: base image, dependencies, source code last
  • Split RUN commands to leverage caching: npm install in separate layer before COPY source
  • Use COPY --chown instead of a separate RUN chown, which would duplicate the copied files in an extra layer
  • BuildKit enables parallel layer building and better cache management
  • Use BuildKit cache mounts (RUN --mount=type=cache) for package managers (npm, pip, maven) to persist their caches across builds
  • Avoid COPY . when you only need specific files; COPY package.json first, then source
16. What are the security implications of running containers in production, and how do you mitigate them?

Expected answer points:

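  • Containers share the host kernel, so a kernel vulnerability can expose every container on the host; the isolation boundary is weaker than a VM's
  • Run as non-root user: the USER instruction in the Dockerfile limits what a compromised process can do inside the container and on the host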
  • Use minimal base images (Alpine, distroless) to reduce attack surface and minimize CVEs
  • Pin exact versions: latest tag can introduce breaking changes or vulnerabilities silently
  • Scan images with Trivy or Snyk before deployment; integrate into CI/CD pipeline
  • Drop all capabilities: --cap-drop=ALL removes every Linux capability; add back only the ones the application needs with --cap-add
  • Use read-only filesystem: --read-only prevents writing to unexpected locations
  • Prevent privilege escalation: --security-opt=no-new-privileges stops container from gaining more privileges
  • Never store secrets in environment variables, which are visible through docker inspect and inherited by child processes; use secrets managers or Docker secrets
  • Network isolation: Use custom bridge networks to limit inter-container communication
17. How does Docker Compose manage service dependencies and startup order?

Expected answer points:

  • depends_on defines startup order: a service is started only after the services it depends on have been started
  • By default, depends_on waits only for the dependency's container to start, not for the application inside it to be ready
  • A database container may be running but the database not yet accepting connections
  • For production readiness, implement health checks or startup scripts that poll for dependency readiness
  • Use tools like wait-for-it, dockerize, or custom entrypoint scripts to wait for dependencies
  • Kubernetes handles this better with init containers and readiness probes
  • Use condition: service_healthy in Compose to wait for health check to pass
  • Restart policies handle crashes but not slow-starting applications
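
The "custom entrypoint script" approach mentioned above can be sketched as a small polling loop. This is a minimal illustration, not a replacement for wait-for-it or dockerize: the function name wait_for, its argument order, and the pg_isready example in the usage comment are assumptions chosen for this sketch.

```shell
#!/bin/sh
# wait-for.sh: hypothetical entrypoint helper that polls a dependency until it is ready.

wait_for() {
  # usage: wait_for <max_attempts> <delay_seconds> <probe command...>
  # Runs the probe at least once; returns 0 as soon as it succeeds.
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      break
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "dependency not ready after $max_attempts attempts: $*" >&2
  return 1
}

# Assumed usage in an entrypoint (host "db" and the start command are placeholders):
#   wait_for 30 2 pg_isready -h db -p 5432 && exec node server.js
```

Polling with a real client command checks that the dependency accepts connections, which is the gap that a plain depends_on leaves open.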
18. What is the difference between Docker Swarm and Kubernetes for container orchestration?

Expected answer points:

  • Docker Swarm is built into Docker Engine; Kubernetes is a separate orchestration system
  • Swarm has gentler learning curve; Kubernetes has steeper but more powerful abstractions
  • Swarm uses Services and Stacks; Kubernetes uses Deployments, Services, Ingress, ConfigMaps, Secrets
  • Kubernetes has richer ecosystem: Helm for package management, operators for custom controllers
  • Swarm suits simple cases and small teams; Kubernetes scales better for large, complex deployments
  • Both support rolling updates, service discovery, load balancing, scaling
  • Kubernetes has more sophisticated scheduling: taints, tolerations, node affinity, pod priority
  • Swarm deploys Compose files directly with docker stack deploy; Kubernetes uses its own YAML manifests
  • Both run standard OCI/Docker images; Kubernetes runs them through CRI runtimes such as containerd or CRI-O rather than Docker Engine
19. How do you handle state management in Docker containers, especially for databases and stateful services?

Expected answer points:

  • Containers are ephemeral by default; any data written inside is lost when container is removed
  • Use named volumes for persistent data that must survive container restarts and recreation
  • Database data should always go in named volumes, never in container filesystem
  • Bind mounts useful for development (live code reload) but risky for production
  • tmpfs mounts store sensitive data in memory only; data never persists to disk
  • For distributed state, use external databases, Redis, or other stateful services outside containers
  • Volume drivers can provide clustering, replication, encryption for production data needs
  • Back up volumes regularly: docker run --rm -v volume_name:/data:ro alpine tar czf - -C /data . > backup.tar.gz
  • Never store application state in container layer; treat containers as stateless processing units
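
A backup is only useful if it can be restored, so here is a hedged sketch of both directions using a throwaway alpine container. The function names and the DOCKER override variable are assumptions made for this example (the override only exists so the commands can be dry-run); the volume and archive names are whatever you pass in.

```shell
#!/bin/sh
# volume-backup.sh: hypothetical helpers for backing up and restoring a named volume.

# Overridable so the functions can be dry-run, e.g. DOCKER=echo (an assumption of this sketch).
DOCKER="${DOCKER:-docker}"

backup_volume() {
  # usage: backup_volume <volume_name> <archive_path>
  # Mount the volume read-only and stream a gzip tarball of its contents to the host.
  $DOCKER run --rm -v "$1":/data:ro alpine tar czf - -C /data . > "$2"
}

restore_volume() {
  # usage: restore_volume <volume_name> <archive_path>
  # -i keeps stdin open so tar inside the container can read the archive from the host.
  $DOCKER run --rm -i -v "$1":/data alpine tar xzf - -C /data < "$2"
}
```

Stop the service that writes to the volume (or use the database's own dump tool) before archiving, since copying files from a live database can produce an inconsistent backup.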
20. What strategies would you use to optimize Docker image size and build time for production?

Expected answer points:

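  • Use Alpine-based images: the alpine base is roughly 5 MB, versus tens of megabytes for ubuntu and hundreds of megabytes for full language images such as node; smaller attack surface, faster pulls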
  • Multi-stage builds exclude build tools from production image
  • Pin exact versions: never use latest; enables cache reuse, prevents unexpected changes
  • Order Dockerfile instructions from least to most frequently changing for better caching
  • Combine RUN commands so cleanup happens in the same layer as the install: RUN apt-get update && apt-get install -y --no-install-recommends <packages> && rm -rf /var/lib/apt/lists/*
  • Use .dockerignore to exclude unnecessary files (.git, node_modules, docs) from build context
  • BuildKit enables parallel builds and mount caches for package managers
  • Only COPY necessary files: COPY package*.json first, then source; not COPY . .
  • Consider distroless or scratch images for a minimal footprint: distroless images include only runtime dependencies, and scratch starts completely empty
  • Layer ordering matters: dependencies change less often than source code, so they go first

Conclusion

Key Takeaways

  • Docker containers package applications with their dependencies for consistent deployment across environments
  • Images are built in layers; multi-stage builds keep production images lean
  • Named volumes persist data independent of container lifecycle
  • Docker Compose manages multi-container applications with automatic service discovery
  • Health checks enable proper monitoring and orchestrator integration
  • Restart policies handle common failure scenarios automatically
  • Security requires defense in depth: image scanning, non-root users, resource limits, and secret management

Production Readiness Checklist

# Image building
docker build -t myapp:1.0.0 --platform linux/amd64 .
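# Scan for known CVEs (Docker Scout replaces the retired docker scan command)
docker scout cves myapp:1.0.0
# Confirm the image defines a HEALTHCHECK
docker inspect --format '{{json .Config.Healthcheck}}' myapp:1.0.0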

# Security hardening
docker run -d \
  --name myapp \
  --read-only \
  --user=10001 \
  --cap-drop=ALL \
  --memory=512m \
  --security-opt=no-new-privileges \
  myapp:1.0.0

# Compose validation
docker-compose config --quiet
docker-compose up -d
docker-compose ps
docker-compose logs -f

# Volume management
docker volume create myapp_data
docker volume inspect myapp_data
docker volume ls

Pre-Deployment Verification

# Check for vulnerabilities
trivy image myapp:1.0.0

# Verify resource limits (memory in bytes, CPU in nano-CPUs; 0 means unlimited)
docker inspect --format '{{.HostConfig.Memory}} {{.HostConfig.NanoCpus}}' myapp

# Test health check
docker exec myapp wget -qO- http://localhost:3000/health

# Check logs
docker logs myapp --tail 100 --timestamps

# Snapshot current resource usage (omit --no-stream for a live view)
docker stats myapp --no-stream

Docker simplifies application deployment by providing consistent packaging across environments. Images, containers, volumes, and networking form the foundation for any containerized architecture.

Start simple: containerize a single application, run it locally with Docker Compose, and gradually add complexity as you need it. Most teams outgrow manual Docker commands quickly and move to orchestration tools like Kubernetes, but the fundamentals covered here apply throughout.

If you want to go deeper into container orchestration, the Advanced Kubernetes guide covers custom controllers, operators, and production-grade cluster management. Helm Charts provides a templating system that makes deploying complex applications manageable.
