Docker, Demystified: From Hello World to How It Really Works
A plain-spoken guide to Docker that goes deeper than quick starts, demystifying containers from the ground up.
Docker is everywhere, but most explanations make it sound either magical or trivial. You’ll see endless “docker run hello-world” tutorials, and yet people still ask: Is Docker just a lightweight VM? Where does my data actually go? Why does networking feel so confusing?
The truth is simpler—and stranger. Docker is just Linux features wrapped in a friendly tool. Namespaces, cgroups, overlay filesystems: that’s the real engine. Once you understand those, Docker stops being mysterious and starts being powerful.
This guide is the beginner’s A–Z of Docker: not just the commands to type, but the concepts that make containers work. If you’ve ever wanted to move past the copy-paste tutorials and finally get it, this is for you.
Why Docker Exists
Have you ever installed Apache, MySQL, and PHP directly on your PC just to get a project running? Do you remember the MAMP / WAMP / XAMPP days? Or spinning up a full Linux VM in VirtualBox, or scripting it all with Vagrant?
All of those approaches were about the same thing: giving your apps a development environment. They worked, but each had trade-offs. Native installs polluted your machine. GUI bundles were limited. VMs were heavy. Nothing felt universal.
Fast-forward to today: nearly every project you’ll find ships with a Dockerfile. That didn’t happen by accident. Docker solved a problem that none of the older tools fully cracked.
The promise is simple: lightweight environments your apps can run in, anywhere. Linux, macOS, Windows. Your laptop, a VPS, or bare metal. Install Docker, and suddenly each app lives in its own clean, purpose-built environment.
Could you do the same with virtual machines? Sure. But Docker is far lighter. You can run ten containers side-by-side without your fan screaming — something that ten full VMs would choke on. That efficiency is what made containers the default.
Docker Is Not Magic — And It’s Not a VM
Let’s start with what Docker is not.
To run a virtual machine, your CPU needs hardware virtualization support. You can check your BIOS or just try spinning up VirtualBox or VMware — no CPU support, no VM.
Docker is different:
- On Linux: no virtualization needed. Containers use built-in kernel features.
- On Windows and macOS: Docker Desktop runs a lightweight Linux VM behind the scenes, because containers depend on the Linux kernel.
📝 Note
Almost all Docker workloads in the real world run on Linux servers, where that efficiency really shines. But even if you’re on Windows or macOS, don’t worry — your machine handles the background VM, and the containers inside it run with the same efficiency as they would on Linux.
So what is Docker really using? Linux primitives. Try this:
man unshare
You’ll see: “unshare — run program in new namespaces.”
First, check the number of running processes:
ps aux | wc -l
Now create a new PID namespace and check again:
sudo unshare --pid --fork --mount-proc bash
ps aux
Where did all your processes go?
Inside that namespace you only see two: bash and ps.
Congratulations — you’ve just built your first “container.”
I put container in quotes because a full container also layers on cgroups, OverlayFS, and security restrictions. But this simple demo shows the foundation: a container starts as just a namespace.
A container is not a “mini VM.” It’s simply a process, wrapped in Linux isolation.
Why “unshare”?
By default, processes share their namespaces with their parent, all the way up the process tree. The unshare syscall (added in Linux 2.6.16, building on namespace support that dates back to 2002) lets a process stop sharing and create a new namespace instead. The tool inherits the name directly from that syscall. Docker builds on these same Linux fundamentals.
Installing
Before continuing, if you want to follow along, you’ll need Docker installed.
On Arch Linux–based systems, it’s straightforward:
sudo pacman -S docker
Then start the service. You can either:
- Enable it permanently across reboots:
sudo systemctl enable --now docker
- Or start it only for this session:
sudo systemctl start docker
A note on installation
- This site is CLI-first. Commands are portable, scriptable, and closer to how Docker really works.
- Examples use Arch Linux. For other distros, macOS, or Windows, see the official Docker installation guides.
- Prefer a GUI? Docker Desktop is available for macOS and Windows. It runs containers inside a VM, but the result is the same: containers work.
- A full cross-platform install guide would be longer than this article and would quickly go out of date — so we won’t repeat that here.
With Docker installed and running, let’s move on to what matters: using it.
Core Concepts
We could keep building up from unshare and other kernel primitives until we’ve hand-crafted Neo-Docker. But for now, let’s switch perspective and learn Docker the way you’ll actually use it.
Container Images
They are not single disk files like a VM image or system clone.
A Docker image is a stack of layers plus a manifest. When you pull an image, Docker downloads each layer that makes up that image — you can watch them appear in your console.
Think of images as packaged apps or snapshots. You pull an image, and then you run a container from that image.
Examples of Using Docker Images
For example, you could pull the Nextcloud image and run it — and immediately have a working Nextcloud server.
docker pull nextcloud
docker run -d --name nextcloud -p 8080:80 nextcloud:latest
docker ps
docker ps lists your running containers.
Now open your browser to http://localhost:8080 — you’re hosting Nextcloud.
Want another? Let’s try WordPress:
docker pull wordpress
docker run -d --name wordpress -p 8081:80 wordpress:latest
docker ps
Visit http://localhost:8081 — now you’re hosting WordPress and Nextcloud at the same time.
👉 docker run commands can have a lot of options, so I use the optional :latest tag on image names to make it clear “this is an image”. You can omit :latest.
But where are those images coming from?
Cleaning Up Docker Containers & Images
Before answering the “where do images come from?” question, here’s a quick detour that will be useful going forward: cleanup.
When you’re done experimenting, stop and remove the containers:
docker ps -a
docker stop nextcloud wordpress
docker rm nextcloud wordpress
docker ps -a
👉 docker rm only works on stopped containers. Use docker ps -a to see all containers, including those already stopped.
When containers pile up during testing, you can remove them all at once.
docker ps -a -f "status=exited"
# → lists containers with status=exited
Note: in Docker, “exited” simply means the container has stopped.
docker container prune
# → asks for confirmation, then removes all exited (stopped) containers
👉 Always check with docker ps -a first so you don’t delete something important.
And if you also want to free disk space, remove the images:
docker images
docker rmi nextcloud wordpress
docker images
👉 docker rmi will fail if the image is still in use by a container.
Since I’m the distrustful type, I always check before and after with docker ps -a and docker images to confirm removal. Good habit.
And just like with containers, you can prune images that aren’t in use by any container.
docker image prune # → removes dangling images only
docker image prune -a # → removes all unused images
Dangling images?
A dangling image is an image without a tag. This usually happens when you rebuild using the same tag (e.g. :latest): the new build takes the tag, and the old one becomes untagged.
Dangling images are a subset of unused images — all dangling images are unused, but not all unused images are dangling.
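Curious whether you have any dangling images right now? You can filter for them (the list may well be empty at this point):
docker images -f "dangling=true"
# → lists untagged <none>:<none> images, if any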
Docker Hub & Other Container Registries
To answer the mystery of where Docker images come from…
Docker Hub is Docker’s default registry — think of it as Docker’s “app store.” When you type docker pull nginx, you’re actually pulling from docker.io/library/nginx. The docker.io part is implied, so you don’t see it unless you specify another registry.
Docker Hub’s website is https://hub.docker.com, but the backend address is docker.io.
There are other public registries too:
- Red Hat’s: quay.io
- GitHub’s: ghcr.io
For example, pulling directly from Quay:
docker pull quay.io/keycloak/keycloak:latest
docker run -d quay.io/keycloak/keycloak:latest start-dev
Notice how the full registry URL is part of the image name. That’s why it shows up when you list your images:
docker image ls
You can also host your own registry — popular choices include Harbor and GitLab Container Registry.
Popular Registries at a Glance
| Registry | Default? | Public/Private | Self‑host? | Notes |
|---|---|---|---|---|
| Docker Hub (docker.io) | ✅ | Public & private repos | ❌ | The default when you docker pull with no registry prefix. |
| Quay.io | ❌ | Public & private repos | ❌ | Owned by Red Hat, good security scanning features. |
| GitHub (ghcr.io) | ❌ | Public & private repos | ❌ | Integrated with GitHub repos; uses GitHub authentication. |
| GitLab | ❌ | Public & private repos | ✅ | Comes built into GitLab; easy for CI/CD pipelines. |
| Harbor | ❌ | Public & private repos | ✅ | CNCF project; enterprise features like replication and scanning. |
Rule of thumb:
- If you don’t specify, Docker pulls from Docker Hub.
- If you do specify, the registry URL becomes part of the image name.
How to Refer to Images
You can refer to an image in two main ways:
- By its name (the string shown under the REPOSITORY column).
- By its ID (the SHA256 hash shown under the IMAGE ID column).
An image name (also called a reference) can have up to three parts:
[REGISTRY_HOST[:PORT]/][NAMESPACE/]REPOSITORY[:TAG|@DIGEST]
- REGISTRY_HOST → defaults to docker.io if omitted.
- NAMESPACE → defaults to library/ for official Docker Hub images.
- TAG → defaults to :latest if omitted.
- @DIGEST → pins the image to an exact immutable hash.
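For example, once the defaults are filled in, these two pulls refer to the same image, and you can see local digests with the --digests flag:
docker pull nginx
docker pull docker.io/library/nginx:latest
# → both resolve to docker.io/library/nginx:latest
docker images --digests
# → shows the immutable sha256 digest next to each local image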
When you pull an image, it’s saved locally. Any new container you run from that image reuses the local copy unless Docker needs to pull an update.
docker pull nginx
docker run -d -p 8082:80 nginx:latest
When running docker pull, Docker will:
- If image not found locally: download from the registry.
- If image already exists locally: check remote digest → pull update if needed, otherwise reuse local copy.
Extras:
- docker run will automatically pull the image if not found locally.
- You don’t need to docker pull first.
- Use docker pull when you want to fetch updates explicitly.
Building Your Own Images
Yes, you can DIY.
You’ve probably noticed that images are named after the app they run — but those apps run on Linux, and Linux itself comes in base images like Ubuntu, Debian, or Alpine. In Docker, images build on top of each other in a chain that ultimately ends with scratch (an empty image).
Think of it like this:
scratch → Debian → Node.js → [Your App Here]
📝 Note
Containers always share the host’s kernel. That means you can run Ubuntu or Alpine userlands side by side, but you can’t run a Windows container on a Linux host (or vice versa). Kernel mismatches can also cause compatibility issues — for example with very old or bleeding-edge kernels.
When you build your own image, you don’t usually start at the bottom. Instead, you pick a base that already provides what you need — like the progression from scratch example above.
For example, if you’re writing a Node.js app, your base image might be node:latest. By default it builds on Debian, but the Node.js team also publishes variants like node:current-alpine, which pairs the latest Node.js with Alpine Linux.
Alpine is a lightweight Linux distribution with a tiny footprint, originally built for embedded systems. It became a favorite for Docker because it keeps images small.
You can even go distroless — using Google’s Distroless project — which provides images stripped down to only the runtime libraries your app needs.
But for most cases, starting with an official language or framework image is the right balance. Going more minimal is possible, but rarely worth the extra complexity.
So How Do You Make Your Own Image?
You need Docker and a Dockerfile:
mkdir my-docker-project
cd my-docker-project
cat > Dockerfile <<'EOF'
FROM alpine:latest
CMD ["echo", "Hello from Alpine!"]
EOF
docker build -t local/my-docker-project .
docker image ls
This uses standard Bash syntax: cat > filename <<'EOF' … EOF writes everything between into a file. You could also just create the file manually with those contents.
Deprecation warning about buildx?
Docker has two image builders — the legacy builder and the newer BuildKit.
If the buildx plugin isn’t installed, docker build falls back to the legacy builder, which is why you see the warning.
To switch to BuildKit, install the buildx plugin from your distro’s package repo, just like you installed Docker. Once it’s installed, docker build uses BuildKit by default.
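With buildx installed, you can also invoke BuildKit explicitly; the tag and build-context syntax are the same as for docker build:
docker buildx build -t local/my-docker-project .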
Back to the steps above:
- Create a new folder with a Dockerfile.
- Write a simple Dockerfile that inherits from Alpine.
- Build the image with docker build, tagging it as local/my-docker-project. The . at the end means “use the current directory as the build context.”
- List your images with docker image ls (or the shorthand docker images) — both work the same.
The image name local/my-docker-project includes a local/ namespace. This isn’t required — you could just call it my-docker-project — but using a namespace helps distinguish your custom images from official Docker Hub images (which often omit a namespace). You’re free to use any namespace you like.
If you create a Docker Hub account, you can push your images there (public or private). The command is exactly what you’d expect: docker push.
What we’ve done here is add a command to the alpine:latest base image and give it our own name.
Now run it:
docker run local/my-docker-project:latest
Output:
Hello from Alpine!
Your very first custom image just spoke back.
👉 Might be worth repeating that I use the optional :latest tag when referring to an image to make it clear “this is an image”. docker run creates containers from images.
You can also replace the CMD by appending a new one. Here are a few examples:
docker run local/my-docker-project:latest echo "Something new"
docker run local/my-docker-project:latest ls -lsa
docker run -it local/my-docker-project:latest sh
ls -lsa
exit
Containers stop automatically when their main process exits. A web server keeps running because it’s a long-lived process (always listening for requests). A simple echo or ls ends right away, so the container stops immediately.
To keep a container alive:
# First example
docker run -d local/my-docker-project:latest sleep infinity
docker ps
docker exec -it $(docker ps -l -q) sh # open a shell in the last container
ls -l
exit
docker stop $(docker ps -a -q)
# Second example using a name
docker run --name rickroll -d local/my-docker-project:latest sleep infinity
docker stop rickroll
docker ps -l -q means “show the ID of the last container only.”
In the examples above, every docker run created a new container.
- Use docker run → make a new container from an image.
- Use docker start → restart an existing (stopped) container.
See all running and stopped containers:
docker ps -a
Cleanup all containers created from your image:
docker rm -f $(docker ps -a -q --filter "ancestor=local/my-docker-project")
👉 docker rm -f is essentially a “stop and remove” command — a forceful removal.
Adding Files to a Docker Container with COPY
So far, our image only ran commands. Let’s add something real — a file — which is an action that creates a new layer.
Create a simple text file:
echo "Hello from inside the container!" > message.txt
Now write a new Dockerfile:
cat > Dockerfile <<'EOF'
FROM alpine:latest
COPY message.txt /message.txt
CMD ["cat", "/message.txt"]
EOF
docker build -t local/my-docker-with-file .
docker run local/my-docker-with-file:latest
Output:
Hello from inside the container!
Here’s what happened:
- FROM alpine:latest → start from the Alpine base.
- COPY message.txt /message.txt → copy a file from your build context (the current directory) into the image.
- CMD ["cat", "/message.txt"] → tell the container to print the file when it runs.
The key idea: the Docker build context (.) is what gets sent to the Docker daemon when you run docker build. Any files you want available inside the image must be in that context.
You’ve just added a second meaningful layer to your image — one that includes your own files.
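One practical tip while we’re here: since the whole build context gets sent to the daemon, it’s worth knowing about .dockerignore, a file in the context root whose patterns are excluded from that upload (the patterns below are just an illustration):
cat > .dockerignore <<'EOF'
.git
node_modules
*.log
EOF
# Rebuild: the ignored paths are no longer sent to the daemon or available to COPY
docker build -t local/my-docker-with-file .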
Image Layers Are Shared
Docker images are made of layers — in the last example you added one with COPY. (Note: CMD does not add a filesystem layer; it only writes image metadata.) Those layers are content-addressed. If two images use the same layer — say they both start from debian:bullseye — that layer is downloaded and stored only once on your system.
In principle, this saves a lot of space and bandwidth — when you update an image, Docker only pulls what changed. And if two images rely on the same base layers, those layers are pulled and stored once.
There are caveats:
- Using
:latestoften leads to repeated downloads, because “latest” can point to a new digest at any time. - It’s uncommon for different projects to align on the same base image and version.
Still, when base layers haven’t changed, Docker skips re-downloading them. That’s why pulling updates feels much faster than starting from scratch — layer reuse keeps things efficient.
The Docker Hub website now shows each image’s Dockerfile commands under Layers, and you can expand the base image’s commands too. For example, visit https://hub.docker.com/_/wordpress/tags, click any tag, and look for the Layers section — the base image’s layers appear first and can be expanded.
Local demo (guaranteed shared base):
echo 'Hello from base layer!' > base.txt
echo 'Hello from app 1!' > app1.txt
echo 'Hello from app 2!' > app2.txt
# Base image with one COPY layer
cat > Dockerfile <<'EOF'
FROM alpine:latest
COPY base.txt /base.txt
CMD ["cat", "/base.txt"]
EOF
docker build -t local/my-docker-with-file .
# App 1 on top of that base
cat > Dockerfile <<'EOF'
FROM local/my-docker-with-file:latest
COPY app1.txt /app1.txt
CMD ["cat", "/app1.txt"]
EOF
docker build -t local/my-docker-app-1 .
# App 2 on top of the same base
cat > Dockerfile <<'EOF'
FROM local/my-docker-with-file:latest
COPY app2.txt /app2.txt
CMD ["cat", "/app2.txt"]
EOF
docker build -t local/my-docker-app-2 .
docker run local/my-docker-app-1:latest # → Hello from app 1!
docker run local/my-docker-app-2:latest # → Hello from app 2!
Now compare the actual shared layers by digest:
docker image inspect local/my-docker-app-1 --format '{{range .RootFS.Layers}}{{println .}}{{end}}'
docker image inspect local/my-docker-app-2 --format '{{range .RootFS.Layers}}{{println .}}{{end}}'
The first two hashes are identical in both lists (shared base). Only the last differs (COPY app1.txt vs COPY app2.txt).
These digests look like sha256:9234e8fb04c4…. They’re generated from the contents of each layer. If you change any file and rebuild, only the layers touched by that step get a new digest.
Finally, inspect their history:
docker history local/my-docker-app-1
docker history local/my-docker-app-2
docker history shows the build steps, while docker image inspect shows the actual content digests.
An Image is a Union of Layers
Docker uses OverlayFS to stack layers into a single unified filesystem. Image layers are read-only; the container gets a writable layer on top. When you remove the container, that writable layer disappears — the base image layers remain untouched.
Let’s see the idea without Docker:
mkdir overlayfs_example
cd overlayfs_example
# Make the 4 directories used by OverlayFS
mkdir lower upper work merged
# Add a file to the lower directory (read-only layer)
echo "from lower" > lower/hello.txt
# Mount overlay: lower is read-only, upper is writable
sudo mount --types overlay \
myoverlay -o lowerdir=lower,upperdir=upper,workdir=work merged
# The merged view combines lower + upper
cat merged/hello.txt # → from lower
# Overwrite the file through the merged view
echo "from upper" > merged/hello.txt
# Check where it landed
cat upper/hello.txt # → from upper
cat lower/hello.txt # → from lower (unchanged)
# The merged view now shows the upper version
cat merged/hello.txt # → from upper
Here, lower acts like the image (read-only layers), upper is the container (writable layer), and merged is what the container sees.
That’s exactly how Docker containers work: the base image never changes, and all writes go into the ephemeral container layer.
Multiple image layers:
OverlayFS also supports stacking several lowerdirs:
sudo mount --types overlay \
myoverlay -o lowerdir=layer3:layer2:layer1,upperdir=upper,workdir=work merged
Docker uses this to build an image from many layers (layer1, layer2, …). At runtime, your container just adds one more writable layer on top.
How many layers?
The Linux kernel limits how many lowerdirs OverlayFS can stack. Most kernels allow up to 128 layers, and Docker enforces its own max (typically 125 image layers + 1 writable container layer). If you ever hit that limit, Docker will refuse to build the image.
Where are those layers?
On systems using the default overlay2 storage driver, Docker stores layers under /var/lib/docker/overlay2/. Each directory corresponds to a layer, and inside you’ll find a diff/ folder with that layer’s filesystem contents.
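You can poke around yourself (this assumes the default overlay2 driver; directory names will differ on your machine):
sudo ls /var/lib/docker/overlay2/
# → one directory per layer, plus an "l" directory of shortened symlinks
sudo find /var/lib/docker/overlay2 -maxdepth 2 -type d -name diff | head -n 3
# → a few layers' diff/ directories; list any one of them to see that layer's files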
Now Containers, What Are They?
We’ve seen where container files come from (image layers), and how processes can be isolated (unshare). Now let’s put it together.
docker run -it alpine:latest
ps aux
You’re now inside the container’s namespace. The ps aux command should list two running processes — your shell and ps itself. This is essentially the same isolation you saw with unshare. What matters is this: you’re not running a Linux distro. This is not a VM. You’re just running a few isolated processes.
If the image isn’t cached locally, docker run will pull it first. The -it flags simply drop you into an interactive shell inside the container.
# Continued from within the container
cat /etc/os-release
uname -r
exit
uname -r
/etc/os-release will show Alpine’s release info. It looks like a full distro — but the first uname -r reveals which Linux kernel is actually running: the host’s kernel. Exit the container and run uname -r again — you’ll see the same version. Containers don’t boot their own kernel.
You are not running a full distro, but…
The distro’s files are still in all the right places, environment variables exist, and common utilities like ls, grep, and ps are available. The look-and-feel can fool you into thinking you’re in a VM, but you’re really just in a process sandbox.
If you’re still unsure what a container is, keep this in mind: it isn’t one single thing. It’s a composition of Linux features — namespaces, cgroups, capabilities, and filesystem layers — all working together.
Distinction: docker run vs docker container
docker run always creates and starts a new container. Each call gives you a fresh container (which you can confirm with docker ps -a).
Once a container exists, you can manage it with the docker container command group — or use its short forms like docker start.
If you don’t specify a name with --name, Docker generates a random one. Use docker ps -a to list all containers (running or stopped) along with their names and IDs.
docker run --name app1 local/my-docker-app-1:latest
docker start -i app1
- Both commands output the same “Hello from app 1!” text
- docker run creates and starts a new container named app1
- docker start reuses that existing container and starts it again
- The -i flag attaches your terminal so you can see the output
Cgroups (Linux Control Groups)
Linux control groups (cgroups) let you limit and account for hardware resources like CPU and memory. Docker exposes these through options such as --cpus, --memory, and --cpu-shares:
docker run --cpus=1 --memory=2g --memory-swap=3g --name my-container my-image:latest
But let’s look at how cgroups work directly in Linux:
# Run this initially, there’s no output and no error:
python -c "a = 'x' * 100_000_000"
# Make a new cgroup
sudo mkdir /sys/fs/cgroup/demo
# You created a directory, but look what appears automatically:
sudo ls -l /sys/fs/cgroup/demo/
# → control files like memory.max, cgroup.procs, etc.
# Limit memory to 50 MB
echo $((50*1024*1024)) | sudo tee /sys/fs/cgroup/demo/memory.max
# Put your shell into that cgroup
echo $$ | sudo tee /sys/fs/cgroup/demo/cgroup.procs
# Try to allocate more than 50 MB
python -c "a = 'x' * 100_000_000"
# → process killed (out of memory)
# Cleanup: move your shell back to the root cgroup first, otherwise rmdir reports “Device or resource busy”
# (must be rmdir, not rm -r)
echo $$ | sudo tee /sys/fs/cgroup/cgroup.procs
sudo rmdir /sys/fs/cgroup/demo
Here the kernel enforces the memory limit and kills the process.
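Docker sets up the same control files for you when you pass --memory. A quick sanity check from inside a container (assuming your host uses cgroup v2):
docker run --rm --memory=256m alpine cat /sys/fs/cgroup/memory.max
# → 268435456 (256 MB, enforced by the kernel just like in the demo above)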
Why does this matter?
- If you’re hosting containers for clients, a runaway process can starve the system. You can cap its resources with cgroups to protect everyone else — though the real fix is always in the app.
- You can also assign fixed resources per container (e.g. one CPU core, 2 GB RAM) to guarantee predictable performance and bill fairly.
So what is a container? So far it’s…
- A namespace
- A union filesystem
- Control groups
Docker Volumes
Images and containers contain data. Image layers are read-only. Container layers are writable but vanish when the container is removed.
So what if you want data to persist — or be shared between containers?
Volumes let data live independently of containers. They can be created explicitly or on the fly.
And Docker volumes aren’t magic. See for yourself:
sudo -i # ← Need sudo access
ls -l /var/lib/docker/ # ← Docker data directory
ls -l /var/lib/docker/volumes # ← Volume directories
exit # ← Back to your user shell
- By default, volumes are local directories.
- Volumes have 64-character hexadecimal IDs (like most Docker objects).
- That ID is used to create the local directory unless you provide a name.
- If you don’t provide a name, the volume is anonymous.
- You may not have any volumes yet — let’s fix that.
docker volume create my-first-docker-volume
docker volume create
sudo ls -l /var/lib/docker/volumes
sudo ls -l /var/lib/docker/volumes/my-first-docker-volume/_data/
You should now see two volumes: one named, one anonymous. Each has a _data/ subdirectory where its contents live.
I also said you can create volumes on the fly — meaning create and attach in one command. Docker checks if the named volume exists and creates it if not.
docker run -d --name fly-container-1 -v created-on-the-fly:/srv/this-is-so-fly-1 local/my-docker-app-1:latest sleep infinity
docker ps -a
docker volume ls
docker volume ls should show a new volume named created-on-the-fly.
sudo ls -l /var/lib/docker/volumes/created-on-the-fly/_data/
# → should be empty
# Access the container, list directory contents, create a file:
docker exec -it fly-container-1 sh
ls -l /srv/this-is-so-fly-1 # → empty
echo "made this file from the container" > /srv/this-is-so-fly-1/too-much-fly.txt
cat /srv/this-is-so-fly-1/too-much-fly.txt
exit
sudo cat /var/lib/docker/volumes/created-on-the-fly/_data/too-much-fly.txt
# → shows the file created inside the container
- We created a container and a volume, and kept the container running with sleep infinity.
- Note: volume syntax is volume_name:/path/inside/container.
  - If volume_name: is absent, Docker creates an anonymous volume.
  - Anonymous volumes cannot be shared.
- We confirmed the volume directory was empty, created a file in the container, then confirmed it appeared on the host.
Sharing a volume
docker run -d --name fly-container-2 -v created-on-the-fly:/srv/this-is-so-fly-1 local/my-docker-app-1:latest sleep infinity
docker ps -a
docker volume ls
sudo ls -l /var/lib/docker/volumes/created-on-the-fly/_data/
# → shows the file created earlier
# Access new container, list file from previous example, create a new file
docker exec -it fly-container-2 sh
ls -l /srv/this-is-so-fly-1 # → shows the earlier file
echo "from the second container" > /srv/this-is-so-fly-1/from-second.txt
exit
docker exec -it fly-container-1 sh
cat /srv/this-is-so-fly-1/from-second.txt # → visible from container 1
exit
- We created a second container but reused the same volume. Since it already existed, Docker didn’t recreate it.
- We confirmed the original file was present.
- We added a second file in container 2.
- We switched back to container 1 and confirmed it could see the new file.
Bind Mounts
We covered anonymous and named volumes.
There’s a third option: bind mounts.
In the first two cases, data lives under /var/lib/docker/volumes, managed by Docker. A bind mount is different: you pick an existing directory on the host and bind it directly into the container.
A bind mount is not a Docker-managed volume, even though you use the same -v flag.
Let’s try it with some Node.js code using the official node image:
# List current volumes, so we can confirm nothing new is created
docker volume ls
# → note the output
# CREATE THE APP
mkdir node-js-app
cat > ./node-js-app/app.js <<'EOF'
setTimeout(() => console.log('This will get written to the logs in 60 seconds.'), 60*1000)
EOF
# CREATE THE CONTAINER
# -d → run in background
# --name → name the container
# -v → bind-mount ./node-js-app → /srv/app
# node:latest → image from Docker Hub
# node /srv/… → run our script inside the container
docker run -d --name my-node-app -v "$(pwd)/node-js-app:/srv/app" node:latest node /srv/app/app.js
# Confirm no new Docker volume was created
docker volume ls
# → same output as before
# After ~60 seconds, check the logs
docker logs my-node-app
# → This will get written to the logs in 60 seconds.
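The same bind mount can also be written with the more explicit --mount syntax, which spells out the type instead of overloading -v (the container name here is arbitrary):
docker run -d --name my-node-app-2 \
  --mount type=bind,source="$(pwd)/node-js-app",target=/srv/app \
  node:latest node /srv/app/app.js
# Same behavior as the -v version above; check with: docker logs my-node-app-2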
When to use which?
- Bind mounts: best for local development.
- Volumes: more portable for production — Docker manages the storage, permissions, and lifecycle.
Cleaning Up Containers & Volumes
You cannot remove a volume that’s still attached to a container — even a stopped one.
Try it:
docker stop fly-container-1 fly-container-2
docker volume rm created-on-the-fly
Output:
Error response from daemon: remove created-on-the-fly: volume is in use…
So the containers must be removed first:
docker rm -f fly-container-1 fly-container-2 # ← stop & remove
docker volume rm created-on-the-fly
docker ps -a
docker volume ls
# → no containers, no volume
If you had created anonymous volumes, cleanup could be done in one command. Let’s see that:
docker run -d -v /srv/my-anonymous-volume --name creative-container-name local/my-docker-app-2:latest sleep infinity
docker run -d -v /srv/my-anonymous-volume --name another-name local/my-docker-app-2:latest sleep infinity
docker ps
# → shows both containers
docker volume ls
# → shows two unique anonymous volumes
docker rm -f -v creative-container-name another-name # ← stops, removes containers, and deletes their anonymous volumes
# Confirm they are gone
docker ps -a
docker volume ls
- Two containers are created, each with its own anonymous volume (anonymous volumes are always unique per container).
- docker rm -f stops and removes containers; adding -v also deletes any anonymous volumes.
- Note: named volumes are not removed with -v — you must delete them explicitly with docker volume rm.
Bulk cleanup:
docker volume prune
# → removes all unused volumes (named and anonymous)
😧 Heads up
When you remove anything, the data goes with it. Volumes let you remove containers without losing data — but once you remove the volume itself, the data is gone.
Cleaning up bind mounts?
Bind mounts are not Docker-managed volumes, so Docker provides no commands to remove them. They’re just regular host directories. If you want to delete the data, remove the folder yourself with:
rm -r [FOLDER_NAME]
If you don’t, the folder will remain on the host and be reused the next time you mount it.
Docker Networks
This may be my favorite section.
Commands such as ping, curl, and ip are built atop Linux networking fundamentals — and so is Docker networking.
What’s involved?
- Linux bridge devices
- Subnets
- Name resolution
- Hostname isolation
Start by listing what’s already there:
docker network ls
# → Prints three networks unless you created others (bridge, host, none)
The one named bridge is Docker’s default network. Containers are connected to this network when you don’t specify one (as in the examples so far).
Docker creates a Linux bridge device for the default network:
ip address show docker0
# → Shows the IP address and details of the bridge device
The device itself has an IP address, and every container gets an IP in the same 172.17.0.0/16 range.
There’s one caveat: the default bridge does not provide name resolution — containers can reach each other by IP only. The solution is simple: create a user-defined network with docker network create. All user-defined bridge networks provide name resolution automatically.
Name resolution (via Docker’s embedded DNS at 127.0.0.11) matches hostnames to container IPs. If no match is found, the query is forwarded to the host’s resolver and then upstream.
Create a new network:
docker network create myappnet
# → Prints the network’s 64-character hexadecimal ID
docker network ls
# → Now shows the myappnet network along with the defaults
Now create three nginx containers, so we have something to work with:
docker run -d --name nginx1 --network myappnet nginx:latest
docker run -d --name nginx2 --network myappnet nginx:latest
docker run -d --name nginx3 --network myappnet nginx:latest
docker ps
# → Shows all three containers (e.g. "Up Less than a second")
Why nginx?
Because it’s a fast download and stays running (sleep infinity not required).
Now create a troubleshooting container with netshoot:
docker run -it --rm --network myappnet --name netshoot nicolaka/netshoot:latest zsh
# → Immediately enters the container (notice the zsh prompt)
The previous docker run command has options worth noting:
- -i (--interactive): keeps STDIN open so you can type commands.
- -t (--tty): allocates a terminal (omit it for raw output — try for yourself).
- --rm: removes the container after exit.
You are now dropped into netshoot, which has zsh, autosuggestions, and many networking tools.
Try the following:
ping nginx1
# → Replies from nginx1 (Ctrl+C to stop)
curl nginx1
# → Prints raw HTML (nginx default page)
ip address show eth0
# → Shows the container’s IP (likely 172.20.0.2)
ping 172.17.0.1
# → Replies from the default bridge (docker0)
ping google.com
# → Google responds (Ctrl+C to stop)
exit
Before exiting, try ping nginx2 and ping nginx3 to see their IP addresses. They should be in the same 172.20.0.0/16 range as nginx1 (different from the default bridge).
If zsh autosuggestions pop up and confuse you, press the right arrow to accept them, or just keep typing.
With multiple networks, you can ping IPs across them, but name resolution only works within the same network.
If a hostname cannot be resolved within Docker, the query is forwarded to the host, then upstream. If no IP is found, tools like ping or curl will report “Could not resolve”.
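You can see that embedded DNS server from inside any container attached to a user-defined network:
docker run --rm --network myappnet alpine cat /etc/resolv.conf
# → nameserver 127.0.0.11 (Docker’s embedded DNS resolver)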
Security Boundaries
😧 Heads up
These aren’t hard concepts, but they are often overlooked. We’re not diving deep here — only learning that these features exist and why they matter. Knowing Docker’s boundaries can save you time and confusion when things break. You can skip this if you only need the basics.
Defaults you get automatically:
- Containers share the host kernel.
- Seccomp blocks ~40 risky syscalls by default.
- Root inside a container is not full root on the host (its capabilities are restricted).
Containers are not virtual machines — they share the host kernel. To keep them contained, Docker relies on several Linux security features.
Linux security is multilayered, and here we’ll see how those layers show up in Docker.
A personal note…
Docker was my introduction to these features. Even without mastering them, simply knowing they exist has made logs, errors, and command-line output far easier to understand — and that alone has been a huge help in debugging.
Linux Capabilities
👉 Here, “Linux Capabilities” doesn’t mean “things Linux can do.” It’s the kernel’s list of fine-grained privilege toggles — a breakdown of root into ~40 switchable parts. See man capabilities for the full list.
On Linux, root privileges are split into ~40 fine-grained capabilities (e.g. CAP_NET_ADMIN, CAP_SYS_ADMIN). Docker containers start with a restricted set. You can adjust them per container using --cap-add and --cap-drop (docker run --help).
To see your host’s capabilities:
man capsh # “capsh – capability shell wrapper” (press q to exit)
sudo capsh --print
Look at:
- Current → should include =ep (all in the Bounding set enabled).
- Bounding set → the full list of capabilities available.
Example: binding to port 80 requires NET_BIND_SERVICE:
docker run --rm --cap-drop=ALL --sysctl net.ipv4.ip_unprivileged_port_start=1024 \
python:latest python -m http.server 80
# → PermissionError: [Errno 13] Permission denied
docker run --rm --cap-drop=ALL --cap-add=NET_BIND_SERVICE \
--sysctl net.ipv4.ip_unprivileged_port_start=1024 \
python:latest python -m http.server 80
# → Server starts, no output, press Ctrl+C
# → Now you see “Serving HTTP on 0.0.0.0 port 80”
- Both runs drop all capabilities.
- The second adds only NET_BIND_SERVICE.
- That single capability makes the difference: the Python server can now bind to a privileged port.
Another quick check: sending raw packets requires NET_RAW:
docker run --rm --cap-drop=ALL nicolaka/netshoot arping -c1 1.1.1.1
# → arping: socket: Operation not permitted
docker run --rm --cap-drop=ALL --cap-add=NET_RAW nicolaka/netshoot arping -c1 1.1.1.1
# → ARPING 1.1.1.1 from 172.17.0.3 eth0
Dropping ALL then re-adding just one shows exactly what each capability unlocks.
A Few Common Capabilities
See man capabilities for the full list.
| Capability | What it unlocks |
|---|---|
| NET_BIND_SERVICE | Bind to ports <1024 (e.g. run a web server on port 80). |
| NET_RAW | Create raw sockets (ping, arping, packet crafting). |
| NET_ADMIN | Change network interfaces, routing tables, firewall rules. |
| SYS_ADMIN | Broad “catch-all” — mount/unmount filesystems, set hostname, lots more. |
| SYS_PTRACE | Trace/debug processes (like strace). |
| MKNOD | Create special files (device nodes). |
| CHOWN | Change file ownership, even without being the file’s owner. |
👉 SYS_ADMIN is considered the most powerful — often compared to “root, but in disguise.”
Seccomp
Seccomp (secure computing mode) is a Linux kernel filter for system calls.
System calls are how programs ask the kernel to do things — open files, create processes, talk to the network, etc.
Docker applies a default seccomp profile that blocks ~40 risky syscalls (e.g. ptrace, kexec). You can override this with --security-opt.
🔗 Docker Docs — Seccomp security profiles
Check the default seccomp mode inside a container:
docker run --rm alpine sh -c 'grep ^Seccomp: /proc/1/status'
# → Seccomp: 2 (profile applied)
Now run with the unconfined profile (no syscall filtering):
docker run --rm --security-opt seccomp=unconfined alpine sh -c 'grep ^Seccomp: /proc/1/status'
# → Seccomp: 0 (unrestricted)
What does this show?
That --security-opt seccomp=... directly controls the seccomp profile Docker applies to containers.
What is /proc?
/proc is a virtual filesystem (not stored on disk) that the Linux kernel provides to expose information about processes and system state.
Think of it as a live dashboard you can explore like a directory tree.
Try this:
ps aux
# → snapshot of running processes
ls -lr /proc
# → same processes, each as a folder by PID
ls -l /proc/1
# → data about process 1 (the root process)
cat /proc/1/status
# → full status output for process 1
grep ^Seccomp: /proc/1/status
# → filter to just the "Seccomp" line
👉 Process 1 is the root of all others (usually systemd).
Other handy entries:
cat /proc/cpuinfo # CPU details
cat /proc/meminfo # memory usage
cat /proc/devices # kernel devices
cat /proc/partitions # storage partitions
pstree # process tree view
To fine-tune what system calls a container can make, you define a seccomp profile (a JSON file listing allowed/blocked calls).
- Docker’s default profile is generated programmatically.
- Example custom profiles are available from Docker Labs: GitHub – dockerlabs/seccomp-profiles
Run a container with your own profile:
docker run --rm --security-opt seccomp=/path/profile.json alpine sh
User namespaces & rootless containers
By default, the Docker daemon (dockerd) runs as root on the host.
And inside containers, processes also appear to run as root — but only over the container’s isolated filesystem.
That isolation breaks down if you mount a host directory into a container. Suddenly, “root” inside the container has root-level access to those host files.
- Official images are generally trustworthy, but bugs still happen.
- Images from third parties may contain sloppy or even malicious code.
- And you don’t need an extreme case like mounting /etc — smaller oversights can still lead to privilege escalation.
This is why Docker volumes are safer than bind mounts: volumes limit how much of the host you expose.
We can go further by mapping user and group IDs, so that “root” inside a container doesn’t equal root on the host.
We’ll take care of that in three steps.
User Namespaces?
It’s not a new concept: just think users + namespaces.
A process can run with a UID that doesn’t exist on the system (we’ll see this shortly when we override the container user). Files and folders can also have UIDs and GIDs that don’t map to any real account. The kernel doesn’t care — it enforces access purely by numeric ID.
At the kernel level, a user namespace changes what a UID means:
- Inside the container, a process might be UID 0 (root).
- On a host with
userns-remap, that same UID 0 is mapped to a high, unprivileged UID (like 100000). - The kernel enforces the mapping: inside the container it’s root, on the host it’s just another user.
👉 Namespaces don’t give you a “new root” — they redefine the scope of user IDs.
That’s why the distinction matters:
- Users → which identity a process runs as.
- User namespaces → where that identity is valid (container only, or also on the host).
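The kernel exposes the active mapping in /proc. Without userns-remap (covered below), a container’s UIDs map straight onto the host’s:
docker run --rm alpine cat /proc/self/uid_map
# →          0          0 4294967295
# Container UID 0 maps to host UID 0 across the full range, i.e. no remapping yet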
Running with a non-root user
By default, Docker runs container processes as root inside the container’s namespace.
Unless the image sets a different USER, docker run executes the CMD line as root.
Example:
docker run -d --rm nginx:latest
On Docker Hub, you’ll see this image ends with:
CMD ["nginx", "-g", "daemon off;"]
👉 You call docker run → Docker calls the image’s CMD → run as root (UID 0) unless overridden.
Why care?
Official images (like nginx) are widely used and heavily audited. They aren’t malicious.
But running everything as root still increases risk: if a bug or exploit slips in, the process has maximum power within its namespace.
Best practice: restrict, then allow only what you need.
- It’s good hygiene.
- It forces you to understand what’s running.
- It limits the blast radius if something goes wrong.
- It’s trivial to set.
Demo: default vs. overridden user
Default user:
docker run -it --rm alpine:latest
whoami # → root
ps aux # → processes all running as UID 0
Override with 1234:
docker run -it --rm --user 1234:1234 alpine:latest
whoami # → whoami: unknown uid 1234
echo $UID # → 1234
ps aux # → processes running as UID 1234
No matching user account is required. A process can run with any UID.
Notes and pitfalls
- If an image sets a USER, that UID is usually 1000 (the first non-root user in Linux).
- Your host UID may also be 1000. Overriding with --user can cause confusion if you assume they’re the same.
The Node.js image and its node user
The build process creates a node user (UID 1000), but the official image does not set USER node: there is no USER instruction, so processes run as root by default.
- Running the image normally (docker run node:latest node app.js) runs the process as root.
- To use the bundled non-root account, pass it explicitly: docker run -u node node:latest node app.js.
- Don’t confuse the user an image creates with the user its processes actually run as.
Rule of thumb: check what UID the image sets, then override if needed.
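Checking what an image sets is a one-liner; an empty result means no USER instruction, so processes run as root:
docker image inspect node:latest --format '{{.Config.User}}'
# → prints nothing for images that don’t set USER (run as root by default)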
Remapping container user IDs — userns-remap
Running containers as non-root is good, but you can go further: remap container UIDs on the host. This way, even if a process breaks out of its namespace, it won’t map to a real root user on the host.
Docker supports this with the userns-remap option.
Enable it by creating a config file and restarting Docker:
id dockremap
# → id: ‘dockremap’: no such user (expected before enabling)
sudo mkdir -p /etc/docker
printf '{\n "userns-remap": "default"\n}\n' \
| sudo tee /etc/docker/daemon.json > /dev/null
sudo systemctl restart docker.service
😧 Heads up
After enabling, all your existing images/containers seem to “disappear.” They aren’t deleted — Docker is just using a different data root.
Docker’s default data root:
/var/lib/docker
Docker’s remapped data root:
/var/lib/docker/100000.100000
If you disable remap (delete /etc/docker/daemon.json and restart), Docker returns to the original data root and your containers reappear.
You can also copy data manually between locations if needed.
After restart, Docker does the following automatically:
- Creates a new system user dockremap
- Adds ranges for that user in /etc/subuid and /etc/subgid
- Creates a new data root at /var/lib/docker/$UID.$GID
You can verify:
id dockremap
# → shows uid, gid, groups
grep dockremap /etc/passwd /etc/subuid /etc/subgid
# → dockremap entry in each file
docker ps -a
docker image ls
docker volume ls
docker network ls
# → empty or default lists
Remapping Considerations
userns-remap isn’t meant for everyday development.
Turn it on when you’re hosting containers from outside sources and you don’t fully control what runs inside them. It’s a safety net against untrusted workloads.
If you’re in DevOps and considering it for production, enable it on your own workstation too. You’ll run into the permission gotchas early — instead of after your servers are already live.
Our earlier example showed the global switch (userns-remap in daemon.json), which remaps all containers. That’s Docker’s model: either every container runs in a remapped namespace, or you run a second daemon dedicated to remapped workloads. There’s no way to selectively enable remap on a single container if the daemon itself is running without it.
When you control the software running in your containers, security is primarily a company-policy issue: developers and ops teams must ensure their software doesn’t require root on the host. In those cases, focus on choosing the correct user inside the container (--user or USER) and make sure the UIDs you assign don’t collide with host admin accounts.
Rootless Docker
Running rootless Docker is the final step in user namespacing.
It has trade-offs (see Docker’s docs), so for most teams, steps 1 and 2 already deliver the best ROI.
When rootless makes sense
- Shared hosts where you don’t know who runs what
- Running untrusted or untested images
- Regulated environments with strict “no host-root” policies
Our three steps recap:
- Non-root container user → mitigates in-container damage
- userns-remap → mitigates a container escaping its namespace
- Rootless Docker → mitigates Docker daemon or kernel exploits
Which user should run it?
- Dev laptop → your normal user.
- Server → create a dedicated non-root service account, enable lingering, and treat it like any other daemon user.
Most distros ship a setup script (/usr/bin/dockerd-rootless-setuptool.sh) with the RPM/DEB packages.
If missing, install docker-ce-rootless-extras.
On Arch Linux, use the AUR package docker-rootless-extras.
These packages/scripts:
- Set up /etc/subuid and /etc/subgid ranges
- Install a systemd user unit for rootless Docker
- Provide step-by-step shell output
📝 Note
Some Docker features don’t work in rootless mode, or need extra configuration. All limitations and workarounds are listed in the Docker Docs under Known Limitations.
Final steps
- Append a subuid/subgid range to /etc/subuid and /etc/subgid:
  USERNAME:231072:65536
  (replace USERNAME appropriately)
- Enable socket activation:
  systemctl --user enable --now docker.socket
- If running on your personal machine, add this to .bashrc or .zshrc:
  export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/docker.sock
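Once the user service is running and DOCKER_HOST points at your user socket, you can confirm you’re talking to the rootless daemon (exact output varies by version):
docker info --format '{{.SecurityOptions}}'
# → the list should include name=rootless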
Linux Security Modules
Linux supports multiple Mandatory Access Control (MAC) frameworks, but only one can be the active (“major”) LSM at a time. Distros make different choices:
- Ubuntu / Debian → AppArmor by default (with maintained profiles)
- Fedora / RHEL / CentOS / Rocky / Alma → SELinux by default (Red Hat invests heavily here)
- openSUSE, Alpine → also ship with AppArmor
- Arch Linux → leaves it up to the user
If an LSM is enabled on the host kernel, Docker can attach an LSM profile to each container. These act as extra security policies on top of normal permissions. Even root inside a container can be denied actions if the profile forbids them.
These protections work only because containers and the host share the same Linux kernel. The kernel enforces the LSM policy on all processes, whether host or container.
| Framework | Where you’ll see it | Notes |
|---|---|---|
| SELinux | Fedora, RHEL, CentOS, Rocky, Alma | Label-based rules; very fine-grained, but complex to manage. |
| AppArmor | Ubuntu, Debian, openSUSE, Alpine | Path-based rules; easier to write/manage, less expressive than SELinux. |
| Smack | Tizen, some embedded systems | Simplified MAC, mainly for embedded/mobile. |
| TOMOYO | In mainline Linux, rarely used | Focuses on learning mode and auto-generating policies. |
| Landlock | Linux ≥ 5.13 (opt-in) | Unprivileged sandboxing; still evolving. |
Using with Docker
The LSM only applies if:
- The host kernel has that LSM enabled.
- You pass a
--security-optflag to tell Docker which profile/label to use.
# With AppArmor (Ubuntu/Debian/openSUSE)
docker run --rm --security-opt apparmor=docker-default \
alpine cat /proc/1/attr/current
# → docker-default
# With SELinux (Fedora/RHEL/Alma)
docker run --rm --security-opt label=type:container_t \
alpine cat /proc/self/attr/current
# → system_u:system_r:container_t:s0:c123,c456
I asked earlier: what is a container?
At this point we can say it’s a collection of kernel features working together:
- Namespaces → isolate processes (it’s not a VM)
- Union filesystems → stack image layers into a single view
- Control groups (cgroups) → meter and limit resources
- Volumes / bind mounts → persist or share data
- Networking → connect containers to each other and the outside world
- Security modules → capabilities, seccomp, user namespaces, AppArmor/SELinux
👉 All of this is possible only because containers and the host share the same Linux kernel. Unlike a virtual machine, there’s no separate kernel per container — the kernel enforces isolation and security boundaries for every process.
📌 Recap
- Images are made of layered filesystems.
- Containers are isolated processes with limited resources.
- Volumes, networks, and security boundaries extend what containers can do.
Putting It All Together
Time to build some real things.
The whole point of Docker is to package software, ship it anywhere, and get predictable results.
A container gives you a tested environment — but remember, there’s always one variable you don’t control: the host kernel.
You can reduce surprises by:
- Running stable LTS kernels in production
- Testing against environments that match production as closely as possible
In this section we’ll move from theory to practice:
- Package our own software
- Lock in software versions instead of relying on
:latest - Show how simple patterns (single app, multi-service, DB + app split) grow into reusable setups
This is where Docker becomes more than commands — it becomes a way of documenting and sharing system architecture.
Single Container
So far we’ve run containers as-is, like little sandboxes. We even ran our own Node.js code.
But we didn’t connect those containers to the outside world. Let’s fix that.
docker run -d --rm -p 8080:80 nginx:1.29-alpine3.22
Open your browser at http://localhost:8080 and you’ll see the Nginx default page.
This works because -p 8080:80 maps port 8080 on the host to port 80 inside the container.
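You can confirm the mapping from the terminal too:
docker port $(docker ps -q -l)
# → 80/tcp -> 0.0.0.0:8080 (container port 80 published on host port 8080)
curl -s http://localhost:8080 | head -n 5
# → the opening lines of the Nginx welcome page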
The default Nginx page isn’t very useful, so let’s replace it with our own.
# Create an empty folder
mkdir my-webpage
cd my-webpage
# Create a basic HTML file
cat > index.html <<'EOF'
<!DOCTYPE html>
<html>
<head>
<title>Hey there</title>
</head>
<body>
<h1>This is my webpage</h1>
</body>
</html>
EOF
# Create a Dockerfile
cat > Dockerfile <<'EOF'
FROM nginx:1.29-alpine3.22
COPY index.html /usr/share/nginx/html/index.html
EOF
# Build and tag the image twice
docker build -t local/my-webpage:dev .
docker tag local/my-webpage:dev local/my-webpage:$(date "+%Y%m%d-%H%M%Z")
docker image ls
# ← same IMAGE ID, two different tags
Now run the container:
# Stop our previous Nginx container
docker stop $(docker ps -q -l)
docker run -d -p 8080:80 local/my-webpage:dev
Refresh your browser → you’ll see “This is my webpage”.
That example is simple, but the idea scales.
Your own Dockerfile can turn Nginx into whatever you need it to be:
- File server with caching and Brotli compression ✅
- A reverse proxy with HTTP/3 QUIC ✅
- Load balancer with round-robin ✅
And you’re not limited to Nginx:
- Ubuntu + Apache + PHP + MySQL 🔥
- Alpine + Nginx + Node.js + MongoDB 🔥
- Debian + Go + Postgres 🔥
It’s more common to run one service per container (we’ll cover that next).
But some big-name projects still bundle multiple services into a single container:
- Discourse — their “A Docker image for Discourse” setup runs Nginx, Unicorn, Postgres, and Redis.
- GitLab Omnibus
- OpenProject all-in-one
- Phabricator
So while it’s the exception, it’s not unusual.
Here’s a quick “all-in-one” example: Ubuntu with Node.js, MongoDB, and a non-root app user.
FROM ubuntu:22.04
ENV DEBIAN_FRONTEND=noninteractive
ENV NODE_MAJOR=24
ENV MONGO_MAJOR=7.0
# Basics
RUN apt-get update && apt-get install -y --no-install-recommends \
ca-certificates curl gnupg \
&& rm -rf /var/lib/apt/lists/*
# --- Node.js 24 (NodeSource repo) ---
RUN mkdir -p /usr/share/keyrings \
&& curl -fsSL https://deb.nodesource.com/gpgkey/nodesource-repo.gpg.key \
| gpg --dearmor -o /usr/share/keyrings/nodesource.gpg \
&& echo "deb [signed-by=/usr/share/keyrings/nodesource.gpg] https://deb.nodesource.com/node_${NODE_MAJOR}.x nodistro main" \
> /etc/apt/sources.list.d/nodesource.list
# --- MongoDB 7.0 (official repo for Ubuntu Jammy) ---
RUN curl -fsSL https://pgp.mongodb.com/server-${MONGO_MAJOR}.asc \
| gpg --dearmor -o /usr/share/keyrings/mongodb-server-${MONGO_MAJOR}.gpg \
&& echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-${MONGO_MAJOR}.gpg ] https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/${MONGO_MAJOR} multiverse" \
> /etc/apt/sources.list.d/mongodb-org-${MONGO_MAJOR}.list
# Install Node.js + MongoDB
RUN apt-get update && apt-get install -y --no-install-recommends \
nodejs \
mongodb-org \
&& apt-get clean && rm -rf /var/lib/apt/lists/*
# Create non-root user `app`
RUN groupadd --system app && useradd --system --create-home --gid app app
# Verify versions (optional; remove for slimmer layers)
RUN node -v && mongod --version
USER app
WORKDIR /home/app
CMD ["bash"]
Why not just apt install nodejs mongodb?
Because with distro packages, you don’t control the version — apt install node in January may not be the same in May. By adding the official NodeSource and MongoDB repos, we can pin exact versions and ensure consistency.
When it’s time to upgrade, we build and test against the new versions first — and only promote to production once we know it works.
Multi-Service App
Most real-world apps aren’t just one process. They’re made of multiple services that talk to each other:
- Frontend
- Backend
- Database
Instead of packing them into one container, we’ll run each as its own container. This separation is how Docker is used in practice. It makes debugging, replacement, and scaling much easier.
Diagram
Before we dive into commands, let’s picture the architecture:
Step 1 — Create a Work Folder & Network
Every app should live in its own directory. And services need a private network to talk to each other.
mkdir multiapp
cd multiapp
docker network create multinet
Step 2 — Start the Database
We’ll use the official postgres image. By default, POSTGRES_DB = POSTGRES_USER if not specified, so the database will also be named multiapp.
docker run -d --rm \
--network multinet \
--name multiapp_db \
-e POSTGRES_USER=multiapp \
-e POSTGRES_PASSWORD=mysecretpassword \
postgres:17.6-alpine3.22
Initialize a simple table:
docker run --rm --network multinet -e PGPASSWORD=mysecretpassword \
postgres:17.6-alpine3.22 psql -h multiapp_db -U multiapp -v ON_ERROR_STOP=1 -c "
CREATE TABLE IF NOT EXISTS messages (
id SERIAL PRIMARY KEY,
data TEXT NOT NULL
);
"
Step 3 — Start the Backend
PostgREST turns a PostgreSQL database into a REST API. Here we point it at multiapp_db using the same credentials.
docker run -d --rm --name multiapp_backend --network multinet \
-e PGRST_DB_URI=postgres://multiapp:mysecretpassword@multiapp_db:5432/multiapp \
-e PGRST_DB_ANON_ROLE=multiapp \
postgrest/postgrest:latest
Step 4 — Build a Custom Frontend
Nginx will serve a simple HTML form and proxy API requests to the backend.
We need a Dockerfile, a custom config, and an HTML page.
Dockerfile:
cat > Dockerfile <<'EOF'
FROM nginx:1.29-alpine3.22
COPY index.html /usr/share/nginx/html/index.html
COPY nginx.conf /etc/nginx/conf.d/multiapp.conf
RUN rm /etc/nginx/conf.d/default.conf
EOF
nginx.conf:
cat > nginx.conf <<'EOF'
server {
listen 80;
root /usr/share/nginx/html;
index index.html;
location /api/ {
proxy_pass http://multiapp_backend:3000/;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header Prefer "return=representation";
}
}
EOF
index.html:
cat > index.html <<'EOF'
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<title>Multiapp</title>
</head>
<body>
<section id="submit-section">
<h2>Send To Backend</h2>
<form id="form">
<input type="text" name="data" id="data" required />
<input type="submit" value="Send">
</form>
</section>
<section id="current-data">
<h2>Current Data</h2>
</section>
<section id="new-data">
<h2>Added Data</h2>
</section>
<script>
(() => {
// Load existing data
fetch('/api/messages?select=*')
.then(r => r.json())
.then(json => {
const pre = document.createElement('pre');
pre.innerText = JSON.stringify(json, null, 2);
document.getElementById('current-data').append(pre);
});
// Handle form submission
document.getElementById('form').addEventListener('submit', async (ev) => {
ev.preventDefault();
const container = document.getElementById('new-data');
const input = document.getElementById('data');
const pre = document.createElement('pre');
try {
const res = await fetch('/api/messages', {
method: 'POST',
headers: {'Content-Type': 'application/json'},
body: JSON.stringify({ data: input.value })
});
document.getElementById('form').reset();
const json = await res.json();
pre.innerText = JSON.stringify(json, null, 2);
} catch (e) {
pre.innerText = 'Oops, something went wrong. The *multiapp_backend* container may not be running.';
}
container.append(pre);
});
})()
</script>
</body>
</html>
EOF
Build and run:
docker build -t local/multiapp_frontend:dev .
docker run -d --rm \
--name multiapp_frontend \
--network multinet \
--publish 5000:80 \
local/multiapp_frontend:dev
Sanity Checks
Before opening a browser, confirm backend and DB are talking.
# Read (should return [] initially)
docker run --rm --network multinet curlimages/curl:8.10.1 \
-s 'http://multiapp_backend:3000/messages?select=*'
# → []
# Write
docker run --rm --network multinet curlimages/curl:8.10.1 -s \
-H 'Content-Type: application/json' \
-H 'Prefer: return=representation' \
-d '{"data":"hello from docker"}' \
'http://multiapp_backend:3000/messages'
# → [{"id":1,"data":"hello from docker"}]
# Read again
docker run --rm --network multinet curlimages/curl:8.10.1 \
-s 'http://multiapp_backend:3000/messages?select=*'
# → [{"id":1,"data":"hello from docker"}]
👉 Using curl is essentially the same as what our <form> will do.
Docker CLI Reminders
- --rm means containers are removed as soon as they stop. Logs disappear with them.
- Use -d (detached) for background with logs, or -it for live logs.
- Logs are often your best friend:
docker logs multiapp_db
docker logs multiapp_backend
docker logs multiapp_frontend
Run The Commands
Now that you’ve read through the code and comments, go ahead and run the commands. Then open your browser at http://localhost:5000.
Play around with the setup and try to break it — that’s how you’ll really learn (a quick example follows this list):
- Stop the backend or DB while the other two are running.
- Try submitting while one is down.
- See what errors show up.
- Then fix it.
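For example, to knock out the backend and then bring it back, reuse the exact command from Step 3:
docker stop multiapp_backend
# ← Because of --rm, stopping also removes the container
# Submitting the form now triggers the error message from index.html
docker run -d --rm --name multiapp_backend --network multinet \
  -e PGRST_DB_URI=postgres://multiapp:mysecretpassword@multiapp_db:5432/multiapp \
  -e PGRST_DB_ANON_ROLE=multiapp \
  postgrest/postgrest:latest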
This setup — three containers, one per service, talking over a Docker network — is the foundation of real multi-service deployments.
What’s Next?
In real deployments you’ll usually add:
- Volumes so data survives container restarts.
- Compose or Kubernetes to manage multiple services.
- Scaling rules (e.g. many frontends behind one backend).
But this simple 3-container demo already gives you the building blocks for everything that follows.
Docker Compose
Docker Compose is a companion to Docker, but it isn’t part of the core engine — you have to install it separately.
Be aware there are two forms:
- docker compose (plugin) → the current, supported version (v2).
- docker-compose (hyphenated) → the older Python tool (deprecated).
Most distros now package the plugin.
For Arch Linux:
sudo pacman -S docker-compose
For Ubuntu:
sudo apt install docker-compose-plugin
Verify the installation and see the available subcommands:
docker compose --help
Why It Exists
Starting three containers with a network, environment variables, and a custom build wasn’t too bad… but you can already feel the repetition. Scale that to 10+ services and the command lines quickly become unmanageable.
That’s why Docker Compose exists.
Compose is a workflow tool layered on top of Docker. It doesn’t replace Docker — it orchestrates it. Instead of juggling long docker run commands, you describe everything in a single YAML file and bring it up with one command:
docker compose up -d
👉 This makes your setup easier to read, share, and reproduce.
Anatomy of compose.yml
Here’s a Compose file that recreates our 3-container app (database, backend, frontend) and adds two useful improvements: a healthcheck and a one-shot init service.
services:
db:
image: postgres:17.6-alpine3.22
container_name: multiapp_db
environment:
POSTGRES_USER: multiapp
POSTGRES_PASSWORD: mysecretpassword
networks:
- multinet
healthcheck:
test: ["CMD-SHELL", "pg_isready -U $$POSTGRES_USER || exit 1"]
interval: 2s
timeout: 2s
retries: 30
start_period: 5s
# One-shot service to initialize the schema
db-init:
image: postgres:17.6-alpine3.22
container_name: multiapp_db_init
environment:
PGPASSWORD: mysecretpassword
PGDATABASE: multiapp
networks:
- multinet
depends_on:
db:
condition: service_healthy
command: >
psql -h multiapp_db -U multiapp -v ON_ERROR_STOP=1 -c "
CREATE TABLE IF NOT EXISTS messages (
id SERIAL PRIMARY KEY,
data TEXT NOT NULL
);
"
restart: "no"
backend:
image: postgrest/postgrest:latest
container_name: multiapp_backend
environment:
PGRST_DB_URI: postgres://multiapp:mysecretpassword@multiapp_db:5432/multiapp
PGRST_DB_ANON_ROLE: multiapp
networks:
- multinet
depends_on:
db:
condition: service_healthy
frontend:
build: . # ← requires a Dockerfile in the same directory
container_name: multiapp_frontend
ports:
- "5000:80"
networks:
- multinet
depends_on:
- backend
networks:
multinet:
Key points:
- Healthcheck ensures Postgres is actually ready before other services connect.
- db-init runs once, initializes the schema, and exits.
- Service names double as hostnames on the Compose network (backend connects to db at multiapp_db:5432).
Breaking It Down
- services:
  Each entry is like one docker run.
  - Here we define db, db-init, backend, and frontend.
  - Compose commands (e.g. docker compose logs db-init) always refer to the service name.
  - The container_name field is optional. We set it so we can reuse the names in configs like nginx.conf.
  - If not specified, Compose uses the format: <parent-folder>_<service>_<index>.
  - You could also set an explicit hostname instead of container_name.
- image: / build:
  - image: pulls a published image from a registry.
  - build: builds from a local Dockerfile (used here for the custom Nginx frontend).
- environment:
  Passes key-value pairs as environment variables — the same as -e with docker run.
- ports:
  Maps host:container ports — the same as --publish / -p.
- depends_on:
  Declares startup order. It ensures services launch in sequence, but it doesn’t guarantee the dependency is ready — that’s why we added a healthcheck and the one-shot db-init service.
- networks:
  Connects services to a common network.
  - By default, each service can reach the others by their service name (db, backend, etc.).
  - The defaults are usually enough, but you can add advanced config (aliases, driver options) if needed.
  - This replaces the manual docker network create multinet step from earlier.
docker run vs docker compose
| docker run flag | docker-compose.yml key |
|---|---|
| --name multiapp_db | container_name: multiapp_db |
| -e POSTGRES_USER=multiapp | environment: |
| --publish 5000:80 or -p 5000:80 | ports: |
| --network multinet | networks: |
| --rm | (Compose auto-cleans on down) |
| docker build -t local/... | build: |
| docker run postgrest/postgrest | image: postgrest/postgrest |
| docker logs <container> | docker compose logs <service> |
Compose replaces a pile of flags with a readable YAML file, while keeping the underlying concepts the same.
Run It
In the same multiapp folder (with your Dockerfile and compose.yml), start everything with:
docker compose up -d
Open your browser at http://localhost:5000 — it’s the same app as before, but now defined in one file.
Check what’s running:
docker compose ps
You can also see these containers with docker ps. Notice how Compose prefixes container names with the parent folder and service name (unless you override container_name).
Stopping and Cleaning Up
Stop services but keep containers:
docker compose stop
Stop and remove containers and networks (but keep volumes and images):
docker compose down
Stop and remove everything — containers, networks, volumes, and images:
docker compose down -v --rmi all
Volumes
By default, when you remove a container its data disappears with it. That’s fine for experiments, but databases need to persist. This is where volumes come in.
With a volume, you can run docker compose down (which removes containers) and then docker compose up -d — and your data will still be there.
Defining a Volume
Add a volumes: section to your Compose file:
volumes:
multiapp_db:
Just like networks:, it looks a little bare. That means we’re using defaults.
Docker will create the volume automatically under /var/lib/docker/volumes/. Compose prefixes the name with the project, so here it typically ends up as multiapp_multiapp_db (unless you’re using userns-remap or a custom volume driver).
Mounting the Volume
Now map that volume into the PostgreSQL container. Update the db service like this:
services:
db:
image: postgres:17.6-alpine3.22
container_name: multiapp_db
environment:
POSTGRES_USER: multiapp
POSTGRES_PASSWORD: mysecretpassword
networks:
- multinet
healthcheck:
test: ["CMD-SHELL", "pg_isready -U $$POSTGRES_USER || exit 1"]
interval: 2s
timeout: 2s
retries: 30
start_period: 5s
volumes:
# Format: VOLUME_NAME:/path/in/container
- multiapp_db:/var/lib/postgresql/data
Remember to remove the volume if you need to start from scratch (docker volume rm --help).
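To see what Docker actually created, inspect the volume once the stack is up (the exact name depends on your project folder, so copy it from docker volume ls):
docker volume ls
# ← e.g. multiapp_multiapp_db
docker volume inspect multiapp_multiapp_db --format '{{ .Mountpoint }}'
# ← /var/lib/docker/volumes/multiapp_multiapp_db/_data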
Experiment: Persistence in Action
# Start everything
docker compose up -d
docker compose ps
# → Lists 3 running services
Open your browser at http://localhost:5000 and post some data.
Now stop and remove:
docker compose down
docker compose ps -a
# → Empty service list
Bring it back:
docker compose up -d
Refresh http://localhost:5000 — your data is still there.
Without a volume, the data would have been lost when the container was removed. With a volume, it persists across container lifecycles, exactly as you’d expect from a database.
Why Compose Matters
- One command, many services. Start or stop your whole stack with a single command.
- Portable config. Share compose.yml and anyone can run the same setup.
- Declarative. You describe what you want; Compose figures out how to make it run.
- Extendable. Add volumes, scale services, or connect extra networks — all by editing YAML.
Compose is the bridge between quick demos and real multi-service projects. It’s still Docker under the hood, but far easier to manage.
What’s Next?
You’ve now seen the basics of Compose — enough to move beyond toy setups.
From here we’ll push further:
- How to adapt Compose for development vs production (using environment variables, multiple Compose files, CI/CD, logging, and scaling).
- How Docker stacks up against alternatives like Podman, Kubernetes, and even LXC.
That’s where Docker stops being a local convenience and becomes part of a complete workflow.
📌 Recap
- A single container works for simple cases.
- Multi-service apps reveal Docker’s real strength.
- Compose and networking tie the pieces together.
Beyond Basics
Up to now we’ve stayed inside a personal dev environment. That’s safe, but the real goal is to package and ship services.
Your code won’t only run on your laptop. It needs to run on teammates’ machines, in testing environments, and in production. Each stage comes with different needs:
- Feature branches – Spin up a short-lived test deployment, demo it, then tear it down (or have it self-destruct).
- Testing/staging – Mimic production as closely as possible, but without exposing real systems or data. Sometimes you’ll need to fake APIs or seed data to keep production safe.
- Production – A different beast entirely. Here you’re thinking about resilience, scaling, and disaster recovery drills.
The solution?
Use different orchestration patterns and Compose files tailored for each environment. Name them clearly (docker-compose.dev.yml, docker-compose.test.yml, docker-compose.prod.yml) and wire them into scripts so you can switch contexts easily.
We’ll stay at a surface level here. The point isn’t to master software deployment all at once — each topic (CI/CD, monitoring, scaling) could fill its own guide. Instead, we’ll zoom out while stopping to explain the essentials you’ll actually use: building blocks beginners need, and ones even experienced developers sometimes gloss over. The goals:
- To show how Docker fits into bigger workflows,
- To give you a starting point for real deployments,
- And to set up the next features and tools we’ll explore.
Let’s walk through how to set that up.
Development vs Production
When a project starts, it begins with development. This is where we map out what production should be.
But once you deploy to production, environment-specific tweaks become necessary right away (passwords, API endpoints, caching, etc.).
How do we handle that without hardcoding?
Environment Variables
The mysterious .env file — if you don’t look you might not even know it’s there.
It’s a file like any other, but it’s special by convention.
Frameworks look for a file named exactly .env at the root of the project folder and load its key-value pairs as configuration. You might have come across variables, often capitalized, that seem to appear out of nowhere. They may come from a .env file.
Look in your project root to see if one exists:
ls -la
# ← The `a` option prints hidden files, also known as dot files
.env should not be committed to version control, but sometimes it is, or it may be silently created when you run an init script.
On the command line you can run the Linux env command to see your shell’s environment variables.
env
# ← Prints all shell variables
# To print a single value, prefix the name with `$`, e.g.:
echo $USER
echo $PATH
echo $EDITOR
But those variables are not your project’s nor the container’s — they are for your host’s shell environment.
So how do variables in .env make it into your project?
You might have seen libraries literally called dotenv — they provide a convenient way to read from .env files. But often, frameworks make that automatic — which also makes it less obvious.
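Outside of a framework you can pull off the same trick in plain Bash (a minimal sketch, assuming simple KEY=VALUE lines; quote any values that contain spaces):
set -a        # export every variable assigned from now on
source .env   # each KEY=VALUE line becomes an exported variable
set +a        # stop auto-exporting
# The values from .env are now part of your shell environment (check with `env`)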
And Docker?
Docker Env
docker run --env allows you to set an environment variable inside the container for the running processes. Inside the container you can run env to see that variable just like on the host.
docker run --rm --env FOO=BAR alpine sh -c 'echo $FOO'
# ← Prints “BAR”
# Repeat `--env` to set more than one
docker run --rm --env FOO=BAR --env BAZ=QUX alpine env
# ← Prints all env vars including FOO and BAZ
💡 Tip
It’s good to know that in Bash, strings in single quotes are passed as-is, while strings in double quotes are interpolated (expanded) before the command runs. That’s why the example above wraps echo $FOO in single quotes: the variable should be expanded inside the container, not by your host shell.
Bash experiment:
echo 'echo $USER'
# ← Prints “echo $USER”
echo "echo $USER"
# ← Prints “echo your-user-name”
Notice the highlighted $USER inside the double quotes — I did not do that, the syntax highlighter did.
But docker run also has --env-file to add many at once. Following the convention, you run docker run --env-file .env.
echo "FUN=in the sun" >> .env
docker run --rm --env-file .env alpine sh -c 'echo $FUN'
# ← Prints “in the sun”
Notice we touched on two things: environment variables and the .env file.
- Your host’s shell variables are not your project’s variables
- Your project’s code and framework read from the .env file
- docker run --env-file .env includes them inside the container
Why include them as container variables if the code reads from .env?
We might not need to, but they may be needed in more than one place.
You can include variables for more than one thing in .env: for your code, for PostgreSQL, and so on. You can use the same .env for more than one container — variables that aren’t needed are ignored.
If you prefer separation, you can create folders or name your files differently. But when adding complexity, also consider separate projects.
Environment Specific .env
.env should not be included in your project or docker images, but here are a few clarifications.
The actual .env should not be committed, but you can commit “inert” versions such as development.env and production.env. If your project is open source you might include example.env.
How would you “enable” it?
Each environment has its own build process. The build process would run something like cp production.env .env before starting the containers.
Here’s a quick dev example:
- Developer pulls the project repo
- The README.md says to run yarn dev, which does:
  cp dev.env .env
  docker compose -f dev.compose.yml up -d
In production you might have a conceptually similar process, but it could be automatic (and maybe more elaborate).
But your project might be hosted as a private repo on GitHub. GitHub Actions uses YAML files in the .github/workflows folder to build and deploy your code. You would have YAML instructions per deployment branch.
Merging to the production branch would trigger GitHub to run the deployment and production.env would be live in that environment. And the great thing is that environment variables are versioned along with the rest of the code.
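A workflow file for that could look roughly like this (a minimal sketch; the file name, branch name, and final deploy step are placeholders you’d adapt):
# .github/workflows/deploy-production.yml (illustrative)
name: deploy-production
on:
  push:
    branches: [production]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cp production.env .env
      # Then whatever actually ships the stack for you: ssh to the host,
      # push images to a registry, or run
      # `docker compose -f docker-compose.prod.yml up -d` on a self-hosted runner.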
Should passwords be versioned too?
Secrets as Wallets
💡 Tip
You can skip this if this is not a priority for now.
Credentials are a different story — they need encryption. Instead of plain branch.env files, you can keep branch.sops.env files — encrypted versions that can be decrypted and merged when needed.
Encryption keeps values private, but don’t treat it as your only lock. Use a private repo or Git server as well. Encryption is the safety net, not the door.
The name sops in the file is no accident.
SOPS (originally from Mozilla, now community-maintained) solves the developer workflow problem: how do you commit config safely and let automation decrypt it later with the right keys? It combines encryption + versioning, makes editing tolerable, and supports pluggable key management. Unlike cloud-native alternatives, it’s open source and not tied to any provider — it even complements Kubernetes Secrets.
Here’s the neat Docker twist: you don’t have to provision extra software. Normally you’d think:
apt install sops
sops decrypt secrets.sops.env
But Docker lets you skip provisioning. You can just run the command from an image:
# Encrypt locally
docker run -v $PWD:/sops -w /sops chatwork/sops encrypt production-creds.env > production-creds.sops.env
git add production-creds.sops.env
git commit -m "Add production credentials"
# Decrypt on servers
docker run -e SOPS_AGE_KEY="AGE-SECRET-KEY-…" \
-v $PWD:/sops -w /sops chatwork/sops decrypt production-creds.sops.env
So in a few lines, you’ve seen:
- Docker as a tool for one-off commands
- A quick intro to SOPS
- A simple CI/CD-ready workflow
Let’s dig into the details.
Docker for One-Off Commands
Docker isn’t just for running services — it can be a drop-in replacement for installing command-line tools. Many utilities have their own Docker images, and there are even Swiss Army images that pack “everything” you might need. And if nothing fits, you can build your own.
“But isn’t running a whole container for one command overkill?”
Not really. The Alpine image is only ~8 MB uncompressed — a minimal OS in the size of a song file. Specialized images are bigger than an OS package, but thanks to shared layers the overhead isn’t as large as it looks.
Think of Docker one-offs like NPX (running npm packages without installing) or like a version manager (pyenv, NVM). You can even alias docker run commands so they feel like local binaries:
alias kubectl2="docker run --rm bitnami/kubectl"
kubectl2
# ← Prints kubectl subcommands
Here kubectl2 avoids colliding with a locally installed kubectl, letting you compare versions:
kubectl2 version
kubectl version
You can even pin multiple versions side by side:
alias kubectl_1242="docker run --rm bitnami/kubectl:1.24.2"
alias kubectl_1287="docker run --rm bitnami/kubectl:1.28.7"
kubectl_1242 version
# ← “Client Version: v1.24.2”
kubectl_1287 version
# ← “Client Version: v1.28.7”
(For context: kubectl is the Kubernetes equivalent of the docker CLI.)
A few other utilities with their own images:
- jq: demisto/jq
- yq: linuxserver/yq
- handbrake: jlesage/handbrake
And for an “everything bagel”: leodotcloud/swiss-army-knife.
SOPS Command & CI/CD Workflow
Now let’s look at the sops command. Like an iceberg, the visible CLI hides a lot beneath the surface. The two subcommands we care about most are:
sops encrypt
sops decrypt
But they won’t work unless we do two things first:
- Generate encryption keys
- Provide a .sops.yaml config in the working directory
Step 1: Create a workspace and a minimal image
mkdir sops-demo && cd sops-demo
# Dockerfile
cat > Dockerfile <<'EOF'
FROM alpine
# Install `age` and `sops`
RUN apk add --no-cache age sops
EOF
docker build -t local/secrets .
📝 Note
“age… is a simple, modern and secure file encryption tool, format, and Go library… The author pronounces it [aɡe̞] with a hard g…”
Here we use age to generate a private-public keypair — sops has built-in age support:
docker run --rm local/secrets age-keygen
# ← Prints private + public key
That was ephemeral. To keep one:
docker run --rm local/secrets age-keygen > keys.txt
Step 2: Configure SOPS
Create .sops.yaml with the saved public key:
cat > .sops.yaml <<EOF
creation_rules:
- age: $(docker run --rm -v "$PWD/keys.txt":/keys.txt local/secrets age-keygen -y /keys.txt)
EOF
No quotes around EOF here — we want shell substitution to embed the key.
Step 3: Encrypt & decrypt files
Add some test data:
echo "SOME=PASSWORD" > staging.env
Encrypt it:
docker run --rm -v $PWD:/sops -w /sops local/secrets \
sops encrypt staging.env > staging.encrypted.env
Decrypt with the private key:
docker run --rm \
-e SOPS_AGE_KEY="$(tail -n 1 keys.txt)" \
-v $PWD:/sops -w /sops local/secrets \
sops decrypt staging.encrypted.env
# ← Prints "SOME=PASSWORD"
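To connect this back to --env-file, one simple pattern is to decrypt into a temporary file, hand it to a container, and clean up (this reuses the files created above):
docker run --rm \
  -e SOPS_AGE_KEY="$(tail -n 1 keys.txt)" \
  -v $PWD:/sops -w /sops local/secrets \
  sops decrypt staging.encrypted.env > staging.tmp.env
docker run --rm --env-file staging.tmp.env alpine sh -c 'echo $SOME'
# ← Prints “PASSWORD”
rm staging.tmp.env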
Step 4: CI/CD integration
- Public key: safe to distribute anywhere.
- Private key: must be secret. Store it in your password manager, or in CI/CD as a secure variable.
- In Jenkins, GitLab, or GitHub Actions, only authenticated team members (and the pipeline) can access it.
This pattern is widely used: keys live in CI/CD, configs live in Git, automation decrypts only at deploy time. Not the only workflow, but a secure one that fits most teams.
Logging & Monitoring
Running containers without visibility is like flying blind. Docker gives you just enough to peek inside, but the built-in tools hit limits fast.
Container Logs
By default, Docker uses the json-file logging driver. Each container’s logs are written as JSON under /var/lib/docker/containers/.../. You can view them with:
docker logs <container>
This works fine for quick debugging, but it doesn’t scale:
- Logs remain on the host that produced them.
- Deleting a container deletes its logs.
- Searching across many containers turns into endless greps.
Logging Drivers
Docker can swap out the default log driver and stream logs elsewhere:
- syslog → forwards to the system logger
- journald → integrates with systemd’s journal
- fluentd, gelf, awslogs, splunk, etc. → ship logs off-host
You can change the driver globally in /etc/docker/daemon.json or per container:
docker run --log-driver=syslog nginx
Choosing the right driver is the first step toward centralizing logs — the foundation for monitoring and alerting across your containers.
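To check which driver a given container ended up with, inspect it:
docker inspect -f '{{ .HostConfig.LogConfig.Type }}' <container>
# ← json-file (the default), journald, syslog, …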
Promote Container Logs to System Logs
Using journald is a big step up from the default json-file. It keeps logs in a central system log, rotates them automatically, and lets you query with journalctl.
Configure Docker
Create /etc/docker/daemon.json (or merge into your existing config):
sudo mkdir -p /etc/docker
cat > /etc/docker/daemon.json <<'EOF'
{
"log-driver": "journald",
"log-opts": {
"tag": "[{{.Name}} — {{.ID}}]"
}
}
EOF
If you already have settings in daemon.json, merge with jq instead of overwriting:
sudo cp /etc/docker/daemon.json /etc/docker/daemon.json.bak
sudo jq '. + {
"log-driver": "journald",
"log-opts": { "tag": "[{{.Name}} — {{.ID}}]" }
}' /etc/docker/daemon.json.bak | sudo tee /etc/docker/daemon.json
Restart Docker:
sudo systemctl restart docker
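Then confirm the daemon picked up the new default:
docker info | grep -i 'logging driver'
# ← Logging Driver: journald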
Step 1: Recreate Containers
Log-driver changes apply only to new containers. Existing ones must be recreated:
docker compose up -d --force-recreate
Roughly equivalent to:
docker compose down
docker compose up -d
Step 2: View Logs
- Still works: docker logs <container>
- Now also in journald:
  journalctl -f                          # follow live logs
  journalctl --since="10 minutes ago"
  journalctl                             # view all logs
journalctl drops you into less:
- ↓ or PgDn to scroll
- / to search
- q to exit
Log files live in:
- /var/log/journal/<MACHINE-ID>/system.journal
- /var/log/journal/<MACHINE-ID>/user-<USER-ID>.journal
Archived logs have longer filenames in the same directory.
ls -lthr /var/log/journal/*/ | awk '{print $5, $6, $7, $9}'
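Because the journald driver attaches container metadata to every entry, you can also filter by container directly:
journalctl CONTAINER_NAME=multiapp_backend --since "10 minutes ago"
journalctl -f CONTAINER_NAME=multiapp_frontend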
Step 3: Quick Web UI (Dozzle)
docker run -d --name dozzle \
-p 8080:8080 \
-v /var/run/docker.sock:/var/run/docker.sock:ro \
amir20/dozzle
Open http://localhost:8080 to:
- View logs from all containers in one place
- Search logs (Ctrl+F) or container names (Ctrl+K)
- Download logs
- See memory and CPU usage
⚠️ If userns-remap is enabled, Dozzle won’t have permission to read the socket.
Troubleshoot with:
docker ps -a
docker logs dozzle
journalctl -u docker
To keep Dozzle running across reboots:
docker rm -f dozzle
docker run -d --name dozzle \
-p 8080:8080 \
-v /var/run/docker.sock:/var/run/docker.sock:ro \
--restart always \
amir20/dozzle
And ensure Docker itself starts on boot:
sudo systemctl enable --now docker
Centralized Logs
Local logs are fine, but centralizing them is better. All your hosts send logs to a single service so you have everything in one place.
Why centralize?
- Easier access and searching
- No need to SSH into hosts
- Developers can view production logs without host access
- Faster troubleshooting when everyone sees the same data
Options:
- SaaS (Logtail/BetterStack, Splunk, etc.) — feature-rich, but often expensive at scale.
- Self-hosted — fewer bells and whistles, but free and under your control.
Since this is a Docker guide, we’ll show a self-hosted stack:
- OpenObserve: central UI + log storage
- Fluent Bit: lightweight agent that ships logs from hosts to OpenObserve
Make sure your Docker log-driver is set to journald in /etc/docker/daemon.json so Fluent Bit can pick up container logs.
Start OpenObserve
docker run -d --name o2 \
-p 5080:5080 \
-e ZO_ROOT_USER_EMAIL="your@email" \
-e ZO_ROOT_USER_PASSWORD="your_password" \
-v o2-data:/data \
openobserve/openobserve:latest
Browse to http://localhost:5080 and log in.
From the UI: Data Sources → Custom → Fluent Bit.
Or go directly:
🔗 http://localhost:5080/web/ingestion/custom/logs/fluentbit
Start Fluent Bit
Create a config:
mkdir centralized-logs
cd centralized-logs
cat > fluent-bit.conf <<'EOF'
[SERVICE]
Flush 5
Log_Level warn
storage.path /buffers
[INPUT]
Name systemd
Tag journald.*
Read_From_Tail On
DB /state/systemd.db
DB.Sync Normal
[OUTPUT]
Name http
Match *
Host localhost
Port 5080
URI /api/myorg/mystream/_json
Format json
Json_date_key _timestamp
Json_date_format iso8601
HTTP_User your@email
# HTTP_Passwd is the ingestion/API token from the OpenObserve UI, not your login password
HTTP_Passwd _CHANGE_ME_
compress gzip
EOF
The myorg/mystream path defines the organization and stream in OpenObserve. Change them to create different streams.
Run Fluent Bit:
docker run -d --name fb --network host \
-v /var/log/journal:/var/log/journal:ro \
-v /run/log/journal:/run/log/journal:ro \
-v /etc/machine-id:/etc/machine-id:ro \
-v $PWD/fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf:ro \
-v fb-state:/state \
-v fb-buffers:/buffers \
fluent/fluent-bit
Troubleshooting
Check containers are running:
docker ps
docker logs fb
docker logs o2
journalctl -u docker --since="10 minutes ago"
Browse to http://localhost:5080/web/logs.
Tips:
- Pick the journald stream in the dropdown under “Query Editor”
- Click Run query (green button, top right)
- Filter with built-in fields:
  - _machine_id → filter by host
  - container_name → filter by container
With this setup you now have a free, self-hosted, centralized log platform. Multiple hosts → one UI → searchable, filterable logs.
Monitoring Containers
Logs tell you what happened. Monitoring tells you what’s happening right now.
Docker itself exposes runtime metrics through the API and with:
docker stats
# ← live CPU, memory, network usage per container (Ctrl+C to quit)
Good for quick checks, but far too limited for real monitoring.
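For a one-shot snapshot (handy in scripts or when pasting into an issue), add --no-stream and a format:
docker stats --no-stream --format 'table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}'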
Terminal Dashboards
You don’t need a full Prometheus + Grafana stack just to see what’s going on. A few lightweight UIs make monitoring containers much easier:
- ctop → top-like view of containers
- lazydocker → full-screen TUI with logs, stats, restart controls
- dockly → interactive terminal UI written in Node.js
All three talk to the Docker socket and run anywhere.
# CTOP
docker run -it --rm \
-v /var/run/docker.sock:/var/run/docker.sock wrfly/ctop
# LAZYDOCKER
docker run -it --rm \
-v /var/run/docker.sock:/var/run/docker.sock lazyteam/lazydocker
# DOCKLY (requires node)
npx dockly
npx ships with Node.js. If missing, install Node.js to get node, npm, and npx.
These tools are simple demos of what monitoring can look like without heavy setup. Later, you can move on to cAdvisor, Prometheus, and Grafana for long-term metrics and dashboards.
Metrics and Tools for Monitoring
Most teams eventually add external monitoring stacks:
- cAdvisor → per-container resource metrics (CPU, memory, network, I/O)
- Prometheus + Grafana → time-series metrics + dashboards
- ELK/EFK (Elasticsearch/Fluentd/Kibana or Fluent Bit) → logs + search
- Datadog, New Relic, etc. → hosted monitoring/logging
These tools scrape metrics and logs from all nodes, giving you a single view of your systems.
Demo Setup
In this demo we’ll show Prometheus scraping from both Node Exporter (host metrics) and cAdvisor (container metrics).
A production setup usually looks like this: Prometheus running centrally, scraping exporters across multiple hosts.
Configuration
Create a work folder and two files:
mkdir monitoring-demo
cd monitoring-demo
# Prometheus config
cat > prometheus.yml <<'EOF'
global:
scrape_interval: 5s
scrape_configs:
- job_name: "node"
static_configs:
- targets: ["node-exporter:9100"]
- job_name: "cadvisor"
static_configs:
- targets: ["cadvisor:8080"]
EOF
# Docker Compose stack
cat > compose.yml <<'EOF'
services:
prometheus:
image: prom/prometheus
container_name: prometheus
ports: ["9090:9090"]
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml
networks: [monitoring]
node-exporter:
image: prom/node-exporter
container_name: node-exporter
ports: ["9100:9100"]
networks: [monitoring]
cadvisor:
image: gcr.io/cadvisor/cadvisor:latest
container_name: cadvisor
ports: ["8080:8080"]
volumes:
- /:/rootfs:ro
- /var/run:/var/run:ro
- /sys:/sys:ro
- /var/lib/docker/:/var/lib/docker:ro
- /dev:/dev:ro
- /dev/disk:/dev/disk:ro
networks: [monitoring]
networks:
monitoring:
EOF
Launch
docker compose -f compose.yml up -d
docker compose ps
# ← should list 3 containers
Open in your browser:
- Node Exporter:
http://localhost:9100/metrics - cAdvisor:
http://localhost:8080/metrics - Prometheus:
http://localhost:9090
Check targets in Prometheus:
🔗 http://localhost:9090/targets
Status should be UP.
First Query
Click “Graph” in Prometheus and run:
(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes)
- Press Execute → then Graph.
- Zoom the timeline with
-and+.
This shows host memory usage (from Node Exporter). For container metrics, try queries like:
rate(container_cpu_usage_seconds_total[5m]) * 100
container_memory_usage_bytes / 1024 / 1024
This demo is just a first step. Add Grafana on top and you’ll get beautiful dashboards, alerts, and long-term storage.
A No Nonsense Opinion
Dashboards can feel like eye candy in DevOps — and often are. They only become useful if they’re properly set up and actually used. Without that investment, they’re just pretty pictures.
Alerts can be powerful — you learn about issues before your users do. But too many alerts create noise you’ll tune out, while too few (critical-only) can make you late to act.
So ask yourself: how critical are your apps?
The truth is many companies rely on users to raise the alarm. Dedicated monitoring teams are rare, and building resilient architecture is often a better first priority than obsessing over dashboards.
Start with the basics:
- Persistent logs — containers are ephemeral; logs shouldn’t vanish when they do.
- Simple dashboards — lightweight UIs can give the whole team visibility without a steep learning curve.
From there, let your needs drive your tools. Upgrade only when you need to — not just because the latest monitoring stack looks impressive.
Scaling from the Hardware Up
Containers are just processes, and processes live on hardware.
Your CPU is like a highway with multiple lanes (cores/threads). The scheduler decides which process gets to use a lane at any given time. If lanes are busy, new jobs wait in line.
- Single-threaded process: uses only one lane, no matter how many are open.
- Multi-threaded process: can spread work across several lanes at once.
- Many single-threaded processes: if you spin up several copies, the scheduler can place them on different lanes, so they run in parallel.
Docker doesn’t change how CPUs or threads work — containers are just processes with extra isolation. But containers make it easier to run multiple copies of an app.
If your app can’t keep up with demand, you can:
- Optimize your code (often the real bottleneck).
- Add more hardware (CPU, RAM, servers).
- Run more containers (scaling out), at least until you hit other limits.
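Docker can also cap how much CPU and memory a single container may use. These are standard docker run flags; the values below are arbitrary examples:
docker run -d --rm --name capped --cpus 1.5 --memory 512m nginx:alpine
docker stats --no-stream capped
# ← MEM USAGE / LIMIT shows the 512 MiB cap
docker rm -f capped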
Scaling in Practice
Docker itself doesn’t scale for you — you design the architecture.
A common option you may see is:
docker compose up --scale web=3
This avoids duplicating services in your docker-compose.yml. But two caveats apply:
- Statelessness — containers must not store data locally. Use volumes or external storage.
- Load balancing — Docker does not distribute traffic between scaled containers. You need a proxy (e.g. Nginx, Traefik) in front.
A quick test: remove and recreate a container (--rm, or docker compose down and up again). If everything still works and no data is lost, your service keeps no local state and is safe to scale.
Minimal Example
Here’s a simple setup: a web service, scaled to three replicas, with Nginx acting as the load balancer.
services:
web:
image: nginx:alpine
volumes:
- ./html:/usr/share/nginx/html:ro
deploy:
replicas: 3 # works only in Swarm mode; for Compose use `--scale web=3`
proxy:
image: nginx:alpine
ports:
- "8080:80"
volumes:
- ./proxy.conf:/etc/nginx/nginx.conf:ro
depends_on:
- web
And a matching proxy.conf:
events {}
http {
upstream backend {
server web:80;
server web:80;
server web:80;
}
server {
listen 80;
location / {
proxy_pass http://backend;
}
}
}
And an HTML file:
mkdir html
cat > html/index.html <<'EOF'
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width" />
<title>Test</title>
</head>
<body>
<h1>Hello Scaling</h1>
</body>
</html>
EOF
Now, run:
docker compose up --scale web=3
The proxy distributes traffic to the web containers:
docker compose logs -f
# ← Shows logs from proxy and web containers together
Open your browser at http://localhost:8080. Place your terminal and browser side by side, then refresh many times. You’ll see the proxy passing requests to web-1, web-2, and web-3 in turn.
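If you prefer staying in the terminal, a small request loop makes the round-robin just as visible:
for i in $(seq 1 9); do curl -s -o /dev/null http://localhost:8080/; done
docker compose logs --tail=20 web
# ← Access-log lines alternate between web-1, web-2, and web-3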
Wrap-Up
Scaling with Docker is not magic. It’s about understanding the limits of your code, your hardware, and your architecture. Docker gives you the tools — but scaling is always up to you.
📌 Recap
- Development ≠ production — configs must adapt.
- Secrets, logging, and monitoring add resilience.
- Scaling comes from both hardware and replicas.
Final Demystifications
Let’s clear up some last misunderstandings before we wrap up.
Docker Compose ≠ Production Orchestrator
📝 Note
Some small teams still run production with Compose successfully — it’s just not a full orchestrator.
Reality
Docker Compose was created for local development and small deployments. It lets you describe multi-container apps in a docker-compose.yml file and run them with docker compose up.
Misconception
Many treat Compose as if it were a lightweight Kubernetes or Swarm. But Compose doesn’t provide:
- Scheduling across multiple nodes
- Rolling updates
- Service discovery at scale
- Self-healing
Nuance
Compose is fine in production if you have a single host with simple workloads. But for cluster-level orchestration you need Kubernetes, Nomad, or even Docker Swarm (though Swarm is mostly legacy now).
Dockerfile ≠ Only Way to Build Images
Reality
Docker popularized the Dockerfile format, but the OCI Image Format standard means other tools can produce container images too.
Misconception
You must use the Docker daemon to build images. Not true.
Alternatives:
- Buildah → Build without a Docker daemon.
- Kaniko → Build inside a container or CI/CD system, no privileged access.
- img / Podman → Daemonless builds.
- Bazel rules_docker → High-level, reproducible builds.
Nuance
The Dockerfile syntax is portable, but Docker itself is not required to build images.
Docker ≠ Secure by Default
Reality
Docker uses Linux namespaces and cgroups to isolate processes. It’s lighter than a VM boundary.
Misconception
Container isolation = hypervisor-level security.
Not true. Containers share the host kernel — a kernel exploit can break out of all containers.
Nuance
Docker’s defaults are okay for development, but production usually adds layers (a quick example follows this list):
- User namespaces (map root in container to non-root on host)
- Seccomp profiles (syscall filters)
- Capabilities dropped (restrict process powers)
- AppArmor / SELinux profiles
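Several of these are plain docker run flags. A minimal illustration, not a hardening guide:
docker run --rm \
  --cap-drop ALL \
  --read-only \
  --security-opt no-new-privileges \
  alpine id
# ← Still runs, but the process has no Linux capabilities, a read-only
#   root filesystem, and can’t gain new privileges via setuid binaries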
Analogy
- Containers → Apartments in the same building (shared infrastructure).
- VMs → Separate houses.
Docker is Not the Only One
Docker isn’t the only player in the container world:
- LXC → Older, low-level Linux containers.
- runc → The actual runtime Docker uses under the hood.
- containerd → Core container runtime (also under Docker, also standalone).
- Podman → Daemonless container engine, Docker-compatible CLI.
- Kubernetes (K8s) → The heavyweight orchestrator.
- K3s → Lightweight Kubernetes distribution for edge or small clusters.
LXC
Background
Linux Containers (LXC) is one of the earliest container technologies, originating from IBM research in the late 2000s.
Docker connection
Docker originally built on top of LXC before switching to its own runtime.
Role today
Still used in some environments (notably Proxmox), but less common for modern app deployment.
runc
Background
runc is a lightweight, low-level runtime developed by Docker and later donated to the Open Container Initiative (OCI).
What it does
It’s the actual binary that runs containers, following the OCI runtime spec.
Independence
Yes, you can run runc directly — but it’s very barebones. Most people interact with it indirectly through higher-level tools.
Distinction from containerd
Think of runc as the “executor” that spawns containers; containerd manages the lifecycle and orchestration of those runs.
containerd
Background
Docker realized the need to split its monolithic platform into parts and initiated containerd as an open source project.
What it does
containerd handles the full container lifecycle — pulling images, starting/stopping containers, managing storage and networking.
Distinction from runc
containerd calls runc to actually create and run containers. runc is the runtime; containerd is the manager.
Adoption
It’s now used under the hood by Kubernetes and Docker itself.
Podman
Background
Podman was developed by Red Hat as a daemonless alternative to Docker.
Why it exists
Security and simplicity. By removing the long-running daemon, Podman can run containers in a more rootless and composable way.
Key feature
CLI is Docker-compatible — often you can replace docker with podman.
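In practice that compatibility looks like this (assuming Podman is installed on the host):
podman run --rm alpine echo "hello, no daemon involved"
alias docker=podman   # a common shortcut on Podman-only machines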
Kubernetes (K8s)
Background
Kubernetes was created at Google and open-sourced in 2014.
Why it exists
Google needed a system to manage containers at planetary scale, evolving lessons from its internal Borg system.
Role
The heavyweight orchestrator — scaling, scheduling, service discovery, rolling updates, self-healing, etc.
K3s
Background
K3s was created by Rancher Labs (now part of SUSE).
Why it exists
To provide a lightweight Kubernetes distribution optimized for edge, IoT, and resource-constrained environments.
Role
It’s Kubernetes-compatible but strips out non-essentials to make running clusters simpler and lighter.
📌 Recap
- Docker ≠ orchestrator, Dockerfile ≠ only build path, Docker ≠ secure by default.
- Tools like LXC, runc, containerd, Podman, and Kubernetes fill other roles.
- Docker’s impact is huge, but it’s just one piece of the container ecosystem.
Wrapping Up
You’ve now seen Docker from every angle — from hello-world to images, containers, networks, security, Compose, scaling, and the broader ecosystem.
If you walk away with one idea, let it be this: Docker isn’t just commands — it’s a way of packaging and running software that fits into a much bigger story.
Where you go next depends on your needs:
- Keep using Docker for local dev and small deployments.
- Learn Kubernetes or Nomad if orchestration is on the horizon.
- Or explore Podman and containerd if you’re curious about alternatives.
Whatever path you take, you now have the foundation to understand how containers really work — and that’s the real superpower.