
Module 13 Lesson 1: Running Ollama in Docker

Isolation and Portability. How to containerize Ollama for consistent deployment across any server.

Ollama in Docker: Contain Your Brain

Until now, we have installed Ollama as a "Native Application." But for production—especially on Linux servers—Docker is the industry standard. Containerizing Ollama makes it easy to update, back up, and move between different machines without worrying about "Dependency Hell."

1. The Official Image

Ollama provides a pre-built image on Docker Hub: docker pull ollama/ollama
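
Once the pull completes, you can confirm the image is available locally: docker image ls ollama/ollama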


2. Basic Run (CPU Only)

If you just want to test it: docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

  • -d: Run in the background.
  • -v ollama:/root/.ollama: This is critical! It creates a "Persistent Volume" (a Docker named volume) so your downloaded models aren't lost when the container is removed or recreated.
  • -p 11434:11434: Maps the Ollama port to your host.
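
With the container up, you can pull and chat with a model by exec-ing into it (llama3 here is just an example; any model from the Ollama library works):

docker exec -it ollama ollama run llama3

Or talk to the API directly from the host:

curl http://localhost:11434/api/generate -d '{"model": "llama3", "prompt": "Why is the sky blue?"}'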

3. The GPU Challenge

By default, a Docker container cannot "see" your GPU. You need the NVIDIA Container Toolkit.
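
On a Debian/Ubuntu host, the install is roughly this (a sketch; the repository-setup step is distro-specific, so follow NVIDIA's official instructions for your system first):

sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker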

Run with GPU access: docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

  • --gpus=all: This is the magic flag that lets Ollama use your RTX or Tesla graphics cards from inside the container.
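
A quick sanity check that the container actually sees the GPU (the NVIDIA runtime normally injects the nvidia-smi utility into the container):

docker exec -it ollama nvidia-smi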

4. Why Use Docker for Local AI?

A. Environment Consistency

You can build a "Stack" that includes Ollama, a Vector Database (Chroma), and a Web UI (Open WebUI) and share the whole thing with a teammate using a single docker-compose.yaml file.
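
Here is a minimal sketch of such a stack (the image names, ports, and OLLAMA_BASE_URL variable reflect the projects' documentation at the time of writing; verify them before deploying):

services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama:/root/.ollama
    ports:
      - "11434:11434"
  chroma:
    image: chromadb/chroma
    ports:
      - "8000:8000"
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    ports:
      - "3000:8080"
    depends_on:
      - ollama

volumes:
  ollama:

A teammate runs docker compose up -d and gets the identical stack, with the services reaching each other by name over Docker's internal network.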

B. Clean Updates

To update Ollama in Docker, you just pull the new image and restart. You don't have to worry about the installer messing up your system paths or registry keys.
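
The whole update cycle is four commands; thanks to the named volume, your downloaded models survive the container being deleted:

docker pull ollama/ollama
docker stop ollama
docker rm ollama
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama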

C. Resource Limiting

You can tell Docker: "Only let Ollama use 8GB of RAM and 4 CPU cores." This keeps a runaway inference job from starving every other service on the machine.
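
In plain docker run terms, that limit looks like this (8g and 4 are example values; size them to your host):

docker run -d --memory=8g --cpus=4 -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama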


Key Takeaways

  • Docker is the standard for professional and server-side Ollama deployments.
  • The Official Image is found at ollama/ollama.
  • You MUST use Volumes (-v) to persist your models.
  • NVIDIA Container Toolkit is required for GPU acceleration inside Docker.
