# Deploy AI Agents with Docker: A Production Guide
Getting an AI agent working in a local Python environment is one thing — keeping it running reliably in production is another. Docker solves the portability and reproducibility problems that plague AI agent deployments: dependency conflicts, environment drift, API key exposure, and inconsistent behavior across dev and prod.
This tutorial walks through containerizing a real LangChain-based AI agent, from a minimal Dockerfile to a production-grade multi-stage build with proper secret management, health checks, and logging. By the end you will have a deployable Docker image that runs the same way in every environment.
## What You'll Learn
- How to write a Dockerfile optimized for Python AI agent workloads
- How to manage API keys and secrets securely using environment variables
- How to use multi-stage builds to keep production images small
- How to add health checks and graceful shutdown to agent containers
- How to orchestrate multiple agent containers with Docker Compose
## Prerequisites
- Docker Desktop 25+ installed locally
- Python 3.10+ familiarity
- A working AI agent (we use the one from the LangChain tutorial)
- Understanding of what AI agents are
## Architecture Overview
We will containerize a FastAPI-wrapped LangChain agent. The final setup includes:
- A FastAPI service that exposes the agent over HTTP
- A Docker image built in two stages: a build stage that installs dependencies and a slim runtime stage
- A docker-compose.yml for local multi-service orchestration
- Environment-variable injection for secrets — no hardcoded keys
## Step 1: The Agent Service
Create the project structure:
```
ai-agent-docker/
├── app/
│   ├── __init__.py
│   ├── agent.py
│   └── main.py
├── Dockerfile
├── docker-compose.yml
├── requirements.txt
└── .env.example
```
The core agent (app/agent.py):
```python
# app/agent.py
import os

from langchain_openai import ChatOpenAI
from langchain.agents import create_react_agent, AgentExecutor
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_core.prompts import PromptTemplate


def build_agent() -> AgentExecutor:
    llm = ChatOpenAI(
        model="gpt-4o-mini",
        temperature=0,
        api_key=os.getenv("OPENAI_API_KEY"),
    )
    tools = [DuckDuckGoSearchRun()]
    # A ReAct prompt must reference {tools} and {tool_names}, or
    # create_react_agent raises a missing-variables error. It also needs
    # format instructions so the model's output is parseable.
    prompt = PromptTemplate.from_template(
        "Answer the following question. Use tools if needed.\n\n"
        "You have access to the following tools:\n{tools}\n\n"
        "Use this format:\n"
        "Question: the input question\n"
        "Thought: reason about what to do next\n"
        "Action: one of [{tool_names}]\n"
        "Action Input: the input to the action\n"
        "Observation: the result of the action\n"
        "... (Thought/Action/Observation can repeat)\n"
        "Thought: I now know the final answer\n"
        "Final Answer: the answer to the question\n\n"
        "Question: {input}\n"
        "Thought: {agent_scratchpad}"
    )
    agent = create_react_agent(llm, tools, prompt)
    return AgentExecutor(
        agent=agent,
        tools=tools,
        verbose=False,
        max_iterations=5,
        return_intermediate_steps=True,  # needed to report iteration counts
    )


# Module-level singleton — initialized once at container startup
executor = build_agent()
```
The FastAPI wrapper (app/main.py):
```python
# app/main.py
import asyncio
import logging
from contextlib import asynccontextmanager

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

from app.agent import executor

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
logger = logging.getLogger(__name__)


class QueryRequest(BaseModel):
    question: str


class QueryResponse(BaseModel):
    answer: str
    iterations: int


@asynccontextmanager
async def lifespan(app: FastAPI):
    logger.info("Agent container starting up")
    yield
    # uvicorn installs its own SIGINT/SIGTERM handlers and runs this
    # shutdown hook on graceful exit, so no manual signal handling is needed.
    logger.info("Agent container shutting down")


app = FastAPI(title="AI Agent Service", lifespan=lifespan)


@app.get("/health")
async def health():
    return {"status": "ok"}


@app.post("/query", response_model=QueryResponse)
async def query(request: QueryRequest):
    try:
        # executor.invoke is synchronous; run it in a worker thread so it
        # does not block the event loop.
        result = await asyncio.to_thread(executor.invoke, {"input": request.question})
        return QueryResponse(
            answer=result.get("output", ""),
            iterations=len(result.get("intermediate_steps", [])),
        )
    except Exception as e:
        logger.error("Agent error: %s", e)
        raise HTTPException(status_code=500, detail=str(e))
```
## Step 2: Requirements File
Pin your dependencies explicitly. Floating versions are one of the most common causes of broken production deployments.
```
# requirements.txt
fastapi==0.115.0
uvicorn[standard]==0.30.6
langchain==0.3.0
langchain-openai==0.2.0
langchain-community==0.3.0
duckduckgo-search==6.2.13  # required at runtime by DuckDuckGoSearchRun
pydantic==2.9.0
python-dotenv==1.0.1
httpx==0.27.2
```
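To keep floating versions from sneaking back in, a check like the following can run in CI. This is a small sketch using only the standard library; the `unpinned` name and the exact-pin policy are illustrative assumptions, not part of any tool.

```python
# check_pins.py — flag requirement lines that are not exact '==' pins
import re

# Matches "name==version", allowing extras like "uvicorn[standard]"
PIN_RE = re.compile(r"^[A-Za-z0-9._\[\],-]+==[A-Za-z0-9.]+$")

def unpinned(requirements_text: str) -> list[str]:
    """Return requirement lines that float (no exact '==' pin)."""
    bad = []
    for line in requirements_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if not PIN_RE.match(line):
            bad.append(line)
    return bad
```

Run it against `requirements.txt` and fail the build if the returned list is non-empty.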
## Step 3: Write the Dockerfile
This is a multi-stage build. The builder stage installs all dependencies into a virtual environment. The runtime stage copies only the venv and your source code — no build tools, no cache, minimal attack surface.
```dockerfile
# Dockerfile
# ---- Stage 1: Builder ----
FROM python:3.12-slim AS builder
WORKDIR /build

# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc \
    && rm -rf /var/lib/apt/lists/*

# Create a virtual environment and put it on PATH
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Install Python dependencies
COPY requirements.txt .
RUN pip install --upgrade pip && \
    pip install --no-cache-dir -r requirements.txt

# ---- Stage 2: Runtime ----
FROM python:3.12-slim AS runtime
WORKDIR /app

# Copy the virtual environment from the builder stage
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Create a non-root user for security
RUN useradd --create-home --shell /bin/bash appuser

# Copy application source code, owned by the non-root user
COPY --chown=appuser:appuser app/ ./app/
USER appuser

# Expose the service port
EXPOSE 8080

# Health check — Docker will poll this every 30s
HEALTHCHECK --interval=30s --timeout=5s --start-period=15s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8080/health')"

# Start the server
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8080", "--workers", "2"]
```
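The inline HEALTHCHECK command above works, but pulling it into a small script makes room for a bounded timeout and an explicit status-code check. A sketch (the `healthcheck.py` file name is an assumption; the script would need to be copied into the image):

```python
# healthcheck.py — standalone probe with an explicit exit code and timeout
import sys
import urllib.request

def probe(url: str = "http://localhost:8080/health", timeout: float = 5.0) -> int:
    """Return 0 if the endpoint answers 200 within the timeout, else 1."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 0 if resp.status == 200 else 1
    except Exception:
        return 1

if __name__ == "__main__":
    sys.exit(probe())
```

The Dockerfile line would then become `HEALTHCHECK ... CMD python healthcheck.py`.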
## Step 4: Manage Secrets with Environment Variables
Never bake API keys into your image. Use a .env file locally and environment variables in production.
```bash
# .env.example — commit this to source control
OPENAI_API_KEY=your_openai_key_here
LOG_LEVEL=INFO
WORKERS=2
```

```bash
# .env — DO NOT commit this file. Add to .gitignore.
OPENAI_API_KEY=sk-proj-...
LOG_LEVEL=INFO
WORKERS=2
```
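A container that starts without its API key will only fail on the first request, which is harder to debug than failing at startup. A small fail-fast check, sketched here with only the standard library (python-dotenv handles the real parsing in the app; the function names are illustrative):

```python
# env_check.py — fail fast if required variables are missing
from pathlib import Path

REQUIRED = ("OPENAI_API_KEY",)

def parse_env_file(path: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and comments."""
    env: dict[str, str] = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

def missing_keys(env: dict[str, str]) -> list[str]:
    """Return required keys that are absent or empty."""
    return [k for k in REQUIRED if not env.get(k)]
```

Calling `missing_keys` during startup (and raising if the list is non-empty) turns a misconfigured deployment into an immediate, obvious crash.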
Add .env to your .gitignore:
```bash
echo ".env" >> .gitignore
```
Build and run with the env file:
```bash
docker build -t ai-agent:latest .
docker run --env-file .env -p 8080:8080 ai-agent:latest
```
## Step 5: Docker Compose for Local Development
When your agent needs supporting services (Redis for caching, Postgres for memory, Langfuse for tracing), Docker Compose is the right tool. See the Langfuse observability tutorial for how to add tracing.
```yaml
# docker-compose.yml
# (the top-level "version" key is obsolete in Compose v2 and omitted here)
services:
  agent:
    build: .
    ports:
      - "8080:8080"
    env_file:
      - .env
    environment:
      - LOG_LEVEL=INFO
    healthcheck:
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8080/health')"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 15s
    restart: unless-stopped
    depends_on:
      redis:
        condition: service_healthy

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  redis_data:
```
Start the stack:
```bash
docker compose up --build
```
Test the running agent:
```bash
curl -X POST http://localhost:8080/query \
  -H "Content-Type: application/json" \
  -d '{"question": "What is the capital of France?"}'
```
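The same request can be made from Python, which is handy for smoke tests. A minimal client sketch using only the standard library (httpx from requirements.txt would work equally well; the `ask_agent` name is illustrative):

```python
# client.py — minimal client for the /query endpoint
import json
import urllib.request

def ask_agent(base_url: str, question: str, timeout: float = 60.0) -> dict:
    """POST a question to /query and return the parsed JSON response."""
    payload = json.dumps({"question": question}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A successful call returns a dict with the `answer` and `iterations` fields defined by `QueryResponse`.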
## Step 6: Production Best Practices
Image hardening:
```bash
# Scan your image for vulnerabilities before pushing
docker scout cves ai-agent:latest
```
Layer caching for faster CI builds:
Keep COPY requirements.txt and pip install as separate layers before copying source code. This way requirements are only reinstalled when requirements.txt changes.
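With BuildKit (the default builder in current Docker releases), a cache mount can additionally persist pip's download cache across builds, so even a changed requirements.txt reinstalls faster. A sketch of the builder-stage variant; note that `--no-cache-dir` is dropped so pip can actually reuse the cache:

```dockerfile
# Builder-stage variant: persist pip's download cache across builds
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
```

The cache mount lives on the build host only and is never included in the final image, so image size is unaffected.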
Tagging strategy:
```bash
# Tag with git SHA for traceability
docker build -t ai-agent:$(git rev-parse --short HEAD) .
docker tag ai-agent:$(git rev-parse --short HEAD) ai-agent:latest
```
Resource limits in production:
```yaml
# In your docker-compose.yml (Compose v2 honors deploy.resources limits;
# Kubernetes uses its own resources syntax in the pod spec)
deploy:
  resources:
    limits:
      cpus: "1.0"
      memory: 1G
    reservations:
      memory: 512M
```
Before deploying to production, review AI agent security best practices and the general deployment guide.
## What's Next
- Add distributed tracing to your containerized agent with Langfuse observability
- Build a more capable agent with LangChain multi-tool patterns
- Add human approval gates to your deployment pipeline using human-in-the-loop patterns
- Explore the LangGraph multi-agent tutorial to orchestrate multiple containerized agents
- Review the AI agent testing guide before you push to production