# Deploy AI Agents with Docker: A Production Guide
Getting an AI agent working in a local Python environment is one thing — keeping it running reliably in production is another. Docker solves the portability and reproducibility problems that plague AI agent deployments: dependency conflicts, environment drift, API key exposure, and inconsistent behavior across dev and prod.
This tutorial walks through containerizing a real LangChain-based AI agent, from a minimal Dockerfile to a production-grade multi-stage build with proper secret management, health checks, and logging. By the end you will have a deployable Docker image that runs the same way in every environment.
## What You'll Learn
- How to write a Dockerfile optimized for Python AI agent workloads
- How to manage API keys and secrets securely using environment variables
- How to use multi-stage builds to keep production images small
- How to add health checks and graceful shutdown to agent containers
- How to orchestrate multiple agent containers with Docker Compose
## Prerequisites
- Docker Desktop 25+ installed locally
- Python 3.10+ familiarity
- A working AI agent (we use the one from the LangChain tutorial)
- Understanding of what AI agents are
## Architecture Overview
We will containerize a FastAPI-wrapped LangChain agent. The final setup includes:
- A FastAPI service that exposes the agent over HTTP
- A Docker image built in two stages: a build stage that installs dependencies and a slim runtime stage
- A docker-compose.yml for local multi-service orchestration
- Environment-variable injection for secrets — no hardcoded keys
## Step 1: The Agent Service
Create the project structure:
```
ai-agent-docker/
├── app/
│   ├── __init__.py
│   ├── agent.py
│   └── main.py
├── Dockerfile
├── docker-compose.yml
├── requirements.txt
└── .env.example
```
The core agent (app/agent.py):
```python
# app/agent.py
import os

from langchain_openai import ChatOpenAI
from langchain.agents import create_react_agent, AgentExecutor
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_core.prompts import PromptTemplate


def build_agent() -> AgentExecutor:
    llm = ChatOpenAI(
        model="gpt-4o-mini",
        temperature=0,
        api_key=os.getenv("OPENAI_API_KEY"),
    )
    tools = [DuckDuckGoSearchRun()]
    # A ReAct prompt must reference {tools} and {tool_names}, or
    # create_react_agent raises a missing-variables error. It also needs
    # format instructions so the model's output is parseable.
    prompt = PromptTemplate.from_template(
        "Answer the following question. Use tools if needed.\n\n"
        "You have access to the following tools:\n{tools}\n\n"
        "Use this format:\n"
        "Question: the input question\n"
        "Thought: reason about what to do next\n"
        "Action: one of [{tool_names}]\n"
        "Action Input: the input to the action\n"
        "Observation: the result of the action\n"
        "... (Thought/Action/Observation can repeat)\n"
        "Thought: I now know the final answer\n"
        "Final Answer: the answer to the question\n\n"
        "Question: {input}\n"
        "Thought: {agent_scratchpad}"
    )
    agent = create_react_agent(llm, tools, prompt)
    return AgentExecutor(
        agent=agent,
        tools=tools,
        verbose=False,
        max_iterations=5,
        return_intermediate_steps=True,  # needed to report iteration counts
    )


# Module-level singleton — initialized once at container startup
executor = build_agent()
```
The FastAPI wrapper (app/main.py):
```python
# app/main.py
import asyncio
import logging
from contextlib import asynccontextmanager

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

from app.agent import executor

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
logger = logging.getLogger(__name__)


class QueryRequest(BaseModel):
    question: str


class QueryResponse(BaseModel):
    answer: str
    iterations: int


@asynccontextmanager
async def lifespan(app: FastAPI):
    logger.info("Agent container starting up")
    yield
    # uvicorn installs its own SIGINT/SIGTERM handlers and runs this
    # shutdown hook on graceful exit, so no manual signal handling is needed.
    logger.info("Agent container shutting down")


app = FastAPI(title="AI Agent Service", lifespan=lifespan)


@app.get("/health")
async def health():
    return {"status": "ok"}


@app.post("/query", response_model=QueryResponse)
async def query(request: QueryRequest):
    try:
        # executor.invoke is synchronous; run it in a worker thread so it
        # does not block the event loop.
        result = await asyncio.to_thread(executor.invoke, {"input": request.question})
        return QueryResponse(
            answer=result.get("output", ""),
            iterations=len(result.get("intermediate_steps", [])),
        )
    except Exception as e:
        logger.error("Agent error: %s", e)
        raise HTTPException(status_code=500, detail=str(e))
```
## Step 2: Requirements File
Pin your dependencies explicitly. Floating versions are one of the most common causes of broken production deployments.
```
# requirements.txt
fastapi==0.115.0
uvicorn[standard]==0.30.6
langchain==0.3.0
langchain-openai==0.2.0
langchain-community==0.3.0
duckduckgo-search==6.2.13  # required at runtime by DuckDuckGoSearchRun
pydantic==2.9.0
python-dotenv==1.0.1
httpx==0.27.2
```
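To keep floating versions from sneaking back in, a check like the following can run in CI. This is a small sketch using only the standard library; the `unpinned` name and the exact-pin policy are illustrative assumptions, not part of any tool.

```python
# check_pins.py — flag requirement lines that are not exact '==' pins
import re

# Matches "name==version", allowing extras like "uvicorn[standard]"
PIN_RE = re.compile(r"^[A-Za-z0-9._\[\],-]+==[A-Za-z0-9.]+$")

def unpinned(requirements_text: str) -> list[str]:
    """Return requirement lines that float (no exact '==' pin)."""
    bad = []
    for line in requirements_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if not PIN_RE.match(line):
            bad.append(line)
    return bad
```

Run it against `requirements.txt` and fail the build if the returned list is non-empty.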
## Step 3: Write the Dockerfile
This is a multi-stage build. The builder stage installs all dependencies into a virtual environment. The runtime stage copies only the venv and your source code — no build tools, no cache, minimal attack surface.
```dockerfile
# Dockerfile
# ---- Stage 1: Builder ----
FROM python:3.12-slim AS builder
WORKDIR /build

# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc \
    && rm -rf /var/lib/apt/lists/*

# Create a virtual environment and put it on PATH
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Install Python dependencies
COPY requirements.txt .
RUN pip install --upgrade pip && \
    pip install --no-cache-dir -r requirements.txt

# ---- Stage 2: Runtime ----
FROM python:3.12-slim AS runtime
WORKDIR /app

# Copy the virtual environment from the builder stage
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Create a non-root user for security
RUN useradd --create-home --shell /bin/bash appuser

# Copy application source code, owned by the non-root user
COPY --chown=appuser:appuser app/ ./app/
USER appuser

# Expose the service port
EXPOSE 8080

# Health check — Docker will poll this every 30s
HEALTHCHECK --interval=30s --timeout=5s --start-period=15s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8080/health')"

# Start the server
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8080", "--workers", "2"]
```
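The inline HEALTHCHECK command above works, but pulling it into a small script makes room for a bounded timeout and an explicit status-code check. A sketch (the `healthcheck.py` file name is an assumption; the script would need to be copied into the image):

```python
# healthcheck.py — standalone probe with an explicit exit code and timeout
import sys
import urllib.request

def probe(url: str = "http://localhost:8080/health", timeout: float = 5.0) -> int:
    """Return 0 if the endpoint answers 200 within the timeout, else 1."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 0 if resp.status == 200 else 1
    except Exception:
        return 1

if __name__ == "__main__":
    sys.exit(probe())
```

The Dockerfile line would then become `HEALTHCHECK ... CMD python healthcheck.py`.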
## Step 4: Manage Secrets with Environment Variables
Never bake API keys into your image. Use a .env file locally and environment variables in production.
```bash
# .env.example — commit this to source control
OPENAI_API_KEY=your_openai_key_here
LOG_LEVEL=INFO
WORKERS=2
```

```bash
# .env — DO NOT commit this file. Add to .gitignore.
OPENAI_API_KEY=sk-proj-...
LOG_LEVEL=INFO
WORKERS=2
```
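A container that starts without its API key will only fail on the first request, which is harder to debug than failing at startup. A small fail-fast check, sketched here with only the standard library (python-dotenv handles the real parsing in the app; the function names are illustrative):

```python
# env_check.py — fail fast if required variables are missing
from pathlib import Path

REQUIRED = ("OPENAI_API_KEY",)

def parse_env_file(path: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and comments."""
    env: dict[str, str] = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

def missing_keys(env: dict[str, str]) -> list[str]:
    """Return required keys that are absent or empty."""
    return [k for k in REQUIRED if not env.get(k)]
```

Calling `missing_keys` during startup (and raising if the list is non-empty) turns a misconfigured deployment into an immediate, obvious crash.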
Add .env to your .gitignore:
```bash
echo ".env" >> .gitignore
```
Build and run with the env file:
```bash
docker build -t ai-agent:latest .
docker run --env-file .env -p 8080:8080 ai-agent:latest
```
## Step 5: Docker Compose for Local Development
When your agent needs supporting services (Redis for caching, Postgres for memory, Langfuse for tracing), Docker Compose is the right tool. See the Langfuse observability tutorial for how to add tracing.
```yaml
# docker-compose.yml
# (the top-level "version" key is obsolete in Compose v2 and omitted here)
services:
  agent:
    build: .
    ports:
      - "8080:8080"
    env_file:
      - .env
    environment:
      - LOG_LEVEL=INFO
    healthcheck:
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8080/health')"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 15s
    restart: unless-stopped
    depends_on:
      redis:
        condition: service_healthy

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  redis_data:
```
Start the stack:
```bash
docker compose up --build
```
Test the running agent:
```bash
curl -X POST http://localhost:8080/query \
  -H "Content-Type: application/json" \
  -d '{"question": "What is the capital of France?"}'
```
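The same request can be made from Python, which is handy for smoke tests. A minimal client sketch using only the standard library (httpx from requirements.txt would work equally well; the `ask_agent` name is illustrative):

```python
# client.py — minimal client for the /query endpoint
import json
import urllib.request

def ask_agent(base_url: str, question: str, timeout: float = 60.0) -> dict:
    """POST a question to /query and return the parsed JSON response."""
    payload = json.dumps({"question": question}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A successful call returns a dict with the `answer` and `iterations` fields defined by `QueryResponse`.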
## Step 6: Production Best Practices
Image hardening:
```bash
# Scan your image for vulnerabilities before pushing
docker scout cves ai-agent:latest
```
Layer caching for faster CI builds:
Keep COPY requirements.txt and pip install as separate layers before copying source code. This way requirements are only reinstalled when requirements.txt changes.
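With BuildKit (the default builder in current Docker releases), a cache mount can additionally persist pip's download cache across builds, so even a changed requirements.txt reinstalls faster. A sketch of the builder-stage variant; note that `--no-cache-dir` is dropped so pip can actually reuse the cache:

```dockerfile
# Builder-stage variant: persist pip's download cache across builds
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
```

The cache mount lives on the build host only and is never included in the final image, so image size is unaffected.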
Tagging strategy:
```bash
# Tag with git SHA for traceability
docker build -t ai-agent:$(git rev-parse --short HEAD) .
docker tag ai-agent:$(git rev-parse --short HEAD) ai-agent:latest
```
Resource limits in production:
```yaml
# In your docker-compose.yml (Compose v2 honors deploy.resources limits;
# Kubernetes uses its own resources syntax in the pod spec)
deploy:
  resources:
    limits:
      cpus: "1.0"
      memory: 1G
    reservations:
      memory: 512M
```
Before deploying to production, review AI agent security best practices and the general deployment guide.
## What's Next
- Add distributed tracing to your containerized agent with Langfuse observability
- Build a more capable agent with LangChain multi-tool patterns
- Add human approval gates to your deployment pipeline using human-in-the-loop patterns
- Explore the LangGraph multi-agent tutorial to orchestrate multiple containerized agents
- Review the AI agent testing guide before you push to production