Build an AI Agent with the OpenAI Agents SDK
OpenAI's Agents SDK is the company's official framework for building production-grade AI agents. Released in early 2025, it brings together tools, handoffs, guardrails, and tracing into a single cohesive Python library — backed directly by OpenAI's own engineering team.
In this tutorial you will build a fully functional research assistant agent that can search the web, summarize findings, and hand off to a specialized writer agent to produce final reports. By the end you will understand the core primitives of the SDK: Agents, Tools, Handoffs, and the Runner that executes them.
What You'll Learn#
- How to install and configure the OpenAI Agents SDK
- How to define tools using the @function_tool decorator
- How to create multiple agents with distinct roles and system prompts
- How to orchestrate handoffs between agents using the handoff() primitive
- How to use the built-in tracing system to debug and inspect agent runs
Prerequisites#
- Python 3.10 or higher installed
- An OpenAI API key (set as the OPENAI_API_KEY environment variable)
- Basic understanding of AI agents and tool use
- Familiarity with async Python (we use asyncio)
Step 1: Project Setup#
Start by creating a virtual environment and installing the SDK. OpenAI recommends using uv for fast dependency management, but plain pip works equally well.
mkdir openai-agent-demo && cd openai-agent-demo
python -m venv .venv && source .venv/bin/activate
# Install the Agents SDK and supporting libraries
pip install openai-agents httpx python-dotenv
Create a .env file at the project root:
OPENAI_API_KEY=sk-...your-key-here...
Then create your main script file agent.py. At the top, load the environment variables before importing anything else:
import os
from dotenv import load_dotenv
load_dotenv() # Load OPENAI_API_KEY from .env
Step 2: Define Your Tools#
Tools give agents the ability to take actions beyond text generation. In the OpenAI Agents SDK, you define tools by decorating regular Python functions with @function_tool. The SDK automatically extracts the docstring and type annotations to build the JSON schema that is sent to the model.
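To make that concrete, here is a simplified, standard-library-only sketch of the kind of schema the decorator derives from type hints. This is an illustration, not the SDK's actual implementation, which also parses per-argument docstring descriptions and supports many more types:

```python
import inspect
import json
from typing import get_type_hints

def build_schema(func) -> dict:
    """Build a minimal JSON schema from a function's type hints.
    Simplified illustration of what @function_tool does internally."""
    type_map = {str: "string", int: "integer", float: "number", bool: "boolean"}
    hints = get_type_hints(func)
    hints.pop("return", None)  # only parameters go into the schema
    properties = {name: {"type": type_map.get(tp, "string")} for name, tp in hints.items()}
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": list(properties),
        },
    }

def search_web(query: str) -> str:
    """Search the web for recent information about a topic."""
    return ""

print(json.dumps(build_schema(search_web), indent=2))
```

The model never sees your Python code, only a schema like this, which is why clear names, types, and docstrings matter so much.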
import httpx
from agents import function_tool
@function_tool
def search_web(query: str) -> str:
    """Search the web for recent information about a topic.

    Args:
        query: The search query string to look up.

    Returns:
        A plain-text summary of the top search results.
    """
    # In a real implementation, call a search API (Tavily, Brave, etc.)
    # For demonstration we use DuckDuckGo's free Instant Answer API
    response = httpx.get(
        "https://api.duckduckgo.com/",
        params={"q": query, "format": "json", "no_html": 1},
        timeout=10,
    )
    data = response.json()
    abstract = data.get("AbstractText", "")
    if abstract:
        return abstract
    return f"No direct summary found for: {query}. Try refining the query."
@function_tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression.

    Args:
        expression: A math expression like '(42 * 3) / 7 + 100'.

    Returns:
        The numeric result as a string.
    """
    try:
        # Strip builtins to limit what the expression can reach.
        # Note: this reduces, but does not eliminate, the risks of eval.
        result = eval(expression, {"__builtins__": {}}, {})
        return str(result)
    except Exception as e:
        return f"Error evaluating expression: {e}"
Notice that the docstrings are important: the model reads them to understand when and how to call each tool.
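One caveat on the calculate tool: eval with stripped builtins is not a true sandbox, since crafted expressions can still reach dangerous objects. A stricter approach is to parse the expression with the stdlib ast module and permit only arithmetic nodes. The sketch below illustrates the idea:

```python
import ast
import operator

# Whitelist of permitted binary and unary operators
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.Mod: operator.mod,
    ast.USub: operator.neg, ast.UAdd: operator.pos,
}

def safe_eval(expression: str) -> float:
    """Evaluate an arithmetic expression by walking its AST.
    Anything that is not a number or a whitelisted operator is rejected."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Disallowed syntax: {type(node).__name__}")
    return _eval(ast.parse(expression, mode="eval"))

print(safe_eval("(42 * 3) / 7 + 100"))  # 118.0
```

Function calls, attribute access, and name lookups all raise ValueError, so expressions like __import__('os') are rejected outright.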
Step 3: Create Specialized Agents#
The power of the Agents SDK is composing multiple agents with distinct roles. Here we build two: a ResearchAgent that gathers information, and a WriterAgent that transforms research into polished prose.
from agents import Agent
# Writer agent — receives research notes and formats them
writer_agent = Agent(
    name="WriterAgent",
    instructions="""You are an expert technical writer who specializes in clear,
    engaging explanations of complex topics. When you receive research notes,
    you transform them into well-structured, reader-friendly content.

    Structure your output with:
    - An engaging introduction
    - Clear numbered or bulleted sections
    - A concise conclusion with key takeaways

    Write in a professional but approachable tone. Avoid jargon unless necessary.""",
    model="gpt-4o",
)
# Research agent — the primary agent with tools and handoff capability
research_agent = Agent(
    name="ResearchAgent",
    instructions="""You are a research assistant. Your job is to:
    1. Search for accurate, up-to-date information using the search_web tool
    2. Use the calculate tool when numerical analysis is needed
    3. Compile comprehensive research notes
    4. Hand off to the WriterAgent when you have gathered enough information

    Always verify facts by searching multiple angles.
    When your research is complete, transfer to WriterAgent.""",
    model="gpt-4o",
    tools=[search_web, calculate],
    handoffs=[writer_agent],  # Declare which agents this agent can hand off to
)
The handoffs parameter tells the SDK that research_agent is allowed to transfer control to writer_agent. The SDK adds a special transfer_to_writeragent tool to the research agent's tool list automatically.
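The generated tool name follows from the agent's name. A rough approximation of the convention in plain Python (this helper is hypothetical, and the SDK's actual transform may differ in edge cases):

```python
import re

def default_handoff_tool_name(agent_name: str) -> str:
    """Approximate the SDK's default handoff tool naming:
    replace whitespace with underscores and lowercase the result.
    Hypothetical sketch, not the SDK's actual code."""
    return "transfer_to_" + re.sub(r"\s+", "_", agent_name).lower()

print(default_handoff_tool_name("WriterAgent"))   # transfer_to_writeragent
print(default_handoff_tool_name("Writer Agent"))  # transfer_to_writer_agent
```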
Step 4: Run the Agent Pipeline#
The Runner class manages the execution loop. It handles the agentic loop — calling the model, executing tools, processing results, and deciding when to stop or hand off.
import asyncio
from agents import Runner
async def main():
    # The Runner manages the full agentic conversation
    result = await Runner.run(
        starting_agent=research_agent,
        input="Research the current state of AI agent frameworks in 2025 and 2026. "
        "Include information about OpenAI Agents SDK, LangChain, and CrewAI. "
        "Then write a concise 300-word report on the topic.",
    )

    # The final output is the last agent's text response
    print("=== Final Report ===")
    print(result.final_output)

    # You can also inspect which agents ran and what items were produced
    print("\n=== Agent Run Summary ===")
    for item in result.new_items:
        print(f"  [{item.type}] from agent: {item.agent.name}")

if __name__ == "__main__":
    asyncio.run(main())
Run it:
python agent.py
Expected output will show the ResearchAgent calling search_web multiple times, then handing off to WriterAgent, which produces the final formatted report.
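Conceptually, the loop the Runner executes looks like the following. This is a heavily simplified, synchronous sketch with a stubbed model, not the SDK's actual implementation:

```python
def run_agent_loop(agent, user_input, call_model, max_turns=10):
    """Simplified agentic loop: call the model, execute any requested
    tool, feed the result back, and stop on a final text answer.
    `call_model` stands in for the LLM API; it returns either
    ("tool", tool_name, args) or ("final", text)."""
    messages = [("user", user_input)]
    for _ in range(max_turns):
        action = call_model(agent, messages)
        if action[0] == "final":
            return action[1]          # model produced its answer, stop
        _, tool_name, args = action
        result = agent["tools"][tool_name](**args)  # execute the tool
        messages.append(("tool", result))           # feed the result back
    raise RuntimeError("max_turns exceeded")

# Demo with a stubbed model that searches once, then answers
agent = {"tools": {"search_web": lambda query: f"notes about {query}"}}

def stub_model(agent, messages):
    if len(messages) == 1:
        return ("tool", "search_web", {"query": "AI agents"})
    return ("final", f"Report based on: {messages[-1][1]}")

print(run_agent_loop(agent, "Research AI agents", stub_model))
# Report based on: notes about AI agents
```

The real Runner adds handoffs, guardrails, streaming, and tracing on top of this basic call-tool-repeat cycle, but the control flow is the same shape.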
Step 5: Enabling Tracing#
One of the most useful features of the Agents SDK is built-in tracing. Every run is automatically recorded to OpenAI's trace storage, accessible in the OpenAI Platform dashboard. You can inspect every tool call, model response, and handoff in a visual timeline.
from agents import Runner, RunConfig, trace
async def main_with_tracing():
    # Wrap your run in a trace context to group related runs
    with trace("research-and-write-pipeline"):
        result = await Runner.run(
            starting_agent=research_agent,
            input="What are the key differences between ReAct and Plan-and-Execute agent patterns?",
            run_config=RunConfig(
                workflow_name="Research Report Generator",
                trace_metadata={"user_id": "demo-user", "session": "tutorial-1"},
            ),
        )
        print(result.final_output)

asyncio.run(main_with_tracing())
After running, visit platform.openai.com/traces to see a complete timeline of your agent's execution — including latency for each step, token usage, and the full conversation history.
Step 6: Adding Guardrails#
Guardrails let you validate inputs and outputs to prevent misuse or unexpected behavior. Input guardrails run before the agent processes user messages; output guardrails run before responses are returned.
from agents import Agent, input_guardrail, output_guardrail, GuardrailFunctionOutput
@input_guardrail
async def block_harmful_requests(ctx, agent, input_text):
    """Block requests that ask for harmful content."""
    harmful_keywords = ["hack", "bypass", "jailbreak", "exploit"]
    # input_text may be a plain string or a list of input items
    if any(kw in str(input_text).lower() for kw in harmful_keywords):
        return GuardrailFunctionOutput(
            output_info={"blocked": True},
            tripwire_triggered=True,  # This stops the agent run
        )
    return GuardrailFunctionOutput(output_info={"blocked": False}, tripwire_triggered=False)
safe_research_agent = Agent(
    name="SafeResearchAgent",
    instructions="You are a helpful research assistant.",
    model="gpt-4o",
    tools=[search_web],
    input_guardrails=[block_harmful_requests],
)
What's Next#
You now have a solid foundation for building production agents with the OpenAI Agents SDK. Here are the best next steps:
- Compare frameworks: Read the OpenAI Agents SDK vs LangChain comparison to understand when to choose each
- Explore more tools: See the OpenAI Agents SDK directory entry for a full capabilities overview
- Multi-agent orchestration: Learn how CrewAI handles multi-agent collaboration from a different angle
- LangChain integration: If you need retrieval-augmented generation, explore building an agent with LangChain
- Understand the patterns: Read about ReAct reasoning to understand the decision loop your agents are using
The OpenAI Agents SDK is designed to grow with your use case — from single-agent prototypes to enterprise-scale multi-agent pipelines with thousands of concurrent runs.
Frequently Asked Questions#
Does the OpenAI Agents SDK work with models other than GPT-4?
Yes. You can use any model supported by the OpenAI API, including gpt-4o-mini for cost efficiency and o1 or o3 for reasoning-intensive tasks. You can mix models across agents in the same pipeline.
How does the Agents SDK handle errors in tool calls?
By default, if a tool raises an exception, the SDK passes an error message back to the model as the tool result so the agent can attempt recovery. You can customize this behavior via the failure_error_function parameter of the @function_tool decorator.
Is the tracing system optional?
Tracing is enabled by default but only stores data if you have an active OpenAI API key. You can disable it globally by setting the OPENAI_AGENTS_DISABLE_TRACING=1 environment variable, or per run by passing RunConfig(tracing_disabled=True).
Can I run the Agents SDK locally without calling OpenAI APIs?
You can point the SDK at any OpenAI-compatible API endpoint (like a local Ollama server) by setting the base_url parameter in the OpenAIChatCompletionsModel configuration. However, function calling reliability depends on model capability.
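As a sketch, pointing an agent at a local Ollama server could look like this (the endpoint, API key, and model name are placeholders for your local setup):

```python
from openai import AsyncOpenAI
from agents import Agent, OpenAIChatCompletionsModel

# Placeholder endpoint and model — adjust for your local server
local_client = AsyncOpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

local_agent = Agent(
    name="LocalResearchAgent",
    instructions="You are a research assistant.",
    model=OpenAIChatCompletionsModel(model="llama3.1", openai_client=local_client),
)
```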
What is the difference between a tool and a handoff?
A tool returns a result back to the same agent for continued processing. A handoff transfers the entire conversation context to a new agent, which then runs independently until it returns a final response. Handoffs are one-directional by default unless you configure the receiving agent with a return handoff.