Build an AI Agent with the OpenAI Agents SDK
OpenAI's Agents SDK is the company's official framework for building production-grade AI agents. Released in early 2025, it brings together tools, handoffs, guardrails, and tracing into a single cohesive Python library — backed directly by OpenAI's own engineering team.
In this tutorial you will build a fully functional research assistant agent that can search the web, summarize findings, and hand off to a specialized writer agent to produce final reports. By the end you will understand the core primitives of the SDK: Agents, Tools, Handoffs, and the Runner that executes them.
What You'll Learn#
- How to install and configure the OpenAI Agents SDK
- How to define tools using the @function_tool decorator
- How to create multiple agents with distinct roles and system prompts
- How to orchestrate handoffs between agents using the handoff() primitive
- How to use the built-in tracing system to debug and inspect agent runs
Prerequisites#
- Python 3.10 or higher installed
- An OpenAI API key (set as the OPENAI_API_KEY environment variable)
- Basic understanding of AI agents and tool use
- Familiarity with async Python (we use asyncio)
Step 1: Project Setup#
Start by creating a virtual environment and installing the SDK. OpenAI recommends using uv for fast dependency management, but plain pip works equally well.
mkdir openai-agent-demo && cd openai-agent-demo
python -m venv .venv && source .venv/bin/activate
# Install the Agents SDK and supporting libraries
pip install openai-agents httpx python-dotenv
Create a .env file at the project root:
OPENAI_API_KEY=sk-...your-key-here...
Then create your main script file agent.py. At the top, load the environment variables before importing anything else:
import os
from dotenv import load_dotenv
load_dotenv() # Load OPENAI_API_KEY from .env
Step 2: Define Your Tools#
Tools give agents the ability to take actions beyond text generation. In the OpenAI Agents SDK, you define tools by decorating regular Python functions with @function_tool. The SDK automatically extracts the docstring and type annotations to build the JSON schema that is sent to the model.
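To make that concrete, here is a simplified, standard-library-only sketch of the kind of schema the decorator derives from type hints. This is an illustration, not the SDK's actual implementation, which also parses per-argument docstring descriptions and supports many more types:

```python
import inspect
import json
from typing import get_type_hints

def build_schema(func) -> dict:
    """Build a minimal JSON schema from a function's type hints.
    Simplified illustration of what @function_tool does internally."""
    type_map = {str: "string", int: "integer", float: "number", bool: "boolean"}
    hints = get_type_hints(func)
    hints.pop("return", None)  # only parameters go into the schema
    properties = {name: {"type": type_map.get(tp, "string")} for name, tp in hints.items()}
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": list(properties),
        },
    }

def search_web(query: str) -> str:
    """Search the web for recent information about a topic."""
    return ""

print(json.dumps(build_schema(search_web), indent=2))
```

The model never sees your Python code, only a schema like this, which is why clear names, types, and docstrings matter so much.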
import httpx
from agents import function_tool
@function_tool
def search_web(query: str) -> str:
    """Search the web for recent information about a topic.

    Args:
        query: The search query string to look up.

    Returns:
        A plain-text summary of the top search results.
    """
    # In a real implementation, call a search API (Tavily, Brave, etc.)
    # For demonstration we use DuckDuckGo's free Instant Answer API
    response = httpx.get(
        "https://api.duckduckgo.com/",
        params={"q": query, "format": "json", "no_html": 1},
        timeout=10,
    )
    data = response.json()
    abstract = data.get("AbstractText", "")
    if abstract:
        return abstract
    return f"No direct summary found for: {query}. Try refining the query."
@function_tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression.

    Args:
        expression: A math expression like '(42 * 3) / 7 + 100'.

    Returns:
        The numeric result as a string.
    """
    try:
        # Strip builtins to limit what the expression can reach.
        # Note: this reduces, but does not eliminate, the risks of eval.
        result = eval(expression, {"__builtins__": {}}, {})
        return str(result)
    except Exception as e:
        return f"Error evaluating expression: {e}"
Notice that the docstrings are important: the model reads them to understand when and how to call each tool.
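One caveat on the calculate tool: eval with stripped builtins is not a true sandbox, since crafted expressions can still reach dangerous objects. A stricter approach is to parse the expression with the stdlib ast module and permit only arithmetic nodes. The sketch below illustrates the idea:

```python
import ast
import operator

# Whitelist of permitted binary and unary operators
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.Mod: operator.mod,
    ast.USub: operator.neg, ast.UAdd: operator.pos,
}

def safe_eval(expression: str) -> float:
    """Evaluate an arithmetic expression by walking its AST.
    Anything that is not a number or a whitelisted operator is rejected."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Disallowed syntax: {type(node).__name__}")
    return _eval(ast.parse(expression, mode="eval"))

print(safe_eval("(42 * 3) / 7 + 100"))  # 118.0
```

Function calls, attribute access, and name lookups all raise ValueError, so expressions like __import__('os') are rejected outright.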
Step 3: Create Specialized Agents#
The power of the Agents SDK is composing multiple agents with distinct roles. Here we build two: a ResearchAgent that gathers information, and a WriterAgent that transforms research into polished prose.
from agents import Agent
# Writer agent — receives research notes and formats them
writer_agent = Agent(
    name="WriterAgent",
    instructions="""You are an expert technical writer who specializes in clear,
    engaging explanations of complex topics. When you receive research notes,
    you transform them into well-structured, reader-friendly content.

    Structure your output with:
    - An engaging introduction
    - Clear numbered or bulleted sections
    - A concise conclusion with key takeaways

    Write in a professional but approachable tone. Avoid jargon unless necessary.""",
    model="gpt-4o",
)
# Research agent — the primary agent with tools and handoff capability
research_agent = Agent(
    name="ResearchAgent",
    instructions="""You are a research assistant. Your job is to:
    1. Search for accurate, up-to-date information using the search_web tool
    2. Use the calculate tool when numerical analysis is needed
    3. Compile comprehensive research notes
    4. Hand off to the WriterAgent when you have gathered enough information

    Always verify facts by searching multiple angles.
    When your research is complete, transfer to WriterAgent.""",
    model="gpt-4o",
    tools=[search_web, calculate],
    handoffs=[writer_agent],  # Declare which agents this agent can hand off to
)
The handoffs parameter tells the SDK that research_agent is allowed to transfer control to writer_agent. The SDK adds a special transfer_to_writeragent tool to the research agent's tool list automatically.
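The generated tool name follows from the agent's name. A rough approximation of the convention in plain Python (this helper is hypothetical, and the SDK's actual transform may differ in edge cases):

```python
import re

def default_handoff_tool_name(agent_name: str) -> str:
    """Approximate the SDK's default handoff tool naming:
    replace whitespace with underscores and lowercase the result.
    Hypothetical sketch, not the SDK's actual code."""
    return "transfer_to_" + re.sub(r"\s+", "_", agent_name).lower()

print(default_handoff_tool_name("WriterAgent"))   # transfer_to_writeragent
print(default_handoff_tool_name("Writer Agent"))  # transfer_to_writer_agent
```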
Step 4: Run the Agent Pipeline#
The Runner class manages the execution loop. It handles the agentic loop — calling the model, executing tools, processing results, and deciding when to stop or hand off.
import asyncio
from agents import Runner
async def main():
    # The Runner manages the full agentic conversation
    result = await Runner.run(
        starting_agent=research_agent,
        input="Research the current state of AI agent frameworks in 2025 and 2026. "
        "Include information about OpenAI Agents SDK, LangChain, and CrewAI. "
        "Then write a concise 300-word report on the topic.",
    )

    # The final output is the last agent's text response
    print("=== Final Report ===")
    print(result.final_output)

    # You can also inspect which agents ran and what items were produced
    print("\n=== Agent Run Summary ===")
    for item in result.new_items:
        print(f"  [{item.type}] from agent: {item.agent.name}")

if __name__ == "__main__":
    asyncio.run(main())
Run it:
python agent.py
Expected output will show the ResearchAgent calling search_web multiple times, then handing off to WriterAgent, which produces the final formatted report.
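Conceptually, the loop the Runner executes looks like the following. This is a heavily simplified, synchronous sketch with a stubbed model, not the SDK's actual implementation:

```python
def run_agent_loop(agent, user_input, call_model, max_turns=10):
    """Simplified agentic loop: call the model, execute any requested
    tool, feed the result back, and stop on a final text answer.
    `call_model` stands in for the LLM API; it returns either
    ("tool", tool_name, args) or ("final", text)."""
    messages = [("user", user_input)]
    for _ in range(max_turns):
        action = call_model(agent, messages)
        if action[0] == "final":
            return action[1]          # model produced its answer, stop
        _, tool_name, args = action
        result = agent["tools"][tool_name](**args)  # execute the tool
        messages.append(("tool", result))           # feed the result back
    raise RuntimeError("max_turns exceeded")

# Demo with a stubbed model that searches once, then answers
agent = {"tools": {"search_web": lambda query: f"notes about {query}"}}

def stub_model(agent, messages):
    if len(messages) == 1:
        return ("tool", "search_web", {"query": "AI agents"})
    return ("final", f"Report based on: {messages[-1][1]}")

print(run_agent_loop(agent, "Research AI agents", stub_model))
# Report based on: notes about AI agents
```

The real Runner adds handoffs, guardrails, streaming, and tracing on top of this basic call-tool-repeat cycle, but the control flow is the same shape.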
Step 5: Enabling Tracing#
One of the most useful features of the Agents SDK is built-in tracing. Every run is automatically recorded to OpenAI's trace storage, accessible in the OpenAI Platform dashboard. You can inspect every tool call, model response, and handoff in a visual timeline.
from agents import Runner, RunConfig, trace
async def main_with_tracing():
    # Wrap your run in a trace context to group related runs
    with trace("research-and-write-pipeline"):
        result = await Runner.run(
            starting_agent=research_agent,
            input="What are the key differences between ReAct and Plan-and-Execute agent patterns?",
            run_config=RunConfig(
                workflow_name="Research Report Generator",
                trace_metadata={"user_id": "demo-user", "session": "tutorial-1"},
            ),
        )
        print(result.final_output)

asyncio.run(main_with_tracing())
After running, visit platform.openai.com/traces to see a complete timeline of your agent's execution — including latency for each step, token usage, and the full conversation history.
Step 6: Adding Guardrails#
Guardrails let you validate inputs and outputs to prevent misuse or unexpected behavior. Input guardrails run before the agent processes user messages; output guardrails run before responses are returned.
from agents import Agent, input_guardrail, output_guardrail, GuardrailFunctionOutput
@input_guardrail
async def block_harmful_requests(ctx, agent, input_text):
    """Block requests that ask for harmful content."""
    harmful_keywords = ["hack", "bypass", "jailbreak", "exploit"]
    # input_text may be a plain string or a list of input items
    if any(kw in str(input_text).lower() for kw in harmful_keywords):
        return GuardrailFunctionOutput(
            output_info={"blocked": True},
            tripwire_triggered=True,  # This stops the agent run
        )
    return GuardrailFunctionOutput(output_info={"blocked": False}, tripwire_triggered=False)
safe_research_agent = Agent(
    name="SafeResearchAgent",
    instructions="You are a helpful research assistant.",
    model="gpt-4o",
    tools=[search_web],
    input_guardrails=[block_harmful_requests],
)
What's Next#
You now have a solid foundation for building production agents with the OpenAI Agents SDK. Here are the best next steps:
- Compare frameworks: Read the OpenAI Agents SDK vs LangChain comparison to understand when to choose each
- Explore more tools: See the OpenAI Agents SDK directory entry for a full capabilities overview
- Multi-agent orchestration: Learn how CrewAI handles multi-agent collaboration from a different angle
- LangChain integration: If you need retrieval-augmented generation, explore building an agent with LangChain
- Understand the patterns: Read about ReAct reasoning to understand the decision loop your agents are using
The OpenAI Agents SDK is designed to grow with your use case — from single-agent prototypes to enterprise-scale multi-agent pipelines with thousands of concurrent runs.
Frequently Asked Questions#
Does the OpenAI Agents SDK work with models other than GPT-4?
Yes. You can use any model supported by the OpenAI API, including gpt-4o-mini for cost efficiency and o1 or o3 for reasoning-intensive tasks. You can mix models across agents in the same pipeline.
How does the Agents SDK handle errors in tool calls?
By default, if a tool raises an exception, the SDK passes an error message back to the model as the tool result so the agent can attempt recovery. You can customize this behavior via the failure_error_function parameter of the @function_tool decorator.
Is the tracing system optional?
Tracing is enabled by default but only stores data if you have an active OpenAI API key. You can disable it globally by setting the OPENAI_AGENTS_DISABLE_TRACING=1 environment variable, or per run by passing RunConfig(tracing_disabled=True).
Can I run the Agents SDK locally without calling OpenAI APIs?
You can point the SDK at any OpenAI-compatible API endpoint (like a local Ollama server) by setting the base_url parameter in the OpenAIChatCompletionsModel configuration. However, function calling reliability depends on model capability.
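As a sketch, pointing an agent at a local Ollama server could look like this (the endpoint, API key, and model name are placeholders for your local setup):

```python
from openai import AsyncOpenAI
from agents import Agent, OpenAIChatCompletionsModel

# Placeholder endpoint and model — adjust for your local server
local_client = AsyncOpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

local_agent = Agent(
    name="LocalResearchAgent",
    instructions="You are a research assistant.",
    model=OpenAIChatCompletionsModel(model="llama3.1", openai_client=local_client),
)
```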
What is the difference between a tool and a handoff?
A tool returns a result back to the same agent for continued processing. A handoff transfers the entire conversation context to a new agent, which then runs independently until it returns a final response. Handoffs are one-directional by default unless you configure the receiving agent with a return handoff.