LangGraph has rapidly become the default framework for production multi-agent systems within the Python ecosystem. Where LangChain's sequential chains feel constrained for complex agent workflows, LangGraph's graph model provides the expressiveness needed for stateful agents, cyclical reasoning loops, and multi-agent coordination. It's also the most opinionated about production concerns — checkpointing, streaming, and human oversight are first-class features, not afterthoughts.
This review provides a technical assessment of LangGraph's architecture, its strengths in production agentic systems, and the real complexity cost teams take on when adopting it.
What LangGraph Actually Is#
LangGraph is a Python (and TypeScript) library for building stateful, multi-actor agentic applications. It represents agent workflows as directed graphs where:
- Nodes are Python functions that receive state, perform computation, and return updated state
- Edges determine which node executes next — either unconditionally or based on conditional logic
- State is a typed dictionary shared across all nodes in the graph
- Checkpointers optionally persist state between executions (enabling resumable agents)
The graph model unlocks patterns that sequential chains cannot express: cycles (an agent loop that continues until a condition is met), conditional branching (different paths based on tool results), and parallel execution (fan-out to multiple agents, then fan-in to aggregate results).
Core Architecture: StateGraph#
```python
from typing import Annotated, TypedDict
import operator

from anthropic import Anthropic
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, END


# Define the state schema
class AgentState(TypedDict):
    messages: Annotated[list, operator.add]  # reducer: messages accumulate
    tool_calls_remaining: int
    final_answer: str


# Initialize
client = Anthropic()


def agent_node(state: AgentState) -> dict:
    """LLM reasoning node — decides next action."""
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system="You are a helpful research assistant with web search access.",
        messages=[{"role": m["role"], "content": m["content"]}
                  for m in state["messages"]],
    )
    return {
        "messages": [{"role": "assistant", "content": response.content[0].text}],
        "tool_calls_remaining": state["tool_calls_remaining"] - 1,
    }


def tool_node(state: AgentState) -> dict:
    """Execute tool calls from the agent's response."""
    # ... tool execution logic
    # Tool results go back as a user-role message: the Anthropic Messages API
    # accepts only "user" and "assistant" roles
    return {"messages": [{"role": "user", "content": "search results..."}]}


def should_continue(state: AgentState) -> str:
    """Conditional edge: continue looping or end?"""
    last_message = state["messages"][-1]
    if state["tool_calls_remaining"] <= 0:
        return "end"
    if "FINAL ANSWER:" in last_message.get("content", ""):
        return "end"
    return "tools"


# Build the graph
workflow = StateGraph(AgentState)
workflow.add_node("agent", agent_node)
workflow.add_node("tools", tool_node)
workflow.set_entry_point("agent")

# Conditional edge: agent → tools OR end
workflow.add_conditional_edges(
    "agent",
    should_continue,
    {"tools": "tools", "end": END},
)
workflow.add_edge("tools", "agent")  # Always return to agent after tools

# Compile with checkpointing for state persistence
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)

# Run the agent
config = {"configurable": {"thread_id": "session_001"}}
result = app.invoke(
    {"messages": [{"role": "user", "content": "Research AI agent trends in 2026"}],
     "tool_calls_remaining": 5,
     "final_answer": ""},
    config=config,
)
```
Human-in-the-Loop with Checkpointing#
LangGraph's checkpointing + interrupt mechanism is its strongest production feature:
```python
from langgraph.checkpoint.postgres import PostgresSaver

# Production checkpointer — persists to PostgreSQL
with PostgresSaver.from_conn_string("postgresql://user:pass@localhost/agentdb") as checkpointer:
    app = workflow.compile(
        checkpointer=checkpointer,
        interrupt_before=["high_risk_action"],  # Pause here for human review
    )

    # First run — will pause at the high_risk_action node
    config = {"configurable": {"thread_id": "task_001"}}
    state = app.invoke(initial_state, config=config)

    # State is paused — agent is waiting
    # Human can inspect state: app.get_state(config)

    # After human approval, resume from where it paused
    app.invoke(None, config=config)  # None = resume without new input
```
This pattern enables compliance-sensitive workflows where high-stakes actions (sending emails, making payments, executing code) require human sign-off before proceeding — without losing agent state between the pause and resume.
Multi-Agent Architecture#
LangGraph supports multi-agent patterns through subgraphs:
```python
from langgraph.graph import StateGraph, END

# Worker subgraph
def create_worker_graph(worker_role: str):
    worker = StateGraph(AgentState)
    worker.add_node("think", lambda s: think_node(s, role=worker_role))
    worker.add_node("act", action_node)
    worker.set_entry_point("think")
    worker.add_edge("think", "act")
    worker.add_edge("act", END)
    return worker.compile()

# Supervisor graph
supervisor = StateGraph(AgentState)
supervisor.add_node("plan", supervisor_plan_node)
supervisor.add_node("researcher", create_worker_graph("researcher"))  # Subgraph
supervisor.add_node("writer", create_worker_graph("writer"))          # Subgraph
supervisor.add_node("synthesize", synthesize_node)
supervisor.set_entry_point("plan")
supervisor.add_edge("plan", "researcher")
supervisor.add_edge("plan", "writer")  # Parallel execution
supervisor.add_edge("researcher", "synthesize")
supervisor.add_edge("writer", "synthesize")
supervisor.add_edge("synthesize", END)

multi_agent_app = supervisor.compile()
```
Streaming#
LangGraph's streaming is comprehensive — stream tokens, node outputs, or graph-level events:
```python
# Stream node-level updates (the default stream mode): one event per node execution
for event in app.stream(inputs, config):
    for key, value in event.items():
        print(f"Node: {key}")
        print(f"Output: {value}")

# Stream LLM tokens only
for chunk in app.stream(inputs, config, stream_mode="messages"):
    if chunk[1]["langgraph_node"] == "agent":
        print(chunk[0].content, end="", flush=True)
```
Pricing Breakdown#
| Component | Cost |
|---|---|
| LangGraph (open-source) | Free |
| LangSmith Developer | Free (5K traces/month) |
| LangSmith Teams | $39/seat/month |
| LangGraph Cloud | Included with Teams tier |
| LangGraph Platform Enterprise | Custom pricing |
For a team of 5 engineers using LangSmith Teams: $195/month for full observability + deployment infrastructure. This is the realistic production cost beyond LLM API fees.
Pros#
Expressiveness: The graph model handles workflows that no sequential framework can represent cleanly — agents that loop, branch on conditions, execute in parallel, and checkpoint at specific points. For complex production agents, this expressiveness eliminates painful workarounds.
Production-first design: Checkpointing, human-in-the-loop, streaming, and LangSmith integration are built in, not bolted on. Teams building production agents don't start from scratch on these concerns.
Multi-agent support: Subgraphs, supervisor patterns, and the Send API for fan-out make multi-agent architectures a natural fit rather than a framework stretch.
Cons#
Verbosity: A simple two-node agent in LangGraph requires significantly more code than the equivalent in LangChain or direct API usage. The state schema, graph construction, edge definitions, and compilation ceremony add boilerplate that smaller projects don't need.
Learning curve: Graph concepts, reducers, conditional edge functions, and state management require time to internalize. Teams without prior graph/state-machine experience often find the first few days frustrating.
LangSmith pricing at scale: The LangSmith Teams tier at $39/seat/month adds up for larger engineering teams. At 20 engineers, that's $780/month just for observability.
Who Should Use LangGraph#
Strong fit:
- Teams building production agents with complex control flow
- Applications requiring human-in-the-loop review of agent actions
- Long-running agents that must survive failures and resume
- Multi-agent architectures needing a well-designed coordination layer
- Teams already invested in the LangChain ecosystem
Poor fit:
- Simple single-turn agents (LangChain or direct API is less overhead)
- Rapid prototyping where verbosity slows iteration
- Non-Python teams (TypeScript support is less mature)
- Teams evaluating options before committing to LangChain ecosystem lock-in
Verdict#
LangGraph earns a 4.4/5 rating. For production multi-agent systems, it's the most complete solution available in the Python ecosystem. The checkpointing, streaming, multi-agent support, and LangSmith integration form a production stack that would take months to build independently.
The verbosity and learning curve are real costs that must be weighed against the production benefits. For complex, long-running agents in production, LangGraph pays for its complexity cost many times over. For smaller or simpler agents, the overhead may not be justified.
Related Resources#
- LangGraph in the AI Agent Directory
- LangGraph Multi-Agent Tutorial
- LangChain vs AutoGen Comparison
- LangChain Review — Foundation framework
- Supervisor Agent Glossary Term — Pattern LangGraph implements
Frequently Asked Questions#
What is LangGraph and how is it different from LangChain?#
LangChain provides abstractions for LLM chains, retrievers, and tools. LangGraph adds a graph model for stateful agent workflows with cycles, conditional branching, and parallel execution. Use LangChain for sequential pipelines; use LangGraph for complex stateful agents.
When should I use LangGraph vs CrewAI?#
Use LangGraph for precise workflow control, cyclical patterns, checkpointing, and human-in-the-loop. Use CrewAI for faster development with opinionated role-based design. LangGraph gives more control; CrewAI gives faster iteration.
Does LangGraph support human-in-the-loop workflows?#
Yes — interrupt_before pauses execution at specified nodes, and checkpointing persists exact graph state. Agents can pause indefinitely for human review and resume without losing state.
What is LangGraph Cloud and do I need it?#
LangGraph Cloud provides managed deployment infrastructure. It's not required — LangGraph is open-source. Teams without DevOps capacity find it valuable; teams with infrastructure expertise can self-host.