# What Is Agent Self-Reflection?
## Quick Definition
Agent self-reflection is the ability of an AI agent to evaluate and critique its own outputs or reasoning, identify weaknesses or errors, and revise before producing a final result. Rather than accepting the first output from an LLM call, a self-reflective agent runs an internal review loop — generating a draft, critiquing it against explicit criteria, and refining based on its own assessment. This reduces errors that would otherwise reach users or propagate through downstream workflow steps.
Browse all AI agent terms in the AI Agent Glossary. For multi-branch reasoning that explores alternatives before committing, see Tree of Thought. For action-based reasoning with tool use, see ReAct (Reasoning + Acting).
## Why Self-Reflection Matters
A single LLM call produces output probabilistically — the model's first attempt is often good but not optimal. Common failure modes self-reflection catches:
- Factual errors: The initial response contains an incorrect date, statistic, or technical claim
- Incomplete coverage: The answer misses important aspects of the question
- Logical inconsistency: The reasoning contains a contradiction that is not obvious from the prompt
- Code bugs: Generated code has an off-by-one error or edge case not handled
- Format violations: The output does not match the requested schema or structure
Without self-reflection, these errors reach the user or become inputs to the next step in a multi-agent pipeline, where they compound. With self-reflection, the agent catches and corrects them before the output leaves its control.
## Self-Reflection Patterns
### Pattern 1: Simple Self-Critique
The most basic pattern: the agent generates an output, then critiques it against explicit criteria:
```python
from anthropic import Anthropic

client = Anthropic()

def generate_with_reflection(task: str, criteria: list[str], max_revisions: int = 2) -> str:
    """Generate output, critique against criteria, and revise."""
    # Step 1: Generate initial draft
    draft_response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=1500,
        messages=[{"role": "user", "content": task}],
    )
    draft = draft_response.content[0].text

    for revision in range(max_revisions):
        # Step 2: Critique the draft
        criteria_text = "\n".join(f"- {c}" for c in criteria)
        critique_response = client.messages.create(
            model="claude-opus-4-6",
            max_tokens=800,
            messages=[{
                "role": "user",
                "content": f"""Review this output against the following criteria:
{criteria_text}

Output to review:
{draft}

Identify specific issues. If the output satisfies all criteria, respond with "APPROVED".
Otherwise, list the specific problems found.""",
            }],
        )
        critique = critique_response.content[0].text
        if "APPROVED" in critique.upper():
            break  # Output meets criteria — stop revising

        # Step 3: Revise based on critique
        revision_response = client.messages.create(
            model="claude-opus-4-6",
            max_tokens=1500,
            messages=[{
                "role": "user",
                "content": f"""Original task: {task}

Your previous output:
{draft}

Critique of your output:
{critique}

Please revise your output to address all the identified issues.""",
            }],
        )
        draft = revision_response.content[0].text

    return draft

# Usage
result = generate_with_reflection(
    task="Write a Python function that validates email addresses",
    criteria=[
        "Handles edge cases (empty string, None input)",
        "Includes docstring explaining behavior",
        "Does not use external libraries",
        "Returns True/False consistently",
    ],
)
```
### Pattern 2: Reflexion (Memory-Based Reflection)
The Reflexion framework (Shinn et al., 2023) extends self-critique by storing reflections in memory to guide future attempts:
```python
from anthropic import Anthropic

class ReflexionAgent:
    def __init__(self, task: str):
        self.task = task
        self.reflections = []  # Memory of past failures and learnings
        self.client = Anthropic()

    def run(self, max_attempts: int = 3) -> str:
        for attempt in range(max_attempts):
            # Generate with context from past reflections
            reflection_context = "\n".join(self.reflections) if self.reflections else "No prior attempts."
            response = self.client.messages.create(
                model="claude-opus-4-6",
                max_tokens=2000,
                messages=[{
                    "role": "user",
                    "content": f"""Task: {self.task}

Lessons from previous attempts:
{reflection_context}

Complete the task, incorporating lessons learned.""",
                }],
            )
            output = response.content[0].text

            # Evaluate the output
            is_success, feedback = self._evaluate(output)
            if is_success:
                return output

            # Generate reflection for memory
            reflection = self._reflect(output, feedback)
            self.reflections.append(f"Attempt {attempt + 1}: {reflection}")
        return output  # Budget exhausted: return the last attempt

    def _evaluate(self, output: str) -> tuple[bool, str]:
        """Evaluate output quality. Returns (success, feedback)."""
        # Placeholder: supply task-specific evaluation logic in a subclass
        raise NotImplementedError

    def _reflect(self, output: str, feedback: str) -> str:
        """Generate a concise lesson from the failed attempt."""
        response = self.client.messages.create(
            model="claude-opus-4-6",
            max_tokens=300,
            messages=[{
                "role": "user",
                "content": f"""Given this failed attempt and feedback, write one concise lesson to remember:

Output: {output}

Feedback: {feedback}

Lesson (1-2 sentences):""",
            }],
        )
        return response.content[0].text
```
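The `_evaluate` hook is left abstract above; in practice it can be an LLM-as-judge call or, where possible, a deterministic check. A minimal sketch of a deterministic evaluator — the rules here are hypothetical, loosely matched to the email-validator criteria from Pattern 1:

```python
def evaluate_code_output(output: str) -> tuple[bool, str]:
    """Toy evaluator: pass only if the draft has a docstring and handles None."""
    issues = []
    if '"""' not in output and "'''" not in output:
        issues.append("missing docstring")
    if "None" not in output:
        issues.append("no explicit None handling")
    return (not issues, "; ".join(issues) or "all checks passed")
```

A subclass of `ReflexionAgent` would override `_evaluate` with logic like this. Deterministic checks make the success signal reliable; LLM-based checks cover criteria that cannot be expressed as rules.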
### Pattern 3: Constitutional Self-Critique
Evaluate outputs against a set of principles rather than task-specific criteria:
```python
AGENT_CONSTITUTION = [
    "Be factually accurate — do not state things you are uncertain about as fact",
    "Be helpful — provide actionable, specific information rather than vague advice",
    "Be honest — acknowledge limitations and uncertainties explicitly",
    "Be safe — do not provide information that could cause harm",
]

def constitutional_check(output: str) -> tuple[bool, str]:
    """Check output against constitutional principles."""
    principles_text = "\n".join(f"{i+1}. {p}" for i, p in enumerate(AGENT_CONSTITUTION))
    # Reuses the module-level `client` from Pattern 1
    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=500,
        messages=[{
            "role": "user",
            "content": f"""Evaluate whether this output violates any of the following principles:

{principles_text}

Output to evaluate:
{output}

Does this output violate any principles? If yes, identify which ones and explain.
If no violations, respond with "COMPLIANT".""",
        }],
    )
    critique = response.content[0].text
    return "COMPLIANT" in critique.upper(), critique
```
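One way to wire such a check into an agent loop is a small guard that keeps revising until the check passes. A sketch, assuming `check` is any `(output) -> (compliant, critique)` callable such as `constitutional_check` above, and `revise` is any function that produces a new draft from a critique:

```python
from typing import Callable

def guard_output(
    output: str,
    check: Callable[[str], tuple[bool, str]],
    revise: Callable[[str, str], str],
    max_revisions: int = 2,
) -> str:
    """Re-check and revise an output until it is compliant or the budget runs out."""
    for _ in range(max_revisions):
        compliant, critique = check(output)
        if compliant:
            break
        output = revise(output, critique)
    return output
```

Note that once the budget is exhausted, the last revision is returned without a final check; callers that need a hard guarantee should run `check` once more and escalate on failure.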
## When Self-Reflection Helps vs Hurts
### Self-reflection helps
- Code generation: Catching bugs, missing error handling, and edge cases before running
- Factual research: Identifying unsupported claims or logical gaps
- Structured data extraction: Ensuring outputs match required schemas
- Multi-step agent pipelines: Preventing errors from compounding in downstream steps
### Self-reflection may hurt
- Creative tasks: Critique criteria applied to creative writing can make outputs more generic and less distinctive
- Conversational responses: Over-refinement produces stilted, less natural dialogue
- Time-sensitive tasks: Each critique-revise cycle adds latency; for real-time applications this may be unacceptable
- Cost-sensitive contexts: Multiple LLM calls for each output increases per-query costs proportionally
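The cost point is easy to quantify. In a generate-critique-revise loop like Pattern 1, each revision cycle adds one critique call and one revision call on top of the initial generation, so the worst-case call count grows linearly:

```python
def max_llm_calls(revision_cycles: int) -> int:
    """Worst-case API calls per query for a generate-critique-revise loop:
    one generation, plus one critique and one revision per cycle.
    Early approval by the critique skips the remaining calls."""
    return 1 + 2 * revision_cycles
```

With the default `max_revisions=2` from Pattern 1, a single query can cost up to `max_llm_calls(2) == 5` calls, so budget latency and tokens accordingly.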
## Self-Reflection vs External Evaluation
| Dimension | Self-Reflection | External Evaluator |
|---|---|---|
| Evaluation source | Same model as generator | Separate model or system |
| Setup cost | Low (just prompting) | Higher (separate agent/model) |
| Evaluation quality | Limited by same model biases | Independent perspective |
| Speed | Faster (single model) | Slower (second model call) |
| Best for | General quality improvement | High-stakes verification |
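The two approaches can share one interface: a verdict function the agent calls after generation. A sketch where evaluators are injected, so swapping self-critique for an independent model (or a deterministic checker) is a one-line change — the function and parameter names here are illustrative, not from any framework:

```python
from typing import Callable

def generate_then_verify(
    task: str,
    generate: Callable[[str], str],
    evaluators: list[Callable[[str], tuple[bool, str]]],
) -> tuple[str, list[str]]:
    """Run one generation past a chain of evaluators (e.g. self-critique first,
    then an independent check for high-stakes use) and collect objections."""
    output = generate(task)
    objections = []
    for evaluate in evaluators:
        ok, feedback = evaluate(output)
        if not ok:
            objections.append(feedback)
    return output, objections
```

Keeping evaluators behind a common signature also makes it easy to log which check caught which error, which is useful when deciding whether an external evaluator is worth its extra cost.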
## Common Misconceptions
**Misconception: Self-reflection always improves output quality.** Self-reflection improves measurable quality for tasks with clear correctness criteria. For tasks where quality is subjective, critique cycles often make outputs more conservative and generic. Match the use of self-reflection to tasks where evaluation criteria are explicit.
**Misconception: Self-reflection eliminates hallucinations.** Self-reflection reduces certain types of errors but cannot eliminate hallucinations — if the model's initial generation contains a confident factual error, the same model critiquing that output may not catch it. For hallucination reduction, external verification against authoritative sources (CRITIC, tool-grounded evaluation) is more reliable.
**Misconception: More revision cycles produce better results.** Diminishing returns set in quickly. Research suggests that one or two critique-revise cycles capture most of the quality improvement. Beyond that, additional cycles add cost and latency with minimal gains — and can actually introduce new errors through over-editing.
## Related Terms
- Tree of Thought — Explores multiple reasoning paths before committing, complementary to self-reflection
- ReAct (Reasoning + Acting) — The action-based reasoning pattern self-reflection can enhance
- Agent Planning — Planning agents benefit from self-reflection before executing plans
- Agent Loop — Self-reflection adds inner loops within the agent's outer execution loop
- Inner Monologue — The internal reasoning chain that self-reflection examines
- Build Your First AI Agent — Tutorial including agent reasoning and quality improvement patterns
- LangChain vs AutoGen — Comparing framework support for self-reflection and iterative reasoning
## Frequently Asked Questions
### What is agent self-reflection in AI?
Agent self-reflection is a reasoning pattern where an AI agent evaluates its own draft outputs, identifies errors or gaps, and revises accordingly. The agent acts as both producer and critic — generating an initial response, then using a critique prompt to find issues, then revising. This internal review loop catches mistakes before they reach users or propagate through multi-step pipelines.
### How does agent self-reflection work?
Self-reflection typically involves at least two LLM calls: a generation call producing an initial output, and an evaluation call that critiques that output against specific criteria. The critique is fed back with instructions to revise. This generate-critique-revise cycle can repeat multiple times. More sophisticated implementations use separate evaluator agents or tool-grounded verification to make critique more reliable.
### What are the main self-reflection patterns?
Key patterns include Reflexion (stores verbal reflections in memory to guide future attempts), Self-Critique (prompts the agent to find flaws in its own output), Constitutional AI (evaluates outputs against a set of principles), and CRITIC (uses external tool verification to ground evaluation in factual checks).
### Does self-reflection always improve output quality?
Self-reflection improves quality for tasks with clear correctness criteria — code generation, factual research, structured extraction. For subjective tasks like creative writing, critique cycles can reduce diversity and make outputs more generic. Effectiveness depends on evaluation prompt quality, and each cycle adds proportional cost and latency.