🤖AI Agents Guide

Example · Research · 12 min read

AI Research Agent Examples: 6 Real Cases

Explore 6 AI research agent examples covering academic literature review, market research, competitive intelligence, and systematic evidence gathering. Each example includes architecture details and Python code for researchers and analysts building autonomous research pipelines.

Library with books representing knowledge research and information gathering
Photo by Alfons Morales on Unsplash
By AI Agents Guide Team · February 28, 2026

Table of Contents

  1. Example 1: Academic Literature Review Agent
  2. Example 2: Market Research Intelligence Agent
  3. Example 3: Patent Landscape Analysis Agent
  4. Example 4: Systematic Review with Evidence Grading
  5. Example 5: Real-Time Competitive Intelligence Monitor
  6. Example 6: Investment Research Due Diligence Agent
  7. Choosing the Right Research Agent Architecture
  8. Getting Started
Person taking notes while working on research with a laptop
Photo by Green Chameleon on Unsplash

Research is one of the most naturally agentic tasks: you start with a question, formulate a strategy, gather sources, evaluate them, find gaps, search more, and synthesize. An AI research agent does exactly this, but at machine speed and with systematic coverage that no individual human can match.
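That loop can be sketched in a few lines of Python. The `search` and `evaluate` callables here are hypothetical stand-ins for whatever tools a real agent wires in, and the toy corpus exists only to show the control flow:

```python
def research_loop(question, search, evaluate, max_rounds=3):
    """Iterative gather-evaluate-refine loop: search, assess coverage,
    derive follow-up queries for any gaps, then stop to synthesize."""
    sources, queries = [], [question]
    for _ in range(max_rounds):
        for q in queries:
            sources.extend(search(q))
        # evaluate() returns follow-up queries for angles not yet covered
        gaps = evaluate(question, sources)
        if not gaps:
            break
        queries = gaps
    return sources

# Toy stand-ins to demonstrate the control flow only
corpus = {"rag evaluation": ["paper-a"], "rag benchmarks": ["paper-b"]}
search = lambda q: corpus.get(q, [])
evaluate = lambda question, found: [] if "paper-b" in found else ["rag benchmarks"]
print(research_loop("rag evaluation", search, evaluate))  # ['paper-a', 'paper-b']
```

Every example below is a variation on this loop, differing mainly in which tools fill the `search` slot and how synthesis is produced.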

These six examples cover the major research agent patterns — from systematic literature reviews to real-time competitive intelligence — with working architectures and code. They're built for researchers, analysts, and product teams who need to answer complex questions faster than traditional research allows.

For the technical foundation of these patterns, Agentic RAG explains how agents combine retrieval and reasoning, and the Agentic RAG tutorial walks through building one from scratch.


Example 1: Academic Literature Review Agent

Use Case: Given a research topic, systematically search academic databases, retrieve papers, extract key findings, identify research gaps, and produce a structured literature review.

Architecture: LangChain ReAct agent + Semantic Scholar API tool + arXiv API tool + PDF reader tool + structured synthesis.

Key Implementation:

from langchain_anthropic import ChatAnthropic
from langchain_core.tools import tool
from langchain.agents import create_react_agent, AgentExecutor
from langchain import hub
import httpx

@tool
def search_semantic_scholar(query: str, limit: int = 5) -> list[dict]:
    """Search Semantic Scholar for academic papers on a topic."""
    response = httpx.get(
        "https://api.semanticscholar.org/graph/v1/paper/search",
        params={
            "query": query,
            "fields": "title,abstract,year,authors,citationCount,openAccessPdf",
            "limit": limit
        }
    )
    papers = response.json().get("data", [])
    return [
        {
            "title": p["title"],
            "year": p.get("year"),
            "abstract": p.get("abstract", "")[:500],
            "citations": p.get("citationCount", 0),
            "open_access_url": p.get("openAccessPdf", {}).get("url") if p.get("openAccessPdf") else None
        }
        for p in papers
    ]

@tool
def search_arxiv(query: str, max_results: int = 5) -> list[dict]:
    """Search arXiv for preprints on a topic."""
    import xml.etree.ElementTree as ET
    response = httpx.get(
        "http://export.arxiv.org/api/query",
        params={"search_query": f"all:{query}", "max_results": max_results, "sortBy": "relevance"}
    )
    root = ET.fromstring(response.text)
    ns = {"atom": "http://www.w3.org/2005/Atom"}
    papers = []
    for entry in root.findall("atom:entry", ns):
        papers.append({
            "title": entry.find("atom:title", ns).text.strip(),
            "summary": entry.find("atom:summary", ns).text.strip()[:500],
            "published": entry.find("atom:published", ns).text[:10],
            "url": entry.find("atom:id", ns).text
        })
    return papers

@tool
def synthesize_findings(papers: list, research_question: str) -> str:
    """This tool is a signal for the agent to stop searching and begin synthesis."""
    return f"Ready to synthesize {len(papers)} papers for: {research_question}"

llm = ChatAnthropic(model="claude-3-5-sonnet-20241022")
tools = [search_semantic_scholar, search_arxiv, synthesize_findings]

agent = create_react_agent(llm=llm, tools=tools, prompt=hub.pull("hwchase17/react"))
executor = AgentExecutor(
    agent=agent,
    tools=tools,
    max_iterations=12,
    handle_parsing_errors=True,  # retry gracefully when the LLM emits malformed ReAct output
    verbose=True
)

result = executor.invoke({
    "input": """Conduct a literature review on 'retrieval-augmented generation for code generation'.
    Find 8-10 key papers, identify the main approaches, current limitations, and open research questions.
    Produce a structured literature review with sections: Background, Key Approaches, Findings, Gaps."""
})
print(result["output"])

Outcome: A structured literature review covering recent papers, synthesizing main findings, and identifying research gaps — work that would take a researcher 4–8 hours done in under 10 minutes.
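One practical wrinkle when the agent queries both Semantic Scholar and arXiv: the same paper often shows up in both result lists. A minimal title-based dedupe (this helper is an illustration, not part of the agent above) can merge them before synthesis:

```python
import re

def dedupe_papers(*result_lists: list[dict]) -> list[dict]:
    """Merge paper lists, keeping the first occurrence of each
    whitespace- and case-normalized title."""
    seen, merged = set(), []
    for results in result_lists:
        for paper in results:
            key = re.sub(r"\s+", " ", paper["title"]).strip().lower()
            if key not in seen:
                seen.add(key)
                merged.append(paper)
    return merged

ss = [{"title": "Retrieval-Augmented Generation for Code"}]
ax = [{"title": "retrieval-augmented  generation for code"},
      {"title": "A Survey of RAG"}]
print(len(dedupe_papers(ss, ax)))  # 2
```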


Example 2: Market Research Intelligence Agent

Use Case: Research a target market's size, growth trends, key players, customer pain points, and regulatory environment to inform a product or market entry decision.

Architecture: Tavily web search + industry report sites + news search + structured synthesis agent.

Key Implementation:

from agents import Agent, Runner, function_tool
import httpx
import os

TAVILY_API_KEY = os.environ["TAVILY_API_KEY"]

@function_tool
async def search_market_data(query: str) -> str:
    """Search for market size data, growth rates, and industry analysis."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://api.tavily.com/search",
            json={
                "query": query,
                "search_depth": "advanced",
                "include_domains": ["statista.com", "grandviewresearch.com",
                                   "marketsandmarkets.com", "gartner.com",
                                   "forrester.com", "mckinsey.com"],
                "max_results": 5
            },
            headers={"Authorization": f"Bearer {TAVILY_API_KEY}"}
        )
        results = response.json()["results"]
        return "\n\n".join([f"Source: {r['url']}\n{r['content'][:400]}" for r in results])

@function_tool
async def search_recent_news(topic: str) -> str:
    """Search for recent news and developments about a market or company."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://api.tavily.com/search",
            json={"query": topic, "topic": "news", "days": 90, "max_results": 5},
            headers={"Authorization": f"Bearer {TAVILY_API_KEY}"}
        )
        results = response.json()["results"]
        return "\n".join([f"[{r['published_date']}] {r['title']}: {r['content'][:200]}" for r in results])

from pydantic import BaseModel
from typing import List

class MarketResearchReport(BaseModel):
    market_size_usd_billion: str
    cagr_percent: str
    key_players: List[str]
    customer_pain_points: List[str]
    market_trends: List[str]
    regulatory_considerations: str
    market_entry_attractiveness: str
    confidence_level: str

market_researcher = Agent(
    name="Market Research Analyst",
    model="gpt-4o",
    instructions="""You are a senior market research analyst. Research the specified market
    thoroughly using all available tools. Gather: market size, growth rate, key players,
    customer pain points, recent trends, and regulatory environment.
    Use multiple searches to ensure comprehensive coverage.
    Output a structured market research report.""",
    tools=[search_market_data, search_recent_news],
    output_type=MarketResearchReport
)

import asyncio
async def main():
    result = await Runner.run(
        market_researcher,
        input="Research the AI-powered customer service software market. Focus on SMB segment, North America and Europe."
    )
    report = result.final_output
    print(f"Market Size: ${report.market_size_usd_billion}B")
    print(f"Growth Rate: {report.cagr_percent}% CAGR")
    print(f"Key Players: {', '.join(report.key_players)}")

asyncio.run(main())

Outcome: A structured market research report with cited sources in 3–5 minutes. The output_type constraint ensures the report is always machine-parseable for downstream use in pitch decks or financial models.


Example 3: Patent Landscape Analysis Agent

Use Case: Map the patent landscape for a technology area to identify key IP holders, recent filings, white spaces, and potential freedom-to-operate risks.

Architecture: USPTO API + Google Patents search + AI analyst for pattern detection.

Key Implementation:

from openai import OpenAI
import httpx
import os

client = OpenAI()
PATENTSVIEW_API_KEY = os.environ["PATENTSVIEW_API_KEY"]  # search.patentsview.org requires a free API key

def search_patents(query: str, date_from: str = "2020-01-01") -> list[dict]:
    """Search USPTO patents via PatentsView API."""
    response = httpx.post(
        "https://search.patentsview.org/api/v1/patent/",
        headers={"X-Api-Key": PATENTSVIEW_API_KEY},
        json={
            "q": {"_and": [
                {"_text_any": {"patent_abstract": query}},
                {"_gte": {"patent_date": date_from}}  # apply the date filter
            ]},
            "f": ["patent_number", "patent_title", "patent_abstract",
                  "assignee_organization", "patent_date"],
            "s": [{"patent_date": "desc"}]
        }
    )
    data = response.json()
    return data.get("patents", [])[:20]

def analyze_patent_landscape(technology: str, patents: list[dict]) -> str:
    """AI analysis of patent landscape patterns."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": """You are a patent analyst and IP strategist.
                Analyze the provided patent data and identify:
                1. Top assignees by patent count and their strategic focus
                2. Key technical approaches being patented
                3. Temporal trends (increasing/decreasing filings)
                4. White spaces (what's NOT being patented)
                5. Freedom-to-operate considerations for new entrants
                6. Licensing opportunity assessment
                Provide a structured IP landscape report."""
            },
            {
                "role": "user",
                "content": f"Technology: {technology}\n\nPatents found:\n{patents}"
            }
        ]
    )
    return response.choices[0].message.content

# Example: Research AI agent coordination patents
technology = "AI multi-agent coordination and orchestration"
patents = search_patents("multi-agent AI coordination orchestration")
analysis = analyze_patent_landscape(technology, patents)
print(analysis)

Outcome: Patent landscape intelligence that typically requires expensive IP law firm engagement, generated in minutes. Useful for product teams assessing IP risk before building.
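The "top assignees by patent count" portion of that analysis does not need an LLM at all; computing it deterministically before the model call keeps the counts exact. A sketch using the field names from the code above:

```python
from collections import Counter

def top_assignees(patents: list[dict], n: int = 5) -> list[tuple[str, int]]:
    """Count patents per assignee, skipping records with no assignee on file."""
    counts = Counter(
        p["assignee_organization"]
        for p in patents
        if p.get("assignee_organization")
    )
    return counts.most_common(n)

sample = [
    {"assignee_organization": "Acme AI"},
    {"assignee_organization": "Acme AI"},
    {"assignee_organization": "Globex"},
    {"assignee_organization": None},
]
print(top_assignees(sample))  # [('Acme AI', 2), ('Globex', 1)]
```

The exact counts can then be included in the prompt alongside the raw records, so the analyst model interprets trends rather than tallying.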



Example 4: Systematic Review with Evidence Grading

Use Case: Conduct a systematic review of evidence on a specific question (medical, policy, technical), grade evidence quality, and produce a structured summary suitable for decision-making.

Architecture: Multi-round search agent → evidence extraction → grading agent → synthesis agent.

Key Implementation:

from crewai import Agent, Task, Crew, Process
from crewai_tools import SerperDevTool

search_tool = SerperDevTool()

evidence_collector = Agent(
    role="Systematic Evidence Collector",
    goal="Collect all relevant evidence on: {research_question} from authoritative sources",
    backstory="""You follow systematic review methodology. You search multiple
    databases, use varied search terms, and collect evidence without bias toward
    confirming any particular conclusion.""",
    tools=[search_tool],
    llm="gpt-4o"
)

evidence_grader = Agent(
    role="Evidence Quality Assessor",
    goal="Grade the quality and reliability of each piece of evidence collected",
    backstory="""You apply evidence grading frameworks (GRADE methodology).
    You assess study design, sample size, bias risk, and relevance.
    Level 1: Systematic reviews/RCTs. Level 2: Cohort studies. Level 3: Expert opinion.""",
    llm="gpt-4o"
)

synthesis_writer = Agent(
    role="Systematic Review Synthesizer",
    goal="Synthesize graded evidence into a structured review with clear conclusions",
    backstory="""You write systematic reviews that transparently show evidence
    quality, note limitations, and draw conclusions proportionate to evidence strength.""",
    llm="gpt-4o"
)

collect_task = Task(
    description="""Search for evidence on: {research_question}
    Conduct at least 4 different search queries. Collect 10-15 sources.
    For each source record: URL, publication date, author credentials, study type.""",
    expected_output="List of 10-15 sources with metadata.",
    agent=evidence_collector
)

grade_task = Task(
    description="Grade each collected source using GRADE methodology levels 1-5.",
    expected_output="Each source with evidence grade and grading rationale.",
    agent=evidence_grader,
    context=[collect_task]
)

synthesis_task = Task(
    description="""Write a structured systematic review with:
    - Research question and scope
    - Methodology (databases searched, date range, inclusion criteria)
    - Evidence summary by grade
    - Key findings with strength of evidence
    - Limitations and gaps
    - Conclusion with confidence level""",
    expected_output="Complete systematic review in markdown format.",
    agent=synthesis_writer,
    context=[grade_task]
)

crew = Crew(
    agents=[evidence_collector, evidence_grader, synthesis_writer],
    tasks=[collect_task, grade_task, synthesis_task],
    process=Process.sequential
)

result = crew.kickoff(inputs={"research_question": "Does RAG improve LLM accuracy on domain-specific QA tasks?"})

Outcome: A methodology-compliant systematic review with explicit evidence grading, enabling evidence-based decisions on technical and business questions alike.
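The grader's rubric can also be backed by a deterministic helper so that borderline calls stay consistent across runs. This is a deliberate simplification of the three-level scale in the backstory, not full GRADE methodology:

```python
def grade_evidence(study_type: str) -> int:
    """Map a study-type label to the simplified 3-level scale used above:
    1 = systematic review/RCT, 2 = cohort study, 3 = expert opinion or other."""
    t = study_type.lower()
    if "systematic review" in t or "randomized" in t or "rct" in t:
        return 1
    if "cohort" in t:
        return 2
    return 3

print(grade_evidence("Randomized controlled trial"))  # 1
print(grade_evidence("Retrospective cohort study"))   # 2
print(grade_evidence("Blog post / expert opinion"))   # 3
```

Keyword matching like this works as a floor; the grading agent can still override it when the source metadata warrants.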


Example 5: Real-Time Competitive Intelligence Monitor

Use Case: Continuously monitor competitors' public signals — job postings, press releases, patent filings, product updates — and synthesize weekly intelligence briefings.

Architecture: Scheduled trigger + parallel scrapers + AI analyst + Slack/email distribution.

Key Implementation:

import asyncio
import os
from anthropic import Anthropic
import httpx

client = Anthropic()
TAVILY_API_KEY = os.environ["TAVILY_API_KEY"]

async def monitor_competitor_signals(company: str) -> dict:
    """Collect multiple signal types for a competitor in parallel."""

    async def get_job_postings():
        # Recent job postings as a proxy for strategic priorities; a production
        # version might query Greenhouse or Lever job boards directly
        async with httpx.AsyncClient() as http:
            response = await http.post(
                "https://api.tavily.com/search",
                json={"query": f"{company} engineering jobs AI machine learning 2026",
                      "max_results": 5},
                headers={"Authorization": f"Bearer {TAVILY_API_KEY}"}
            )
            return response.json().get("results", [])

    async def get_press_releases():
        async with httpx.AsyncClient() as http:
            response = await http.post(
                "https://api.tavily.com/search",
                json={"query": f"{company} press release announcement product launch",
                      "topic": "news", "days": 7, "max_results": 5},
                headers={"Authorization": f"Bearer {TAVILY_API_KEY}"}
            )
            return response.json().get("results", [])

    async def get_product_updates():
        async with httpx.AsyncClient() as http:
            response = await http.post(
                "https://api.tavily.com/search",
                json={"query": f"{company} product update feature release changelog",
                      "days": 14, "max_results": 5},
                headers={"Authorization": f"Bearer {TAVILY_API_KEY}"}
            )
            return response.json().get("results", [])

    jobs, press, products = await asyncio.gather(
        get_job_postings(), get_press_releases(), get_product_updates()
    )
    return {"jobs": jobs, "press_releases": press, "product_updates": products}

async def generate_intelligence_brief(competitors: list[str]) -> str:
    """Generate a competitive intelligence brief from all signals."""
    all_signals = {}
    for company in competitors:
        all_signals[company] = await monitor_competitor_signals(company)

    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=4000,
        messages=[{
            "role": "user",
            "content": f"""Generate a weekly competitive intelligence brief from these signals:
            {all_signals}

            Structure: 1) Key Moves This Week (per competitor) 2) Strategic Signals
            (what their hiring/product tells us about their roadmap) 3) Threats & Opportunities
            4) Recommended Actions for our team"""
        }]
    )
    return response.content[0].text

# Run weekly (triggered by cron/scheduler)
brief = asyncio.run(generate_intelligence_brief(["CompetitorA", "CompetitorB", "CompetitorC"]))
print(brief)

Outcome: Weekly competitive intelligence that surfaces strategic signals from job postings, announcements, and product updates — all synthesized into actionable recommendations.
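A weekly monitor will keep surfacing the same articles unless runs are diffed against each other. One simple approach, assuming each signal dict carries a `url` key as the Tavily results above do, is to track URLs already reported:

```python
def new_signals(current: list[dict], seen_urls: set[str]) -> list[dict]:
    """Return only signals whose URL wasn't reported in a previous run,
    and record them so the next run skips them too."""
    fresh = [s for s in current if s["url"] not in seen_urls]
    seen_urls.update(s["url"] for s in fresh)
    return fresh

seen = {"https://example.com/old-post"}
this_week = [
    {"url": "https://example.com/old-post", "title": "Old news"},
    {"url": "https://example.com/launch", "title": "New product launch"},
]
print([s["title"] for s in new_signals(this_week, seen)])  # ['New product launch']
```

In production the `seen_urls` set would be persisted (a file or small database) between scheduled runs.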


Example 6: Investment Research Due Diligence Agent

Use Case: Conduct automated preliminary due diligence on a startup or public company, aggregating information across filings, news, patents, hiring, and web presence.

Architecture: Multi-source parallel research → structured assessment → risk scoring.

Key Implementation:

from pydantic import BaseModel
from typing import List
from agents import Agent, Runner, function_tool
import httpx
import os

TAVILY_API_KEY = os.environ["TAVILY_API_KEY"]

@function_tool
async def search_sec_filings(company_ticker: str) -> str:
    """Search SEC EDGAR for recent filings by ticker symbol."""
    # Simplified placeholder: a real implementation would resolve the ticker
    # to a CIK, then fetch https://data.sec.gov/submissions/CIK{cik}.json with
    # a descriptive User-Agent header, per SEC's fair-access policy.
    return f"SEC filing search results for {company_ticker}"

@function_tool
async def search_news_sentiment(company_name: str) -> str:
    """Search recent news and assess sentiment for a company."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://api.tavily.com/search",
            json={"query": f"{company_name} news funding acquisition lawsuit",
                  "topic": "news", "days": 90, "max_results": 8},
            headers={"Authorization": f"Bearer {TAVILY_API_KEY}"}
        )
        return str(response.json().get("results", []))

class DueDiligenceReport(BaseModel):
    company: str
    business_model_clarity: str
    market_opportunity: str
    team_assessment: str
    financial_signals: str
    risk_factors: List[str]
    positive_signals: List[str]
    preliminary_score: int  # 1-10
    recommendation: str

due_diligence_agent = Agent(
    name="Due Diligence Analyst",
    model="gpt-4o",
    instructions="""You are a venture capital due diligence analyst.
    Research the company thoroughly using all available tools.
    Assess: business model, market size, team credibility, financial health,
    legal/regulatory risks, competitive position, and growth signals.
    Produce a structured preliminary due diligence report.""",
    tools=[search_sec_filings, search_news_sentiment],
    output_type=DueDiligenceReport
)

import asyncio
result = asyncio.run(Runner.run(
    due_diligence_agent,
    input="Conduct preliminary due diligence on: Anthropic. Focus on market position and risk factors."
))
report = result.final_output
print(f"Score: {report.preliminary_score}/10 | {report.recommendation}")

Outcome: A preliminary due diligence report in minutes that flags key risk factors and positive signals, helping investment teams prioritize which opportunities deserve deeper manual research.
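If the 1-10 `preliminary_score` should be reproducible rather than left entirely to the model, a deterministic baseline can be derived from the signal counts and compared against the model's number. The weighting below is purely illustrative:

```python
def baseline_score(risk_factors: list[str], positive_signals: list[str]) -> int:
    """Start neutral at 5, add 1 per positive signal, subtract 1 per risk
    factor, clamped to the 1-10 range the report schema expects."""
    score = 5 + len(positive_signals) - len(risk_factors)
    return max(1, min(10, score))

print(baseline_score(
    risk_factors=["pending lawsuit", "high burn rate"],
    positive_signals=["strong revenue growth", "experienced team", "clear moat"],
))  # 6
```

A large gap between this baseline and the model's score is itself a useful flag for manual review.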


Choosing the Right Research Agent Architecture

For open-ended questions, the ReAct architecture (Examples 1, 5) works well because the agent can dynamically decide what to search next. For structured, repeatable workflows like market research or due diligence (Examples 2, 6), a structured output type guarantees consistent, parseable reports. For multi-stage research requiring different expertise (Examples 3, 4), splitting the work into specialized stages fits best: a two-step search-then-analyze pipeline in Example 3, and CrewAI's multi-agent crew in Example 4.
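That decision logic can be condensed into a small routing helper; the labels are just this article's taxonomy, not anything defined by the frameworks themselves:

```python
def pick_architecture(open_ended: bool, needs_structured_output: bool,
                      multi_stage: bool) -> str:
    """Route a research task to one of the patterns discussed above."""
    if multi_stage:
        return "staged pipeline or multi-agent crew (Examples 3, 4)"
    if needs_structured_output:
        return "structured-output agent (Examples 2, 6)"
    if open_ended:
        return "ReAct loop (Examples 1, 5)"
    return "plain one-shot LLM call"

print(pick_architecture(open_ended=True, needs_structured_output=False,
                        multi_stage=False))
# ReAct loop (Examples 1, 5)
```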

The Agentic RAG tutorial covers how to build research agents that incorporate private knowledge bases alongside web search.

Getting Started

Start with the LangChain tutorial to build a basic research agent, then layer in web search tools using Tavily (pip install tavily-python). For structured output, the OpenAI Agents SDK tutorial shows how to enforce typed output schemas.

The Agentic RAG glossary entry explains how research agents differ from simple RAG systems, which is foundational to building agents that retrieve intelligently rather than just matching keywords.


Related Examples

Agentic RAG Examples: 5 Real Workflows

Six agentic RAG examples with working Python code covering query routing, self-correcting retrieval with hallucination detection, multi-document reranking, iterative retrieval with web fallback, conversational RAG with memory, and corrective RAG with grade-and-retry loops.

7 AI Agent Coding Examples (Real Projects)

Discover 7 real-world AI coding agent examples covering code review, PR generation, test writing, bug diagnosis, documentation generation, and refactoring automation. Each example includes architecture details and working code for engineering teams.

AI Data Analyst Examples: 6 Real Setups

Explore 6 AI data analyst agent examples covering natural language SQL generation, automated chart creation, anomaly detection, report generation, and business intelligence workflows. Includes Python code for building production-ready data analysis agents.
