Content marketing teams spend hours on every article: researching the topic, generating an outline, writing a draft, fact-checking, and editing for clarity and SEO. A content writing AI agent can compress that timeline from hours to minutes by running a coordinated pipeline of specialized agents, each doing exactly one job well.
In this tutorial you will build a three-agent content pipeline using CrewAI. The crew includes a Researcher agent that gathers current information on the topic, a Writer agent that produces an SEO-optimized draft, and an Editor agent that refines the prose and ensures quality. You will get full working code, real output examples, and a production deployment checklist.
Prerequisites
Before you start, ensure you have:
- Python 3.10 or later
- A Tavily API key (for web research; get one free at tavily.com)
- An OpenAI API key
- Familiarity with Python classes and basic agent concepts
Install dependencies:
pip install crewai crewai-tools tavily-python python-dotenv openai
Create a .env file:
OPENAI_API_KEY=sk-...
TAVILY_API_KEY=tvly-...
Architecture Overview
CrewAI uses a role-based multi-agent architecture. Each agent has a defined role, backstory, and set of tools. A Crew coordinates agents through a sequential or hierarchical process.
For this content pipeline:
User Input (topic + target keyword)
        │
        ▼
┌───────────────┐
│  RESEARCHER   │ ← Tavily Search Tool, SEO Competitor Analysis Tool
│    Agent      │   Gathers facts, stats, competitor angles
└───────┬───────┘
        │ Research report
        ▼
┌───────────────┐
│    WRITER     │ ← No external tools (uses context from researcher)
│    Agent      │   Produces SEO-optimized 1,500-word draft
└───────┬───────┘
        │ Raw draft
        ▼
┌───────────────┐
│    EDITOR     │ ← No external tools
│    Agent      │   Refines prose, checks facts, improves SEO
└───────┬───────┘
        │
        ▼
Final Article (markdown)
The sequential process means each agent's output feeds directly into the next agent's context. CrewAI handles the prompt chaining automatically.
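Conceptually, the sequential process is just function composition: each agent consumes the previous agent's output. This framework-free sketch uses plain functions as stand-ins for the LLM-backed agents, purely to illustrate the data flow (it is not how CrewAI is implemented internally):

```python
def run_sequential(steps, initial_input: str) -> str:
    """Pipe each step's output into the next step's input,
    mirroring the shape of a sequential crew."""
    output = initial_input
    for step in steps:
        output = step(output)
    return output

# Stand-in "agents": plain functions instead of LLM calls
research = lambda topic: f"Research brief on: {topic}"
write = lambda brief: f"Draft based on [{brief}]"
edit = lambda draft: f"Edited: {draft}"

final = run_sequential([research, write, edit], "AI agents content marketing")
```

The real crew does the same thing, except each step is an LLM call and CrewAI injects the prior output into the next agent's prompt.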
Step 1: Define the Research Tools
The Researcher agent uses two tools: a Tavily web search for current information and a custom SEO analysis tool that examines competitor headings.
# tools/research_tools.py
import json
import os

from crewai.tools import tool
from crewai_tools import TavilySearchTool
from tavily import TavilyClient

# Built-in CrewAI wrapper for Tavily
tavily_tool = TavilySearchTool()


@tool("SEO Competitor Analysis")
def seo_analysis_tool(keyword: str) -> str:
    """
    Analyze top-ranking content for a keyword to identify common headings,
    content gaps, and SEO opportunities.

    Args:
        keyword: The target SEO keyword to analyze.

    Returns:
        A JSON string with common H2 headings, average word count estimate,
        and suggested content angles.
    """
    # In production, this would call a real SERP API like DataForSEO or SerpApi.
    # Here we use Tavily to approximate competitor research.
    client = TavilyClient(api_key=os.getenv("TAVILY_API_KEY"))
    results = client.search(
        query=f"{keyword} complete guide",
        search_depth="advanced",
        max_results=5,
        include_raw_content=False,
    )
    competitor_data = {
        "keyword": keyword,
        "top_results": [
            {
                "title": r.get("title"),
                "url": r.get("url"),
                "snippet": r.get("content", "")[:200],
            }
            for r in results.get("results", [])
        ],
        "seo_insights": (
            "Focus on practical examples, step-by-step structure, and "
            "address common pain points. Include FAQ section targeting "
            "People Also Ask results."
        ),
    }
    return json.dumps(competitor_data, indent=2)
Step 2: Define the Three Agents
Each agent has a concise role, a detailed goal, and a backstory that shapes its persona and writing style.
# agents/content_agents.py
from crewai import Agent
from langchain_openai import ChatOpenAI

from tools.research_tools import tavily_tool, seo_analysis_tool

# Shared LLM: use GPT-4o for quality, GPT-4o-mini for cost optimization
llm = ChatOpenAI(model="gpt-4o", temperature=0.3)

researcher = Agent(
    role="Senior Content Researcher",
    goal=(
        "Research the given topic thoroughly. Find current statistics, expert opinions, "
        "real-world examples, and identify the key questions the audience is asking. "
        "Produce a structured research brief that the writer can use directly."
    ),
    backstory=(
        "You are an experienced digital research analyst with a background in content marketing. "
        "You know how to identify the most credible, up-to-date sources and extract insights "
        "that make articles genuinely useful rather than generic. You never fabricate statistics."
    ),
    tools=[tavily_tool, seo_analysis_tool],
    llm=llm,
    verbose=True,
    allow_delegation=False,
)

writer = Agent(
    role="SEO Content Writer",
    goal=(
        "Using the research brief, write a comprehensive, SEO-optimized article of at least "
        "1,500 words. The article must include a compelling introduction, clear H2/H3 structure, "
        "the target keyword in the first 100 words and in at least two H2 headings, "
        "practical examples, and a FAQ section."
    ),
    backstory=(
        "You are a senior content writer who has written for leading SaaS and tech publications. "
        "Your articles rank because they combine deep expertise with exceptional clarity. "
        "You write for humans first, search engines second."
    ),
    tools=[],
    llm=llm,
    verbose=True,
    allow_delegation=False,
)

editor = Agent(
    role="Content Editor and SEO Specialist",
    goal=(
        "Review the draft article for clarity, factual accuracy, SEO optimization, and readability. "
        "Fix passive voice, improve transitions, ensure the keyword density is natural (1-2%), "
        "add or improve internal linking suggestions, and produce a final publication-ready version."
    ),
    backstory=(
        "You are a meticulous editor with 10 years of experience in digital publishing. "
        "You have an eye for weak arguments, vague claims, and missed SEO opportunities. "
        "Your edits make good articles great without losing the author's voice."
    ),
    tools=[],
    llm=llm,
    verbose=True,
    allow_delegation=False,
)
Step 3: Define the Tasks
CrewAI tasks give each agent a specific deliverable with clear expected output. The context parameter creates the data dependency between agents.
# tasks/content_tasks.py
from crewai import Task

from agents.content_agents import researcher, writer, editor


def create_content_tasks(topic: str, target_keyword: str, target_audience: str):
    research_task = Task(
        description=f"""
            Research the following topic thoroughly:
            - Topic: {topic}
            - Target Keyword: {target_keyword}
            - Target Audience: {target_audience}

            Your deliverables:
            1. Use the Tavily search tool to find 5+ current, credible sources.
            2. Use the SEO Competitor Analysis tool to understand what top-ranking content covers.
            3. Compile a structured research brief including:
               - Key facts and statistics (with sources)
               - Common audience questions and pain points
               - 5-7 recommended H2 section headings
               - 3 unique angles or insights not covered by competitors
        """,
        expected_output=(
            "A structured markdown research brief with sections: Key Facts & Stats, "
            "Audience Pain Points, Recommended Outline, Competitor Gaps."
        ),
        agent=researcher,
    )

    writing_task = Task(
        description=f"""
            Using the research brief provided, write a complete SEO-optimized article.

            Requirements:
            - Target keyword: {target_keyword} (use naturally in intro, 2+ H2s, conclusion)
            - Length: 1,500-2,000 words
            - Structure: H1 title, introduction (150 words), 5-7 H2 sections, FAQ (4+ questions), conclusion
            - Tone: Expert but accessible, practical, actionable
            - Include real examples and concrete code or workflow examples where relevant
            - End with a strong call to action
        """,
        expected_output=(
            "A complete markdown article ready for editorial review, including title, "
            "all sections, FAQ, and conclusion."
        ),
        agent=writer,
        context=[research_task],
    )

    editing_task = Task(
        description=f"""
            Edit and finalize the draft article for publication.

            Your editorial checklist:
            1. Fix grammar, spelling, and punctuation errors
            2. Improve sentence variety and eliminate passive voice where possible
            3. Verify keyword '{target_keyword}' appears naturally in intro, headings, and conclusion
            4. Suggest 3+ internal links as [anchor text](/suggested-path/) placeholders
            5. Confirm FAQ answers are complete and accurate based on the research
            6. Add a meta description (155 characters max) at the top of your output
            7. Add a suggested title tag (60 characters max) at the top of your output

            Return the complete, edited article in markdown format.
        """,
        expected_output=(
            "Publication-ready markdown article with meta description, title tag, "
            "all edits applied, and internal link suggestions."
        ),
        agent=editor,
        context=[research_task, writing_task],
    )

    return [research_task, writing_task, editing_task]
Step 4: Assemble and Run the Crew
# main.py
import os

from dotenv import load_dotenv
from crewai import Crew, Process

from agents.content_agents import researcher, writer, editor
from tasks.content_tasks import create_content_tasks

load_dotenv()


def generate_article(
    topic: str,
    target_keyword: str,
    target_audience: str = "marketing professionals",
) -> str:
    """
    Run the full content creation crew and return the final article.
    """
    tasks = create_content_tasks(topic, target_keyword, target_audience)
    crew = Crew(
        agents=[researcher, writer, editor],
        tasks=tasks,
        process=Process.sequential,
        verbose=True,
    )
    result = crew.kickoff()
    return str(result)


if __name__ == "__main__":
    article = generate_article(
        topic="How AI agents are transforming B2B content marketing in 2026",
        target_keyword="AI agents content marketing",
        target_audience="B2B marketing managers and content strategists",
    )

    # Save to file (create the output directory if it doesn't exist)
    os.makedirs("output", exist_ok=True)
    with open("output/article_draft.md", "w") as f:
        f.write(article)

    print("\n--- FINAL ARTICLE ---\n")
    print(article[:2000])  # Preview first 2,000 characters
    print("\n[Full article saved to output/article_draft.md]")
Sample Output
When you run the crew against the topic above, you get output similar to this:
--- CREW EXECUTION LOG ---
[Researcher] Starting research on: AI agents content marketing
[Researcher] Calling Tavily Search: "AI agents B2B content marketing 2026 statistics"
[Researcher] Calling SEO Analysis: "AI agents content marketing"
[Researcher] Compiled research brief (847 words)
[Writer] Starting article draft based on research brief
[Writer] Draft complete: 1,743 words, 6 H2 sections, 5 FAQ items
[Editor] Reviewing draft...
[Editor] Applied 23 edits. Added 4 internal link suggestions.
[Editor] Final article: 1,698 words (after tightening), keyword density: 1.4%
--- FINAL OUTPUT ---
Meta description: Learn how AI agents are transforming B2B content marketing...
Title tag: AI Agents in Content Marketing: 2026 Complete Guide
# AI Agents in Content Marketing: The Complete 2026 Guide
...
Total execution time for a 1,600-word article: approximately 90-120 seconds.
Customizing the Pipeline
Adjusting tone: Modify the writer agent's backstory to target different publication styles. For a technical developer audience, emphasize precision over narrative flow. For a consumer brand, emphasize conversational language.
Adding a fact-checker agent: Insert a fourth agent between the writer and editor that cross-references claims against the research brief:
fact_checker = Agent(
    role="Fact-Checker",
    goal=(
        "Verify every statistic and claim in the draft against the research brief. "
        "Flag unsupported claims."
    ),
    backstory="You are a rigorous fact-checker with a journalism background.",
    tools=[tavily_tool],
    llm=llm,
    verbose=True,
    allow_delegation=False,
)
Batch processing: To generate content at scale, wrap the generate_article() call in a loop over a topics list and add rate limiting with time.sleep(5) between runs to stay within API quotas.
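A minimal batch wrapper might look like the sketch below. The generate_fn argument stands in for generate_article from main.py; the helper itself is framework-agnostic:

```python
import time


def run_batch(topics, generate_fn, delay_seconds: float = 5.0) -> dict:
    """Generate one article per (topic, keyword) pair, pausing between
    runs to stay within API rate limits."""
    articles = {}
    for topic, keyword in topics:
        articles[topic] = generate_fn(topic, keyword)
        time.sleep(delay_seconds)
    return articles
```

In practice you would call run_batch(topic_list, generate_article) and tune delay_seconds to your provider's rate limits.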
Saving to a CMS: After generation, post the article directly to WordPress, Webflow, or your headless CMS using their REST APIs. The output is already in markdown; most CMSes accept it directly.
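As one concrete sketch, here is a standard-library-only way to create a draft post via the WordPress REST API (/wp-json/wp/v2/posts). The site URL, username, and application password are placeholders for your own install, and depending on your theme you may want to convert the markdown to HTML first:

```python
import base64
import json
from urllib import request


def build_wordpress_post(title: str, markdown_body: str, status: str = "draft") -> dict:
    """Payload for POST /wp-json/wp/v2/posts."""
    return {"title": title, "content": markdown_body, "status": status}


def publish_to_wordpress(site_url: str, username: str, app_password: str, payload: dict) -> dict:
    # WordPress application passwords use HTTP Basic auth
    token = base64.b64encode(f"{username}:{app_password}".encode()).decode()
    req = request.Request(
        f"{site_url}/wp-json/wp/v2/posts",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Basic {token}", "Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Webflow and headless CMSes follow the same pattern with different endpoints and auth headers.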
Production Considerations
Token costs: A full crew run for a 1,500-word article uses approximately 15,000-25,000 tokens with GPT-4o. At current pricing this is around $0.10-$0.20 per article. For high-volume use, switch the writer and editor to gpt-4o-mini and reserve gpt-4o for the researcher only.
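To sanity-check the math for your own volumes, a tiny estimator helps. The per-million-token prices below are illustrative defaults, not authoritative; substitute the current rates from your provider's pricing page:

```python
def estimate_run_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float = 2.50,   # assumed USD per 1M input tokens
    output_price_per_m: float = 10.00,  # assumed USD per 1M output tokens
) -> float:
    """Rough USD cost for one crew run. Default prices are placeholders."""
    return (input_tokens / 1_000_000) * input_price_per_m + (
        output_tokens / 1_000_000
    ) * output_price_per_m


# e.g. a run with 18,000 input tokens and 4,000 output tokens
cost = estimate_run_cost(18_000, 4_000)
```

Multiply by your monthly article volume to budget, and rerun with gpt-4o-mini prices to see the savings from the model split described above.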
Content quality guardrails: Add a post-processing validation step that checks minimum word count, keyword presence, and FAQ completeness before saving:
def validate_article(article: str, keyword: str) -> dict:
    word_count = len(article.split())
    keyword_count = article.lower().count(keyword.lower())
    has_faq = "## frequently asked" in article.lower() or "## faq" in article.lower()
    return {
        "word_count": word_count,
        "keyword_count": keyword_count,
        "has_faq": has_faq,
        "passes_validation": word_count >= 1200 and keyword_count >= 3 and has_faq,
    }
Idempotency: Store a hash of each (topic, keyword) pair so you never generate duplicate content. Use a simple SQLite database or a key-value store like Redis.
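A minimal SQLite sketch looks like this (the table and function names are illustrative, not part of any library API):

```python
import hashlib
import sqlite3


def content_key(topic: str, keyword: str) -> str:
    """Stable hash of a (topic, keyword) pair."""
    raw = f"{topic.strip().lower()}|{keyword.strip().lower()}"
    return hashlib.sha256(raw.encode()).hexdigest()


def seen_before(db_path: str, topic: str, keyword: str) -> bool:
    """Return True if this pair was already generated; otherwise record it."""
    key = content_key(topic, keyword)
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS generated (key TEXT PRIMARY KEY)")
        row = conn.execute("SELECT 1 FROM generated WHERE key = ?", (key,)).fetchone()
        if row is None:
            conn.execute("INSERT INTO generated (key) VALUES (?)", (key,))
            conn.commit()
        return row is not None
    finally:
        conn.close()
```

Check seen_before() before calling generate_article() in your batch loop to skip duplicates.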
Further Reading
- Getting Started with AI Agents – Build your first LangChain agent from scratch
- How to Build a Meeting Scheduler AI Agent – Another practical automation build
- How to Automate Invoicing with AI Agents – Expand automation to finance workflows
- CrewAI vs LangChain: Which Framework Should You Use? – Framework comparison to guide your architecture choices
- AI Agents Use Cases – Discover more domains where AI agents add value
- Tavily API Integration Guide – Deep dive into using Tavily for real-time research
Frequently Asked Questions
Can this pipeline generate content in languages other than English? Yes. Update the writer agent's goal to specify the target language, and CrewAI will produce output in that language. For the research step, phrase your Tavily queries in the target language so the retrieved sources match your audience.
How do I prevent the agents from fabricating statistics? Two approaches work well in practice. First, the researcher agent's backstory explicitly states "never fabricate statistics." Second, the editor agent is instructed to flag unsupported claims. For highest fidelity, add a fourth fact-checker agent with access to Tavily to verify claims before final output.
What is the difference between CrewAI sequential and hierarchical process? In sequential process, agents run one after another and each receives the prior agent's output as context. In hierarchical process, a manager LLM dynamically decides which agent to delegate to and can route tasks based on intermediate results. For a linear content pipeline like this one, sequential is simpler and more predictable.
Can I run this without OpenAI using a local model?
Yes. Replace ChatOpenAI with ChatOllama from langchain_community and point it at a locally running Ollama instance with a capable model like llama3.1:70b. Note that smaller models produce noticeably lower quality content; 70B+ parameter models are recommended for production content generation.