AI Agent for HR & Recruitment: Resume Screening, Candidate Matching & Scheduling
Recruiters spend an average of 23 hours screening resumes for a single hire. An AI recruitment agent can reduce that to minutes — screening hundreds of resumes, matching candidates to requirements, and scheduling interviews automatically. This tutorial shows you how to build one responsibly.
What You'll Learn#
- Designing an AI recruitment agent workflow
- Automated resume screening with bias-aware scoring
- Candidate-to-role matching with semantic similarity
- Interview scheduling automation
- Compliance guardrails (EEOC, GDPR, local regulations)
Prerequisites#
- Understanding of AI agent architecture
- Familiarity with RAG concepts for document processing
- Basic knowledge of prompt engineering
- Understanding of recruitment processes
The Recruitment Agent Architecture#
Resume / Application Received
              │
              ▼
     ┌──────────────────┐
     │  Resume Parser   │
     │  (Extract data)  │
     └────────┬─────────┘
              │
              ▼
     ┌──────────────────┐
     │ Candidate Scorer │
     │ (Skills match,   │
     │  experience fit) │
     └────────┬─────────┘
              │
      ┌───────┼───────┐
      ▼       ▼       ▼
   Score A Score B Score C
   (High) (Medium)  (Low)
      │       │       │
      ▼       ▼       ▼
  Schedule Talent  Polite
  Interview Pool  Rejection
      │       │
      ▼       ▼
     ┌──────────────────┐
     │ Compliance Check │
     │ (Bias audit, EEO)│
     └──────────────────┘
Step 1: Resume Parsing#
Extract structured data from unstructured resumes:
from openai import OpenAI
import json

client = OpenAI()

def parse_resume(resume_text: str) -> dict:
    """Extract structured data from resume text."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "system",
            "content": """Extract structured data from the resume.
Return JSON with these fields:
{
  "name": "Full name",
  "email": "Email address",
  "phone": "Phone number",
  "location": "City, State/Country",
  "summary": "2-3 sentence professional summary",
  "years_experience": number,
  "current_title": "Most recent job title",
  "skills": {
    "technical": ["skill1", "skill2"],
    "soft": ["skill1", "skill2"],
    "certifications": ["cert1", "cert2"]
  },
  "experience": [
    {
      "title": "Job title",
      "company": "Company name",
      "duration_years": number,
      "key_achievements": ["achievement1"]
    }
  ],
  "education": [
    {
      "degree": "Degree type",
      "field": "Field of study",
      "institution": "School name",
      "year": number
    }
  ]
}
IMPORTANT: Only extract information explicitly stated.
Never infer or assume details not present in the resume.
Use null for missing fields."""
        }, {
            "role": "user",
            "content": f"Parse this resume:\n\n{resume_text}"
        }],
        temperature=0,
        response_format={"type": "json_object"}
    )
    return json.loads(response.choices[0].message.content)
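Model output is not guaranteed to follow the requested schema, so it can help to check the parsed result before passing it downstream. The sketch below is one minimal way to do that. The helper name `validate_parsed_resume` and the specific checks are illustrative assumptions, not part of the original pipeline; the key names come from the prompt above.

```python
from typing import Any

# Top-level keys requested in the parsing prompt above.
EXPECTED_KEYS = {
    "name", "email", "phone", "location", "summary",
    "years_experience", "current_title", "skills",
    "experience", "education",
}


def validate_parsed_resume(parsed: dict[str, Any]) -> list[str]:
    """Return a list of problems found; an empty list means the shape looks OK."""
    problems = []
    missing = EXPECTED_KEYS - parsed.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    years = parsed.get("years_experience")
    # bool is a subclass of int, so exclude it explicitly.
    if years is not None and (
        isinstance(years, bool) or not isinstance(years, (int, float))
    ):
        problems.append("years_experience is not a number")
    for key in ("experience", "education"):
        value = parsed.get(key)
        if value is not None and not isinstance(value, list):
            problems.append(f"{key} is not a list")
    skills = parsed.get("skills")
    if skills is not None and not isinstance(skills, dict):
        problems.append("skills is not an object")
    return problems
```

A non-empty result could trigger a retry of the parsing call or route the resume to manual review, depending on how strict the pipeline needs to be.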
Handling Different Resume Formats#
| Format | Approach |
|--------|----------|
| PDF | Extract text with pypdf (the successor to PyPDF2) or pdfplumber |
| DOCX | Use python-docx to extract text |
| Image/Scan | OCR with Tesseract or cloud vision APIs |
| LinkedIn profile | Parse structured HTML or use API |
| Plain text | Direct processing |
def extract_resume_text(file_path: str) -> str:
    """Extract text from various resume formats."""
    # Compare extensions case-insensitively so "CV.PDF" is handled too.
    path = file_path.lower()
    if path.endswith('.pdf'):
        import pdfplumber
        with pdfplumber.open(file_path) as pdf:
            # extract_text() can return None for pages without a text layer.
            return "\n".join(
                page.extract_text() or ""
                for page in pdf.pages
            )
    elif path.endswith('.docx'):
        from docx import Document
        doc = Document(file_path)
        # Reads body paragraphs only; text inside tables is not included.
        return "\n".join(p.text for p in doc.paragraphs)
    elif path.endswith('.txt'):
        with open(file_path, 'r', encoding='utf-8') as f:
            return f.read()
    else:
        raise ValueError(f"Unsupported format: {file_path}")
Step 2: Candidate Scoring#
Score candidates against job requirements using a structured rubric:
def create_scoring_rubric(job_description: dict) -> str:
    """Generate a scoring rubric from job requirements."""
    return f"""Score the candidate on a 1-10 scale based on
these weighted criteria:

## Required Skills (40% weight)
Must-have skills: {', '.join(job_description['required_skills'])}
- 10: Has all required skills with proven expertise
- 7: Has most required skills (80%+)
- 4: Has some required skills (50-79%)
- 1: Missing most required skills

## Experience Level (30% weight)
Required: {job_description['min_years_experience']}+ years
Target: {job_description['ideal_years_experience']} years
- 10: Meets or exceeds ideal years in matching roles
- 7: Meets minimum with relevant industry experience
- 4: Below minimum but shows strong potential
- 1: Significantly below minimum

## Nice-to-Have Skills (15% weight)
Preferred: {', '.join(job_description['preferred_skills'])}
- 10: Has 80%+ of nice-to-have skills
- 5: Has 40-79% of nice-to-have skills
- 1: Has few or none

## Education & Certifications (15% weight)
Required: {job_description.get('education', 'Not specified')}
Preferred certs: {', '.join(job_description.get('preferred_certs', []))}
- 10: Exceeds education requirements with relevant certs
- 7: Meets education requirements
- 4: Related but different education
- 1: Does not meet requirements

IMPORTANT BIAS RULES:
- Do NOT factor in: name, age, gender, ethnicity, photos,
  university prestige (only check degree relevance), or
  personal interests
- DO factor in: skills, measurable achievements, relevant
  experience, certifications
- Score ONLY on job-relevant qualifications
"""
def score_candidate(
    parsed_resume: dict,
    job_description: dict
) -> dict:
    """Score a candidate against job requirements."""
    rubric = create_scoring_rubric(job_description)
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "system",
            "content": f"""You are an unbiased recruitment
screening assistant.
{rubric}
Return JSON:
{{
  "overall_score": <1-10 weighted average>,
  "category_scores": {{
    "required_skills": <1-10>,
    "experience": <1-10>,
    "nice_to_have": <1-10>,
    "education": <1-10>
  }},
  "matching_skills": ["skill1", "skill2"],
  "missing_skills": ["skill1", "skill2"],
  "strengths": ["strength1", "strength2"],
  "concerns": ["concern1"],
  "recommendation": "advance" | "talent_pool" | "reject",
  "reasoning": "<2-3 sentences>"
}}"""
        }, {
            "role": "user",
            "content": f"""Score this candidate:
{json.dumps(parsed_resume, indent=2)}
For this role:
Title: {job_description['title']}
Department: {job_description['department']}"""
        }],
        temperature=0,
        response_format={"type": "json_object"}
    )
    return json.loads(response.choices[0].message.content)
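Language models are unreliable at arithmetic, so one option is to ignore the model's `overall_score` and recompute it from `category_scores` using the rubric weights (40/30/15/15). The following is a minimal sketch, assuming the response has the JSON shape requested above; `weighted_overall_score` and `RUBRIC_WEIGHTS` are hypothetical names introduced for illustration.

```python
# Weights taken from the rubric headings above.
RUBRIC_WEIGHTS = {
    "required_skills": 0.40,
    "experience": 0.30,
    "nice_to_have": 0.15,
    "education": 0.15,
}


def weighted_overall_score(category_scores: dict) -> float:
    """Recompute the weighted average from per-category scores (1-10 each)."""
    total = 0.0
    for category, weight in RUBRIC_WEIGHTS.items():
        # Raises KeyError if the model omitted a category.
        score = float(category_scores[category])
        if not 1 <= score <= 10:
            raise ValueError(f"{category} score out of range: {score}")
        total += weight * score
    return round(total, 2)


example = {
    "required_skills": 8,
    "experience": 7,
    "nice_to_have": 5,
    "education": 7,
}
print(weighted_overall_score(example))  # 0.4*8 + 0.3*7 + 0.15*5 + 0.15*7 = 7.1
```

Comparing the recomputed value with the model's own `overall_score` is also a cheap consistency check: a large gap suggests the response should be retried or reviewed.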
Step 3: Semantic Candidate Matching#
Go beyond keyword matching — use embeddings to find candidates whose experience semantically aligns with the role:
import numpy as np

def semantic_match_score(
    resume_summary: str,
    job_description_text: str
) -> float:
    """Calculate semantic similarity between candidate
    and job using embeddings."""
    # One request embeds both texts; results come back in input order.
    embeddings = client.embeddings.create(
        model="text-embedding-3-small",
        input=[resume_summary, job_description_text]
    )
    vec_resume = np.array(embeddings.data[0].embedding)
    vec_job = np.array(embeddings.data[1].embedding)
    # Cosine similarity
    similarity = np.dot(vec_resume, vec_job) / (
        np.linalg.norm(vec_resume) * np.linalg.norm(vec_job)
    )
    return float(similarity)
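When many candidates are compared against one role, the embeddings can be computed once and cached, after which ranking is plain arithmetic. The sketch below uses only the standard library and small made-up vectors standing in for real embeddings; `cosine_similarity` and `rank_candidates` are hypothetical helpers, not part of the original code.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat an all-zero vector as no match.
    return dot / (norm_a * norm_b)


def rank_candidates(
    candidate_vecs: dict[str, list[float]],
    job_vec: list[float],
) -> list[tuple[str, float]]:
    """Return (candidate_id, similarity) pairs, best match first."""
    scored = [
        (cid, cosine_similarity(vec, job_vec))
        for cid, vec in candidate_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors, for illustration only.
candidates = {
    "cand_a": [1.0, 0.0, 0.0],
    "cand_b": [0.6, 0.8, 0.0],
    "cand_c": [0.0, 1.0, 0.0],
}
ranking = rank_candidates(candidates, job_vec=[0.0, 1.0, 0.0])
print([cid for cid, _ in ranking])  # ['cand_c', 'cand_b', 'cand_a']
```

Keying the vectors by an opaque candidate ID rather than a name keeps identifying details out of the ranking step.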
Combining Scores#
from typing import Optional

def final_candidate_score(
    rubric_score: float,
    semantic_score: float,
    weights: Optional[dict] = None
) -> float:
    """Combine rubric and semantic scores."""
    if weights is None:
        weights = {"rubric": 0.7, "semantic": 0.3}
    # Rescale cosine similarity to the rubric's 10-point scale.
    # Cosine similarity can be negative, so clamp it to [0, 1] first;
    # the rescaled value then lies in 0-10.
    semantic_normalized = max(0.0, min(1.0, semantic_score)) * 10
    return (
        weights["rubric"] * rubric_score +
        weights["semantic"] * semantic_normalized
    )
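The combined score still has to be turned into a routing decision. One option is a deterministic threshold function using the same bands that appear in the audit buckets in Step 5 (7 and above advances, 4 up to 7 goes to the talent pool, below 4 is rejected). The function name `route_candidate`, and the choice to route in code rather than rely on the model's `recommendation` field, are assumptions of this sketch.

```python
def route_candidate(
    final_score: float,
    advance_at: float = 7.0,
    pool_at: float = 4.0,
) -> str:
    """Map a combined score to a routing decision."""
    if final_score >= advance_at:
        return "advance"
    if final_score >= pool_at:
        return "talent_pool"
    return "reject"


# Worked example with the default 0.7 / 0.3 weights:
# rubric score 8 and cosine similarity 0.62.
combined = 0.7 * 8 + 0.3 * (0.62 * 10)  # 5.6 + 1.86
print(round(combined, 2), route_candidate(combined))  # 7.46 advance
```

Keeping the thresholds as parameters makes it easy to tune them per role and to record exactly which cutoffs were in force when a decision is audited later.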
Step 4: Interview Scheduling Agent#
Automate the back-and-forth of scheduling:
from datetime import datetime, timedelta

# NOTE: `tool` must be imported from your agent framework. The original
# does not say which one; LangChain's `langchain_core.tools.tool` is one
# example of such a decorator.

@tool
def get_interviewer_availability(
    interviewer_email: str,
    date_range_days: int = 7
) -> list:
    """Check calendar availability for an interviewer.

    Args:
        interviewer_email: The interviewer's email
        date_range_days: Number of days to look ahead
    """
    # Production: Google Calendar or Outlook Calendar API
    # Placeholder: treats every weekday business-hour slot as available
    # and returns them as 1-hour slots
    slots = []
    start_date = datetime.now()
    for day in range(date_range_days):
        date = start_date + timedelta(days=day)
        if date.weekday() < 5:  # Weekdays only
            for hour in [9, 10, 11, 14, 15, 16]:  # Business hours
                slots.append({
                    "date": date.strftime("%Y-%m-%d"),
                    "time": f"{hour:02d}:00",
                    "duration": "1 hour",
                    "interviewer": interviewer_email
                })
    return slots

@tool
def send_scheduling_email(
    candidate_email: str,
    candidate_name: str,
    role_title: str,
    available_slots: list,
    interview_type: str
) -> str:
    """Send a scheduling email with available time slots.

    Args:
        candidate_email: Candidate's email address
        candidate_name: Candidate's full name
        role_title: The position they're interviewing for
        available_slots: List of available time slots
        interview_type: Type of interview (phone, video, onsite)
    """
    # Generate personalized scheduling email
    email_body = f"""
Dear {candidate_name},
Thank you for your interest in the {role_title} position.
We'd like to invite you to a {interview_type} interview.
Please select one of the following available times:
"""
    # Offer at most five slots to keep the email short
    for i, slot in enumerate(available_slots[:5], 1):
        email_body += (
            f"{i}. {slot['date']} at {slot['time']} "
            f"({slot['duration']})\n"
        )
    email_body += """
Please reply with your preferred time, and we'll send
a calendar invitation.
Best regards,
[Company] Recruitment Team
"""
    # Production: send email_body via email API
    return f"Scheduling email sent to {candidate_email}"

@tool
def create_calendar_event(
    candidate_email: str,
    interviewer_email: str,
    datetime_str: str,
    interview_type: str,
    role_title: str
) -> str:
    """Create a calendar event for the interview.

    Args:
        candidate_email: Candidate's email
        interviewer_email: Interviewer's email
        datetime_str: Interview date and time
        interview_type: Type of interview
        role_title: Position being interviewed for
    """
    # Production: create Google/Outlook calendar event
    return (f"Calendar event created: {interview_type} interview "
            f"for {role_title} on {datetime_str}")
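The availability placeholder above returns every business-hour slot, including ones that have already passed today, and it ignores time zones and existing bookings. The sketch below shows one way slot generation might account for those, taking the current time and a set of busy start times as inputs. The `busy` set stands in for whatever a real calendar API would return, the fixed UTC-5 offset is only a stand-in for a proper time zone, and all names here are hypothetical.

```python
from datetime import datetime, timedelta, timezone

BUSINESS_HOURS = [9, 10, 11, 14, 15, 16]  # Same hours as the placeholder above.


def free_slots(
    now: datetime,
    busy: set[datetime],
    days_ahead: int = 7,
) -> list[datetime]:
    """Return future weekday slot start times that are not in `busy`."""
    slots = []
    for day in range(days_ahead):
        date = (now + timedelta(days=day)).date()
        if date.weekday() >= 5:  # Skip Saturday and Sunday.
            continue
        for hour in BUSINESS_HOURS:
            start = datetime(
                date.year, date.month, date.day, hour, tzinfo=now.tzinfo
            )
            # Keep only slots that are still in the future and unbooked.
            if start > now and start not in busy:
                slots.append(start)
    return slots


# Fixed offset used as a simple stand-in for the interviewer's time zone.
tz = timezone(timedelta(hours=-5))
now = datetime(2025, 3, 3, 10, 30, tzinfo=tz)  # A Monday, mid-morning.
busy = {datetime(2025, 3, 3, 14, tzinfo=tz)}
slots = free_slots(now, busy, days_ahead=1)
print([s.strftime("%H:%M") for s in slots])  # ['11:00', '15:00', '16:00']
```

Passing `now` in as an argument, rather than calling `datetime.now()` inside the function, makes the behavior reproducible in tests.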
Step 5: Compliance & Bias Safeguards#
This is critical: HR AI systems must be fair and compliant.
Bias Prevention Rules#
BIAS_PREVENTION_RULES = """
## DO NOT consider (remove from scoring if present):
- Name (can reveal gender, ethnicity, nationality)
- Age, date of birth, graduation year
- Gender, marital status, family situation
- Ethnicity, race, national origin
- Photo or physical appearance
- University prestige ranking
- Address / neighborhood
- Hobbies or personal interests
- Number of social media followers
## DO consider:
- Relevant skills and proficiencies
- Years of relevant experience
- Measurable achievements and impact
- Relevant education (degree type and field only)
- Professional certifications
- Career progression trajectory
- Technical assessments / portfolio quality
"""
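Prompt instructions alone do not guarantee that a model ignores these fields. A more reliable option is to strip them from the parsed resume before it reaches the scorer. The following is a minimal sketch, assuming the field names from the parsing schema in Step 1; `redact_for_scoring` is a hypothetical helper.

```python
import copy

# Top-level fields that identify the person or hint at protected traits.
REDACTED_FIELDS = ("name", "email", "phone", "location")


def redact_for_scoring(parsed_resume: dict) -> dict:
    """Return a copy of the parsed resume without identifying fields."""
    redacted = copy.deepcopy(parsed_resume)  # Leave the original untouched.
    for field in REDACTED_FIELDS:
        redacted.pop(field, None)
    for entry in redacted.get("education") or []:
        # Graduation year can reveal age; institution invites prestige bias.
        entry.pop("year", None)
        entry.pop("institution", None)
    return redacted


resume = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "skills": {"technical": ["Python"]},
    "education": [
        {"degree": "BSc", "field": "CS", "institution": "Example U", "year": 2012}
    ],
}
print(redact_for_scoring(resume))
```

Free-text fields such as `summary` and `key_achievements` can still contain identifying details, so this reduces exposure rather than eliminating it. The unredacted record is still needed later for contacting the candidate.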
Compliance Checklist#
| Requirement | Implementation |
|-------------|----------------|
| EEOC compliance (US) | Remove protected class data before scoring |
| GDPR (EU) | Explicit consent, data minimization, right to explanation |
| NYC Local Law 144 | Annual bias audit for automated employment tools |
| Illinois AI Video Interview Act (AIVIA) | Notify candidates that AI is used in evaluation (the Act covers AI analysis of video interviews) |
| Adverse impact testing | Monitor scoring distribution across demographics |
Bias Audit System#
import statistics

def run_bias_audit(scoring_results: list) -> dict:
    """Summarize the score distribution as a first-pass audit.

    A distribution summary cannot detect adverse impact on its own;
    that requires comparing selection rates across demographic groups.
    """
    # WARNING: Demographic data should be collected separately
    # and NEVER used in scoring, only for audit purposes
    all_scores = [r["overall_score"] for r in scoring_results]
    if not all_scores:
        raise ValueError("No scoring results to audit")
    # statistics.stdev() needs at least two data points
    std_dev = statistics.stdev(all_scores) if len(all_scores) > 1 else 0.0
    audit = {
        "total_candidates": len(scoring_results),
        "mean_score": statistics.mean(all_scores),
        "std_deviation": std_dev,
        "score_distribution": {
            "advance (7-10)": sum(1 for s in all_scores if s >= 7),
            "talent_pool (4-6)": sum(1 for s in all_scores if 4 <= s < 7),
            "reject (1-3)": sum(1 for s in all_scores if s < 4),
        },
        "recommendation": (
            "PASS" if std_dev < 3
            else "REVIEW: high variance in scores"
        )
    }
    return audit
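The summary above looks only at the overall score distribution. An adverse-impact check compares selection rates between groups: under the four-fifths rule, a group whose selection rate is below 80% of the highest group's rate is flagged for review. The following is a sketch, assuming each audit record carries a `group` label (collected separately from scoring, as the warning above requires) and an `advanced` flag. Both field names and the function name are hypothetical.

```python
from collections import defaultdict


def adverse_impact_ratios(records: list[dict]) -> dict[str, dict]:
    """Compute per-group selection rates and four-fifths rule flags."""
    if not records:
        raise ValueError("No records to audit")
    totals = defaultdict(int)
    selected = defaultdict(int)
    for r in records:
        totals[r["group"]] += 1
        if r["advanced"]:
            selected[r["group"]] += 1
    rates = {g: selected[g] / totals[g] for g in totals}
    best = max(rates.values())  # Highest selection rate among all groups.
    report = {}
    for g, rate in rates.items():
        ratio = rate / best if best > 0 else 1.0
        report[g] = {
            "selection_rate": rate,
            "impact_ratio": ratio,
            "flag": ratio < 0.8,  # Below four-fifths of the top rate.
        }
    return report


# Made-up audit data: group A advances 6 of 10, group B advances 4 of 10.
records = (
    [{"group": "A", "advanced": True}] * 6
    + [{"group": "A", "advanced": False}] * 4
    + [{"group": "B", "advanced": True}] * 4
    + [{"group": "B", "advanced": False}] * 6
)
report = adverse_impact_ratios(records)
print(report["B"]["flag"])  # True: 0.4 / 0.6 is about 0.67, below 0.8
```

With small groups the ratio is noisy, so a flag is a prompt to investigate rather than proof of bias.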
Candidate Communication#
def generate_rejection_email(
    candidate_name: str,
    role_title: str
) -> str:
    """Generate a respectful rejection email."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "system",
            "content": """Write a professional, empathetic rejection
email. Rules:
- Express genuine appreciation for their interest
- Do NOT mention specific shortcomings
- Encourage them to apply for future roles
- Keep under 100 words
- Warm, human tone"""
        }, {
            "role": "user",
            "content": (f"Candidate: {candidate_name}, "
                        f"Role: {role_title}")
        }],
        temperature=0.5
    )
    return response.choices[0].message.content
Step 6: Pipeline Dashboard#
Track your recruitment agent's performance:
| Metric | Formula | Target |
|--------|---------|--------|
| Screening speed | Resumes processed per hour | 100+ |
| Pass-through rate | Candidates advanced / Total | 15-25% |
| Diversity ratio | Distribution across demographics | Even |
| Interviewer satisfaction | Feedback on candidate quality | > 4/5 |
| Time to schedule | Application to interview scheduled | < 48 hours |
| Offer acceptance rate | Offers accepted / Total offers | > 80% |
| Bias audit score | Statistical parity across groups | Within 4/5ths rule |
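Several of these metrics are simple ratios over pipeline counts. The sketch below shows how the pass-through and offer acceptance rates might be computed and checked against the targets in the table. The `PipelineCounts` structure, its field names, and the sample numbers are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PipelineCounts:
    screened: int
    advanced: int
    offers_made: int
    offers_accepted: int


def pipeline_metrics(c: PipelineCounts) -> dict:
    """Compute ratio metrics and compare them with the dashboard targets."""
    # Guard against division by zero early in a hiring cycle.
    pass_through = c.advanced / c.screened if c.screened else 0.0
    acceptance = c.offers_accepted / c.offers_made if c.offers_made else 0.0
    return {
        "pass_through_rate": pass_through,
        "pass_through_on_target": 0.15 <= pass_through <= 0.25,
        "offer_acceptance_rate": acceptance,
        "offer_acceptance_on_target": acceptance > 0.80,
    }


metrics = pipeline_metrics(
    PipelineCounts(screened=400, advanced=72, offers_made=10, offers_accepted=9)
)
print(metrics["pass_through_rate"], metrics["offer_acceptance_rate"])  # 0.18 0.9
```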
Complete Workflow#
async def process_application(
    resume_file: str,
    job_description: dict,
    interviewer_emails: list
) -> dict:
    """End-to-end recruitment workflow."""
    # Note: the helpers called here are synchronous, so they block the
    # event loop while they run. If they are wrapped by a framework's
    # @tool decorator, call them through that framework's invocation
    # method instead of calling them directly.
    # Step 1: Parse resume
    resume_text = extract_resume_text(resume_file)
    parsed = parse_resume(resume_text)
    # Step 2: Score candidate
    score_result = score_candidate(parsed, job_description)
    # Step 3: Route based on score
    if score_result["recommendation"] == "advance":
        # Get interviewer availability
        slots = get_interviewer_availability(interviewer_emails[0])
        # Send scheduling email
        send_scheduling_email(
            candidate_email=parsed["email"],
            candidate_name=parsed["name"],
            role_title=job_description["title"],
            available_slots=slots[:5],
            interview_type="video"
        )
        # The interview is not booked until the candidate picks a slot
        return {
            "status": "scheduling_email_sent",
            "score": score_result,
            "candidate": parsed["name"]
        }
    elif score_result["recommendation"] == "talent_pool":
        # Add to talent pool for future roles
        return {
            "status": "talent_pool",
            "score": score_result,
            "candidate": parsed["name"]
        }
    else:
        # Draft a rejection email. The draft is returned rather than
        # sent, so a recruiter can review the decision first.
        rejection_email = generate_rejection_email(
            parsed["name"],
            job_description["title"]
        )
        return {
            "status": "rejected",
            "score": score_result,
            "candidate": parsed["name"],
            "rejection_email": rejection_email
        }
Common Mistakes to Avoid#
- No bias testing: Always audit your scoring for adverse impact — it's both ethical and legal
- Replacing human judgment entirely: AI should screen and rank, humans should make final decisions
- Ignoring candidate experience: AI rejection emails should still feel personal and respectful
- Using university prestige as a signal: This strongly correlates with socioeconomic background, not ability
- No transparency: Many jurisdictions require disclosing AI use in hiring — always inform candidates
Next Steps#
- AI Agent for Sales Automation — similar automation patterns for sales
- AI Agent for Customer Service — apply AI agents to support
- Multi-Agent Systems Guide — build a multi-agent recruitment pipeline
Frequently Asked Questions#
Is it legal to use AI for resume screening?#
It depends on jurisdiction. In the US, NYC Local Law 144 requires bias audits for automated employment tools. Illinois requires candidate notification. The EU AI Act classifies recruitment AI as "high-risk" requiring conformity assessments. Always consult your legal team and stay current on regulations in your operating regions.
How do I ensure the AI doesn't discriminate?#
Three measures: (1) Remove protected class indicators before scoring (names, photos, ages); (2) Run regular bias audits comparing score distributions across demographics; (3) Apply the EEOC's 4/5ths rule: if the selection rate for any group is less than 80% of the selection rate for the group with the highest rate, investigate and correct.
Should candidates know AI is screening their resume?#
Yes. Beyond legal requirements, transparency builds trust. Include a brief notice in your job posting or application confirmation: "We use AI-assisted tools to evaluate applications. All final hiring decisions are made by our recruiting team."
What's the ROI of an AI recruitment agent?#
Reported results vary by company and tool. Commonly cited figures include a 70-80% reduction in time-to-screen, a 50% reduction in cost-per-hire, and a 30% improvement in candidate quality (measured by 90-day retention), so treat them as rough guides and measure against your own baseline. The biggest win is often freeing recruiters to spend more time on candidate experience and relationship building, which can improve offer acceptance rates.