Google Analytics 4 holds the behavioral data of every user on your website — but most teams only review it reactively, when something has already gone wrong. AI agents connected to GA4 change this dynamic entirely: instead of scheduled dashboard reviews, you get proactive anomaly detection, automated reporting, and natural language answers to business questions directly from your live analytics data.
For marketing teams, product managers, and growth engineers responsible for traffic and conversion metrics, Google Analytics AI integration makes data-driven decision-making faster and more accessible across the entire organization.
What AI Agents Can Do With Google Analytics Access#
Traffic Intelligence
- Generate daily and weekly traffic summaries with automatic period-over-period comparisons
- Detect sudden drops in sessions, users, or conversions before stakeholders notice
- Identify which pages gained or lost the most traffic following a content or code change
- Surface the top acquisition channels driving qualified traffic in the current period
Conversion and Funnel Analysis
- Map drop-off points in checkout, signup, or any multi-step conversion sequence
- Compare conversion rates across traffic sources, devices, and landing pages
- Alert when a key conversion event stops firing — catching broken funnels immediately
- Identify pages with high traffic but low conversion rates for optimization targeting
Automated Reporting
- Send Monday morning traffic digests to Slack without manual report creation
- Generate executive summaries comparing this month to the prior quarter
- Track campaign performance automatically as new UTM-tagged traffic arrives
- Summarize geographic or device-based traffic shifts in plain language
Setting Up Google Analytics Data API Access#
pip install google-analytics-data langchain langchain-openai python-dotenv
Enable the API and Authenticate#
- Go to Google Cloud Console → APIs & Services → Enable APIs
- Search "Google Analytics Data API" and enable it
- Go to IAM & Admin → Service Accounts → Create service account → download the JSON key
- In Google Analytics → Admin → Property Access Management → add the service account email as Viewer
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account-key.json"
export GA4_PROPERTY_ID="123456789" # Numbers only — found in GA4 Admin → Property Details
Test your connection:
from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.analytics.data_v1beta.types import DateRange, Dimension, Metric, RunReportRequest
import os
PROPERTY_ID = os.getenv("GA4_PROPERTY_ID")
client = BetaAnalyticsDataClient()
request = RunReportRequest(
property=f"properties/{PROPERTY_ID}",
dimensions=[Dimension(name="date")],
metrics=[Metric(name="sessions")],
date_ranges=[DateRange(start_date="7daysAgo", end_date="yesterday")]
)
response = client.run_report(request)
print(f"Connected — {response.row_count} days of data returned")
Option 1: No-Code with n8n#
Automated Weekly Analytics Report#
- Schedule Trigger: Monday 8am
- Google Analytics node (n8n built-in): Fetch sessions, active users, and conversions for the past 7 days vs. the prior 7 days
- Code node: Calculate week-over-week percentage changes for each metric
- OpenAI: "Write a 5-bullet weekly website performance summary. Highlight significant changes. Suggest one action based on the data."
- Slack: Post to the #marketing-metrics channel
n8n's Google Analytics node handles GA4 OAuth authentication and dimension/metric queries with simple field mapping — no custom code needed for most reporting workflows.
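The week-over-week calculation in the Code node step can be sketched in plain Python (the function name and sample figures below are illustrative, not part of the n8n workflow):

```python
def week_over_week(current: dict, prior: dict) -> dict:
    """Percentage change for each metric, comparing this period to the prior one."""
    changes = {}
    for metric, value in current.items():
        prev = prior.get(metric)
        # Guard against a missing or zero baseline before dividing
        changes[metric] = round((value - prev) / prev * 100, 1) if prev else None
    return changes

this_week = {"sessions": 4600, "activeUsers": 3100, "conversions": 92}
last_week = {"sessions": 4000, "activeUsers": 3000, "conversions": 80}
print(week_over_week(this_week, last_week))
# {'sessions': 15.0, 'activeUsers': 3.3, 'conversions': 15.0}
```

The same shape works for any metric set the Google Analytics node returns, since the function only compares keys present in the current period.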
Option 2: LangChain with Python#
Build Google Analytics Tools#
import os
from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.analytics.data_v1beta.types import (
DateRange, Dimension, Metric, RunReportRequest, RunRealtimeReportRequest
)
from langchain.tools import tool
from dotenv import load_dotenv
load_dotenv()
PROPERTY_ID = os.getenv("GA4_PROPERTY_ID")
client = BetaAnalyticsDataClient()
def run_ga_report(dimensions: list, metrics: list,
start_date: str = "7daysAgo",
end_date: str = "yesterday",
limit: int = 100) -> list:
"""Run a GA4 Data API report and return rows as list of dicts."""
request = RunReportRequest(
property=f"properties/{PROPERTY_ID}",
dimensions=[Dimension(name=d) for d in dimensions],
metrics=[Metric(name=m) for m in metrics],
date_ranges=[DateRange(start_date=start_date, end_date=end_date)],
limit=limit
)
response = client.run_report(request)
rows = []
for row in response.rows:
row_dict = {}
for i, dim in enumerate(dimensions):
row_dict[dim] = row.dimension_values[i].value
for i, met in enumerate(metrics):
row_dict[met] = row.metric_values[i].value
rows.append(row_dict)
return rows
@tool
def get_traffic_summary(days: int = 7) -> str:
"""Get a traffic summary for the past N days including sessions, users, pageviews, and bounce rate."""
rows = run_ga_report(
dimensions=["date"],
metrics=["sessions", "activeUsers", "screenPageViews", "bounceRate"],
start_date=f"{days}daysAgo"
)
if not rows:
return "No traffic data available"
total_sessions = sum(int(r["sessions"]) for r in rows)
total_users = sum(int(r["activeUsers"]) for r in rows)
total_pageviews = sum(int(r["screenPageViews"]) for r in rows)
avg_bounce = sum(float(r["bounceRate"]) for r in rows) / len(rows) if rows else 0
return (f"Traffic summary (last {days} days):\n"
f"Sessions: {total_sessions:,}\n"
f"Active Users: {total_users:,}\n"
f"Pageviews: {total_pageviews:,}\n"
f"Avg Bounce Rate: {avg_bounce:.1%}")
@tool
def get_top_pages(days: int = 7, limit: int = 10) -> str:
"""Get top pages by pageviews for the past N days."""
rows = run_ga_report(
dimensions=["pagePath", "pageTitle"],
metrics=["screenPageViews", "activeUsers", "averageSessionDuration"],
start_date=f"{days}daysAgo",
limit=limit
)
if not rows:
return "No page data available"
lines = [f"Top {limit} pages (last {days} days):"]
for i, row in enumerate(rows, 1):
views = int(row["screenPageViews"])
users = int(row["activeUsers"])
path = row["pagePath"][:60]
lines.append(f" {i}. {path} | {views:,} views | {users:,} users")
return "\n".join(lines)
@tool
def get_traffic_by_channel(days: int = 30) -> str:
"""Get sessions broken down by marketing channel (Organic Search, Direct, Referral, Paid, etc.)."""
rows = run_ga_report(
dimensions=["sessionDefaultChannelGroup"],
metrics=["sessions", "activeUsers", "conversions"],
start_date=f"{days}daysAgo"
)
if not rows:
return "No channel data available"
total_sessions = sum(int(r["sessions"]) for r in rows)
lines = [f"Traffic by channel (last {days} days):"]
for row in sorted(rows, key=lambda x: int(x["sessions"]), reverse=True):
sessions = int(row["sessions"])
pct = sessions / total_sessions * 100 if total_sessions else 0
conversions = int(row["conversions"])
channel = row["sessionDefaultChannelGroup"]
lines.append(f" {channel}: {sessions:,} sessions ({pct:.1f}%) | {conversions:,} conversions")
return "\n".join(lines)
@tool
def detect_traffic_anomaly(threshold_pct: float = 20.0) -> str:
"""
Compare yesterday's traffic to the prior 7-day average to detect anomalies.
threshold_pct: percentage deviation to flag as anomalous (default 20%).
"""
yesterday_rows = run_ga_report(
dimensions=["date"],
metrics=["sessions", "activeUsers"],
start_date="yesterday", end_date="yesterday"
)
baseline_rows = run_ga_report(
dimensions=["date"],
metrics=["sessions", "activeUsers"],
start_date="8daysAgo", end_date="2daysAgo"
)
if not yesterday_rows or not baseline_rows:
return "Insufficient data for anomaly detection"
yesterday_sessions = int(yesterday_rows[0]["sessions"])
baseline_avg = sum(int(r["sessions"]) for r in baseline_rows) / len(baseline_rows)
deviation = ((yesterday_sessions - baseline_avg) / baseline_avg * 100) if baseline_avg else 0
status = "ANOMALY DETECTED" if abs(deviation) > threshold_pct else "NORMAL"
return (f"Traffic anomaly check:\n"
f"Yesterday: {yesterday_sessions:,} sessions\n"
f"7-day baseline avg: {baseline_avg:,.0f} sessions\n"
f"Deviation: {deviation:+.1f}%\n"
f"Status: {status}")
@tool
def get_realtime_users() -> str:
"""Get the number of users active on the site in the last 30 minutes by page."""
request = RunRealtimeReportRequest(
property=f"properties/{PROPERTY_ID}",
dimensions=[Dimension(name="unifiedScreenName")],
metrics=[Metric(name="activeUsers")],
limit=10
)
response = client.run_realtime_report(request)
total = sum(int(row.metric_values[0].value) for row in response.rows)
lines = [f"Active users right now: {total}"]
for row in response.rows[:5]:
page = row.dimension_values[0].value
users = int(row.metric_values[0].value)
lines.append(f" {page}: {users} users")
return "\n".join(lines)
Google Analytics Agent#
from langchain_openai import ChatOpenAI
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate
llm = ChatOpenAI(model="gpt-4o", temperature=0)
tools = [get_traffic_summary, get_top_pages, get_traffic_by_channel,
detect_traffic_anomaly, get_realtime_users]
prompt = ChatPromptTemplate.from_messages([
("system", f"""You are a web analytics assistant with access to Google Analytics 4 (Property: {PROPERTY_ID}).
When answering analytics questions:
1. Always specify the time period in your answer
2. Compare metrics to prior periods when possible to show trends
3. Translate raw numbers into business insights — not just "sessions increased 15%" but what that means
4. Flag any anomalies or unexpected patterns proactively
5. Suggest one actionable next step based on the data
Common GA4 dimensions: date, pagePath, sessionDefaultChannelGroup, deviceCategory, country
Common GA4 metrics: sessions, activeUsers, screenPageViews, bounceRate, conversions"""),
("human", "{input}"),
("placeholder", "{agent_scratchpad}"),
])
agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True, max_iterations=6)
# Morning analytics brief
result = executor.invoke({
"input": "Run my morning analytics check: yesterday's traffic vs the prior week, any anomalies, and our top 5 pages."
})
print(result["output"])
Rate Limits and Best Practices#
| GA4 Data API limit | Value |
|---|---|
| Tokens per property per day (standard) | 200,000 |
| Concurrent requests | 10 |
| Max rows per response | 250,000 |
| Realtime report window | Last 30 minutes |
Best practices:
- Cache daily summaries: Store yesterday's traffic summary once so the agent doesn't re-query on every conversation turn
- Use NdaysAgo notation: Simpler and more reliable than calculating exact ISO dates — 7daysAgo, yesterday, today
- Limit dimensions per query: Each additional dimension increases token cost — request only what you need
- Handle sampling warnings: Check response.metadata.sampling_metadatas and note sampled data in agent output for transparency
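The caching practice above takes only a few lines. This sketch stores the summary in a temp file; the path, TTL, and helper name are arbitrary choices, not part of the GA4 API:

```python
import json
import tempfile
import time
from pathlib import Path

CACHE_PATH = Path(tempfile.gettempdir()) / "ga4_summary_cache.json"
CACHE_TTL_SECONDS = 6 * 3600  # re-query the API at most every six hours

def cached_traffic_summary(fetch, days: int = 7) -> str:
    """Return a cached summary when fresh; otherwise call fetch(days) and store it."""
    if CACHE_PATH.exists():
        cached = json.loads(CACHE_PATH.read_text())
        if cached["days"] == days and time.time() - cached["ts"] < CACHE_TTL_SECONDS:
            return cached["summary"]
    summary = fetch(days)
    CACHE_PATH.write_text(json.dumps({"ts": time.time(), "days": days, "summary": summary}))
    return summary

# In the agent, wrap the tool call, e.g.:
# cached_traffic_summary(lambda d: get_traffic_summary.invoke({"days": d}))
```

Because the cache key includes the day count, a 7-day and a 30-day summary never collide; anything fancier (per-property keys, Redis) can follow the same pattern.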
Next Steps#
- AI Agents BigQuery Integration — GA4 exports natively to BigQuery for deeper SQL-based analysis
- AI Agents Slack Integration — Send GA4 anomaly alerts and weekly digests to Slack automatically
- AI Agents Gmail Integration — Email daily analytics digests to stakeholders
- Build an AI Agent with LangChain — Complete agent framework tutorial