Topic Classification

Topic classification is the core of AudienceGPT. When you describe an audience segment in natural language, the platform runs it through a 7-layer classification engine that produces a structured taxonomy record. This record includes intent type, intensity, awareness stage, segment type, sensitivity flags, buyer journey position, and a composite score -- along with platform-ready segment names for major DSPs.
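As a rough picture of what such a record might contain, here is a minimal sketch. The class name, field names, example values, and the "Sedans" subcategory are all illustrative assumptions; this guide does not specify the actual schema.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class TaxonomyRecord:
    """Hypothetical shape of a classification result (not the real schema)."""
    topic: str
    intent_type: str        # Layer 1
    intensity: str          # Layer 2
    awareness_stage: str    # Layer 3
    segment_type: str       # Layer 4
    sensitive: bool         # Layer 5
    buyer_journey: str      # Layer 6
    composite_score: int    # Layer 7, 0 to 100
    dsp_names: dict[str, str] = field(default_factory=dict)


# Illustrative values only; a real classification may differ.
record = TaxonomyRecord(
    topic="Toyota Camry",
    intent_type="Product",
    intensity="Active",
    awareness_stage="Consideration",
    segment_type="B2C",
    sensitive=False,
    buyer_journey="Active Evaluation",
    composite_score=55,
    dsp_names={"Internal": "Automotive & Vehicles > Auto > Sedans > Toyota Camry"},
)
```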

This guide covers both classification modes, the full conversation flow, each of the 7 layers in detail, DSP segment naming, reclassification, and duplicate detection.

Classification Modes

AudienceGPT offers two classification modes. You can choose between them depending on your needs for accuracy, speed, and cost.

AI-Powered Classification

The primary classification mode uses Claude Sonnet 4.6 with structured outputs to guarantee valid JSON responses. This mode provides:

Web search verification -- The AI can perform up to 3 web searches per classification to verify brand identity, product details, and company information. This prevents misclassification of ambiguous topics (e.g., "Allbirds" correctly identified as a sustainable footwear brand rather than a bird-watching service).
Contextual understanding -- The AI interprets nuanced descriptions, industry jargon, and brand names that rule-based systems would miss.
Higher accuracy -- Especially for novel topics, niche brands, and ambiguous terms.

info

Web search is optional and can be disabled to reduce classification cost by approximately 50%. When disabled, the AI relies solely on its training data.

Rule-Based Classification

The local fallback mode uses deterministic regex-based classification that runs entirely on the client. This mode:

Produces results instantly with no API call
Costs nothing (no token usage)
Works offline or when the API is unavailable
Uses the same 7-layer output structure as AI classification

Rule-based classification is best for well-known categories where the topic name clearly indicates its taxonomy (e.g., "Toyota Camry" is unambiguously automotive). For ambiguous or novel topics, AI-powered mode is recommended.

tip

When importing large batches via CSV, you can choose rule-based mode to classify thousands of topics at no cost, then selectively reclassify ambiguous ones with AI afterward.

Conversation Flow

Classification follows a structured conversation flow with distinct phases. The chat interface guides you through each step.

Flow Phases

input → gathering → confirm → classifying → review → result

Phase	What Happens
Input	You enter a topic name or description in the chat
Gathering	The AI asks follow-up questions to collect context (keywords, category hints, segment type)
Confirm	The AI presents a summary of the topic details for your approval
Classifying	The 7-layer engine processes the topic (a few seconds for AI mode, instant for rule-based)
Review	Classification results are displayed for your review
Result	You can add the topic to your library, adjust details, or start a new classification

Gathering Phase Details

During the gathering phase, the AI may ask about:

Additional keywords -- Synonyms, related terms, or specific product/service names that strengthen the classification signal
Category context -- Industry vertical, whether the topic is B2B or B2C, the intended audience
Disambiguation -- If the topic name is ambiguous (e.g., "Mercury" could be automotive, technology, or financial services), the AI will ask for clarification

You can skip gathering by providing detailed context upfront. For example, instead of just typing "Salesforce", you could type "Salesforce CRM platform for enterprise sales teams" and the AI may proceed directly to confirmation.
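The phase sequence, including the shortcut past gathering, can be pictured as a small state machine. This is a sketch under assumptions: the `Phase` enum, `next_phase`, and the `skip_gathering` flag are hypothetical names, and the real platform decides for itself whether enough context was supplied to skip gathering.

```python
from enum import Enum


class Phase(Enum):
    INPUT = "input"
    GATHERING = "gathering"
    CONFIRM = "confirm"
    CLASSIFYING = "classifying"
    REVIEW = "review"
    RESULT = "result"


_ORDER = list(Phase)  # Enum members iterate in definition order


def next_phase(current, skip_gathering=False):
    """Return the phase after `current`; RESULT is treated as terminal."""
    if current is Phase.RESULT:
        return Phase.RESULT
    if current is Phase.INPUT and skip_gathering:
        return Phase.CONFIRM  # detailed context was given upfront
    return _ORDER[_ORDER.index(current) + 1]


def walk(skip_gathering=False):
    """List the phase names visited from input to result."""
    phase, seen = Phase.INPUT, [Phase.INPUT]
    while phase is not Phase.RESULT:
        phase = next_phase(phase, skip_gathering)
        seen.append(phase)
    return [p.value for p in seen]
```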

Web Search Verification

When AI-powered mode is active and web search is enabled, the AI can search the web to verify facts about your topic before classifying. This is particularly valuable for:

Brand identification -- Confirming what a company actually sells
Product categorization -- Distinguishing between similarly named products in different industries
Recency -- Catching recent pivots, acquisitions, or product launches

Web search results are used internally by the AI and do not appear directly in the chat. The classification result reflects the verified information. Citation markup from web search results is stripped before the response is returned.

The 7 Classification Layers

Each classified topic receives scores and labels across all 7 layers. Here is a detailed breakdown of each.

Layer 1: Intent Type

Identifies the nature of the audience's interest in the topic. The engine scores the topic name, category, and keywords against weighted regex patterns and returns a ranked list of intent types.

Intent Type	Description	Example
Brand	Researching a specific brand or product line	"Nike", "Salesforce"
Product	Interest in tangible products or hardware/software	"iPhone 16", "Ring Doorbell"
Service	Professional services, agencies, or providers	"Tax Preparation", "HVAC Repair"
Solution	Business problems solved by an offering	"CRM Platform", "Supply Chain Optimization"
Function	Technical concepts, frameworks, or capabilities	"Machine Learning", "API Integration"
Symptom	Problem recognition and pain points	"Slow Website Performance", "High Employee Turnover"
Side Effect	Secondary consequences and risks	"Data Breach Impact", "Medication Side Effects"
Event	Conferences, summits, or flagship events	"AWS re:Invent", "CES 2026"

Each topic receives a primary intent type (the strongest signal) and optionally a secondary intent type. Both are scored numerically.
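To make the weighted-pattern idea concrete, here is a minimal sketch of how such scoring might work. The patterns, weights, and function name are invented for illustration and cover only four of the eight intent types; the engine's real patterns are not documented here.

```python
import re

# Hypothetical patterns and weights, for illustration only.
INTENT_PATTERNS = {
    "symptom": [(re.compile(r"\b(slow|high|poor|failing)\b", re.I), 2.0)],
    "solution": [(re.compile(r"\b(platform|optimization|management)\b", re.I), 1.5)],
    "service": [(re.compile(r"\b(repair|preparation|consulting)\b", re.I), 1.5)],
    "event": [(re.compile(r"\b(summit|conference|expo)\b", re.I), 2.0)],
}


def rank_intents(name, category="", keywords=()):
    """Score name, category, and keywords; return (intent, score) pairs, best first."""
    text = " ".join([name, category, *keywords])
    scores = {}
    for intent, patterns in INTENT_PATTERNS.items():
        total = sum(weight * len(pattern.findall(text)) for pattern, weight in patterns)
        if total > 0:
            scores[intent] = total
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranked = rank_intents("Slow Website Performance", keywords=["site speed optimization"])
primary = ranked[0]                                # strongest signal
secondary = ranked[1] if len(ranked) > 1 else None  # optional runner-up
```

With these made-up weights, "Slow" contributes to the symptom score and "optimization" to the solution score, so the topic ends up with a primary and a secondary intent.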

Layer 2: Intensity

Measures how strong the behavioral signal is for this audience interest. Intensity is determined by keyword-based scoring against patterns associated with each level.

Level	Score	Description
Dormant	0	No active signals detected
Passive	15	Background-level interest, minimal engagement
Curious	30	Light research, casual browsing
Active	50	Regular engagement with topic-related content
Engaged	70	Sustained, repeated interaction over time
Urgent	85	Time-sensitive need or strong buying signals
Critical	95	Immediate action required, highest priority
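The level-to-score table maps naturally onto an enum. The sketch below pairs it with a toy keyword matcher; the scores come from the table above, while the patterns and the "highest matching level wins" rule are assumptions made for illustration.

```python
import re
from enum import IntEnum


class Intensity(IntEnum):
    DORMANT = 0
    PASSIVE = 15
    CURIOUS = 30
    ACTIVE = 50
    ENGAGED = 70
    URGENT = 85
    CRITICAL = 95


# Hypothetical keyword patterns for three of the levels.
LEVEL_PATTERNS = {
    Intensity.URGENT: re.compile(r"\b(buy now|deadline|urgent)\b", re.I),
    Intensity.ACTIVE: re.compile(r"\b(compare|review|pricing)\b", re.I),
    Intensity.CURIOUS: re.compile(r"\b(what is|guide|ideas)\b", re.I),
}


def detect_intensity(keywords):
    """Return the highest level whose pattern matches; DORMANT if none do."""
    text = " ".join(keywords)
    matched = [level for level, pattern in LEVEL_PATTERNS.items() if pattern.search(text)]
    return max(matched, default=Intensity.DORMANT)
```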

Layer 3: Awareness Stage

Maps the audience to one of 5 stages in the Eugene Schwartz awareness model, adapted for digital advertising. This tells you where the audience is in their journey from complete unawareness to post-purchase loyalty.

Stage	Funnel Position	Description
Unaware	Pre-Funnel	The audience does not know they have a need
Awareness	Top of Funnel (TOFU)	They recognize the problem or category exists
Consideration	Middle of Funnel (MOFU)	Actively comparing options and solutions
Decision	Bottom of Funnel (BOFU)	Ready to choose, evaluating specific providers
Retention	Post-Purchase	Existing customers, loyalty and upsell audiences

The awareness stage is derived from the intent type classification. For example, "symptom" intents typically map to the Awareness stage, while "brand" intents with purchase keywords map to Decision.
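The two examples above can be written as a small rule function. This sketch encodes only those two documented cases; the purchase-keyword pattern and the function name are assumptions, and it returns `None` for combinations this guide does not describe rather than guessing at them.

```python
import re

# Hypothetical purchase-intent keywords.
PURCHASE_KEYWORDS = re.compile(r"\b(buy|price|pricing|deal|order)\b", re.I)


def awareness_stage(intent_type, keywords=()):
    """Map an intent type to an awareness stage for the two documented cases."""
    if intent_type == "symptom":
        return "Awareness"
    if intent_type == "brand" and PURCHASE_KEYWORDS.search(" ".join(keywords)):
        return "Decision"
    return None  # other mappings are not specified in this guide
```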

Layer 4: Segment Type

Determines the business model context of the audience. This affects how the topic is categorized and which DSP configurations are appropriate.

Segment	Description
B2C	Business-to-Consumer -- targeting individual consumers
B2B	Business-to-Business -- targeting companies or professional buyers
B2B2C	Business-to-Business-to-Consumer -- intermediary model
B2E	Business-to-Employee -- targeting workforce audiences
B2G	Business-to-Government -- targeting government entities

Segment type is determined by taxonomy lookup first (each of the 41 parent categories has a default segment type), then refined by keyword analysis if needed.

warning

Segment type determination is strict by design. If the system cannot confidently assign a segment type, it flags the topic for administrator review rather than guessing.
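A sketch of the lookup-then-refine approach with a review flag might look like the following. The category defaults, keyword hints, and the rule that conflicting hints trigger review are all assumptions made for illustration, not the platform's actual tables or logic.

```python
import re

# Assumed defaults for a few parent categories (the real table has 41 entries).
CATEGORY_DEFAULTS = {
    "Auto": "B2C",
    "Business Technology": "B2B",
    "Government & Public Sector": "B2G",
}

# Hypothetical keyword hints used to refine the default.
KEYWORD_HINTS = [
    (re.compile(r"\b(enterprise|procurement|saas)\b", re.I), "B2B"),
    (re.compile(r"\b(employee|workforce)\b", re.I), "B2E"),
]


def resolve_segment_type(parent_category, keywords=()):
    """Return (segment_type, needs_review)."""
    segment = CATEGORY_DEFAULTS.get(parent_category)  # step 1: taxonomy lookup
    text = " ".join(keywords)
    hints = {seg for pattern, seg in KEYWORD_HINTS if pattern.search(text)}
    if len(hints) == 1:
        segment = next(iter(hints))   # step 2: keywords refine the default
    elif len(hints) > 1:
        segment = None                # conflicting signals: do not guess
    return segment, segment is None
```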

Layer 5: Sensitivity

Flags whether the topic falls under regulated or sensitive categories that require special handling in advertising platforms. Sensitive topics are subject to additional compliance rules on most DSPs.

Classification	Description	Examples
Standard	No special restrictions	"Toyota RAV4", "Kitchen Remodeling"
Sensitive	Regulated category, may have platform restrictions	Cannabis, Gambling, Alcohol, Pharmaceutical products

Sensitivity is detected from the parent category assignment. The following parent categories are flagged as sensitive; the first three are always flagged, while Health & Wellness is flagged depending on context:

Cannabis -- Dispensaries, CBD, cultivation
Gambling & Casino -- Online betting, sportsbooks, casinos
Alcohol & Spirits -- Beer, wine, spirits (age-gated)
Health & Wellness -- Pharmaceutical products, medical devices (context-dependent)
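Category-based flagging amounts to a set lookup plus one conditional case. In this sketch the category names come from the list above, but the way Health & Wellness context is evaluated (a keyword pattern) is purely an assumption; the guide does not say how that context is determined.

```python
import re

ALWAYS_SENSITIVE = {"Cannabis", "Gambling & Casino", "Alcohol & Spirits"}
CONTEXT_DEPENDENT = {"Health & Wellness"}

# Hypothetical triggers for the context-dependent case.
HEALTH_TRIGGERS = re.compile(r"\b(pharmaceutical|prescription|medical device)s?\b", re.I)


def sensitivity(parent_category, keywords=()):
    """Return "Sensitive" or "Standard" for a topic's parent category."""
    if parent_category in ALWAYS_SENSITIVE:
        return "Sensitive"
    if parent_category in CONTEXT_DEPENDENT and HEALTH_TRIGGERS.search(" ".join(keywords)):
        return "Sensitive"
    return "Standard"
```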

Layer 6: Buyer Journey

Evaluates the purchase readiness of the audience, providing a more granular view than the awareness stage alone.

Stage	Description	Score Range
Purchase Ready	Showing clear buying signals, ready to convert	70--100
Active Evaluation	Comparing specific products/vendors, requesting demos	40--69
Research Discovery	Early-stage research, gathering information	0--39

Each buyer journey stage includes a funnel position label and a descriptive action statement (e.g., "Prioritize for retargeting" for Purchase Ready).
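The score ranges in the table translate directly into threshold checks. The function name below is hypothetical, and how the underlying score itself is computed is not covered by this sketch.

```python
def buyer_journey_stage(score):
    """Map a 0-100 buyer journey score to its stage, per the ranges above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 70:
        return "Purchase Ready"
    if score >= 40:
        return "Active Evaluation"
    return "Research Discovery"
```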

Layer 7: Composite Score

A single 0--100 score that synthesizes signals from all other layers into one actionable number. The composite score drives the interpretation label:

Score	Label	Recommended Action
80--100	Hot Lead	Prioritize for direct response and retargeting campaigns
60--79	Warm Prospect	Nurture with consideration-stage content and offers
40--59	Active Researcher	Engage with educational content and comparisons
20--39	Early Explorer	Build awareness with top-of-funnel content
0--19	Cold Audience	Long-term brand awareness, broad reach campaigns
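The label lookup can be expressed as a table-driven function. This sketch covers only the score-to-label step using the bands above; it makes no assumption about how the layers are weighted to produce the score, and the function name is hypothetical.

```python
from bisect import bisect_right

# Lower bounds of each band above "Cold Audience".
THRESHOLDS = [20, 40, 60, 80]
LABELS = [
    "Cold Audience",
    "Early Explorer",
    "Active Researcher",
    "Warm Prospect",
    "Hot Lead",
]


def interpret_composite(score):
    """Return the interpretation label for a 0-100 composite score."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    # bisect_right counts how many thresholds the score has reached.
    return LABELS[bisect_right(THRESHOLDS, score)]
```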

DSP Segment Names

For each classified topic, AudienceGPT generates platform-ready segment names formatted for major DSPs. These names follow the hierarchical naming conventions required by each platform.

Platform Formats

AudienceGPT generates names for three built-in platform formats, plus any custom output templates configured by your administrator:

Platform	Format Pattern	Character Limit
Trade Desk (Koa)	`Taxonomy Type > Parent Category > Subcategory > Topic Name`	Description: 256 chars
LiveRamp	`Taxonomy Type > Parent Category > Subcategory > Topic Name`	Description: 256 chars
Internal	`Taxonomy Type > Parent Category > Subcategory > Topic Name`	No limit

Each platform name is generated from the same classification data but formatted according to that platform's conventions. The names include the full taxonomy path and may include additional context like segment type, intensity, or user behavior labels.
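As an illustration of the shared format pattern, the sketch below joins the four taxonomy levels with " > " and applies the description limits from the table. The function names and the "Sedans" subcategory are placeholders, and truncating an over-length description is an assumption; the platform may handle that case differently.

```python
# Description limits from the table above; None means no limit.
DESCRIPTION_LIMITS = {"Trade Desk (Koa)": 256, "LiveRamp": 256, "Internal": None}


def segment_name(taxonomy_type, parent_category, subcategory, topic):
    """Build the hierarchical name: Type > Parent > Subcategory > Topic."""
    return " > ".join([taxonomy_type, parent_category, subcategory, topic])


def fit_description(platform, description):
    """Trim a description to the platform's limit (assumed behavior)."""
    limit = DESCRIPTION_LIMITS[platform]
    return description if limit is None else description[:limit]


name = segment_name("Automotive & Vehicles", "Auto", "Sedans", "Toyota Camry")
```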

DSP Names Tab

In the Library's topic detail panel, the DSP Names tab shows all generated platform names for a topic. These are the exact strings that will be used when activating segments through platform connections.

tip

If your organization uses custom output templates (configured by an administrator), additional platform name formats will appear alongside the built-in ones. Output templates support {{field}} placeholders for dynamic content.
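Placeholder substitution of this kind can be sketched with a single regular expression. The field names in the example, the tolerance for spaces inside the braces, and the choice to raise an error on unknown fields are all assumptions; check your administrator's template settings for the actual supported fields.

```python
import re

# Matches {{field}} with optional inner whitespace (assumed syntax detail).
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_template(template, fields):
    """Replace each {{field}} placeholder with its value from `fields`."""
    def substitute(match):
        key = match.group(1)
        if key not in fields:
            raise KeyError(f"template references unknown field: {key}")
        return str(fields[key])

    return PLACEHOLDER.sub(substitute, template)


rendered = render_template(
    "{{segment_type}} | {{topic}}",
    {"segment_type": "B2C", "topic": "Toyota Camry"},  # hypothetical field names
)
```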

Taxonomy Hierarchy

Every classified topic is placed within a 4-level taxonomy hierarchy:

Taxonomy Type (13 groups)
  └── Parent Category (41 types)
      └── Subcategory (tree nodes with L0/L1/L2 levels)
          └── Topic

The 13 Taxonomy Types

Taxonomy Type	Parent Categories
Automotive & Vehicles	Auto, Recreational Vehicles
Home & Property	Real Estate, Home & Garden / Home Improvement, Home Services
Financial & Legal	Financial Services, Insurance, Legal Services
Technology & Telecom	Business Technology, Technographics, Telecommunications, Consumer Electronics, Consumer Technology
Consumer Goods & Retail	Consumer Goods, Food & Beverage, Apparel & Accessories, Beauty & Personal Care, Babies & Children, Pets & Animals
Health	Health & Wellness
Education	Education
Travel & Hospitality	Travel & Leisure
Entertainment & Media	Entertainment, Sports & Fitness, Video Gaming, Gambling & Casino, News & Media
Lifestyle & Special Interest	Alcohol & Spirits, Cannabis, Gifting & Occasions, Sustainability & Green Living, Luxury & Premium
Civic & Cause	Charities & Nonprofits, Politics, Spiritual & Religion
B2B & Industrial	Business & Professional Services, Agriculture & Farming, Transportation & Logistics, Energy & Utilities, Government & Public Sector
Cross-Cutting	Life-Stage (Inferred)

Each of the 41 parent categories has associated metadata including an IAB content taxonomy code, audience type, domain signals, and example topics that aid classification.

Reclassifying Topics

Topics can be reclassified when the classification engine is updated or when you want to re-evaluate a topic with a different mode.

When to Reclassify

Engine version update -- When AudienceGPT releases an engine update that changes classification logic, existing topics are marked as "outdated." You can reclassify them to get results from the current engine.
Mode switch -- A topic originally classified with rule-based mode can be reclassified with AI-powered mode for potentially better accuracy.
Context changes -- If a brand pivots its business model or a product category evolves, reclassification captures the updated reality.

How to Reclassify

Single topic: Open the topic in the Library detail panel. If the engine version is outdated, a reclassify banner appears. Click "Reclassify" and choose your mode:

AI-Powered -- Uses Claude Sonnet 4.6 with optional web search. Higher accuracy, consumes API credits.
Rule-Based -- Instant local classification. No cost, but may be less accurate for ambiguous topics.

Bulk reclassify: Select multiple topics in the Library table using checkboxes, then click "Reclassify Selected" in the bulk action bar. Choose your mode in the modal. Limits:

Rule-based: up to 500 topics per batch
AI-powered: up to 100 topics per batch

info

AI-powered reclassification is quota-checked and usage-tracked. The reclassify modal shows an estimated cost before you confirm.
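If you need to reclassify more topics than a single batch allows, the selection has to be split. A minimal sketch of that chunking, using the limits above, follows; the mode keys and function name are hypothetical, and this does not model quota checks or cost estimation.

```python
# Per-batch limits from the list above.
BATCH_LIMITS = {"rule_based": 500, "ai_powered": 100}


def split_into_batches(topic_ids, mode):
    """Split topic IDs into consecutive batches no larger than the mode's limit."""
    limit = BATCH_LIMITS[mode]
    return [topic_ids[i:i + limit] for i in range(0, len(topic_ids), limit)]


topic_ids = list(range(250))
ai_batches = split_into_batches(topic_ids, "ai_powered")
rule_batches = split_into_batches(topic_ids, "rule_based")
```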

Engine Versioning

Every topic is stamped with the engine version at classification time. When classification logic changes (keyword patterns, layer functions, taxonomy definitions, or the AI prompt), the engine version is incremented. Topics classified with older versions are flagged as "outdated" in the Library.

You can filter the Library by engine version status (Current or Outdated) to quickly find topics that need reclassification.
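Conceptually, the outdated check compares each topic's stamped version with the current one. The sketch below assumes integer versions, a hypothetical `engine_version` field, and invented sample data; the actual version format is not described in this guide.

```python
CURRENT_ENGINE_VERSION = 3  # hypothetical value

# Invented sample library entries.
library = [
    {"name": "Toyota Camry", "engine_version": 3},
    {"name": "Salesforce", "engine_version": 2},
]


def engine_status(topic):
    """Return "Current" or "Outdated" based on the stamped engine version."""
    if topic["engine_version"] == CURRENT_ENGINE_VERSION:
        return "Current"
    return "Outdated"


outdated = [t["name"] for t in library if engine_status(t) == "Outdated"]
```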

Duplicate Detection

AudienceGPT uses a dual-layer duplicate detection system to prevent redundant topics in your library:

Semantic similarity -- Each topic is converted to a 256-dimensional embedding vector. Topics with a cosine similarity above 95% are blocked as duplicates. Topics between 75% and 95% similarity trigger a warning with the option to proceed.
Brand alias matching -- A dictionary of known brand aliases catches deterministic duplicates that embeddings might miss (e.g., "Chevy" and "Chevrolet").

When a potential duplicate is detected during classification, you will see a warning with the matching topic's name and similarity score. For matches in the warning range (75% to 95%), you can choose to proceed (creating a near-duplicate) or cancel and use the existing topic instead.
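The two checks can be sketched as follows. The thresholds come from the description above; the function names, the single-entry alias dictionary, treating an alias match as a block, and the handling of a similarity of exactly 95% (treated here as a warning) are assumptions. The example uses tiny 2-dimensional vectors in place of real 256-dimensional embeddings.

```python
import math

BLOCK_THRESHOLD = 0.95  # above this: blocked as a duplicate
WARN_THRESHOLD = 0.75   # from here up to the block threshold: warning

# Hypothetical alias dictionary mapping aliases to canonical brand names.
BRAND_ALIASES = {"chevy": "chevrolet"}


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def canonical_name(name):
    """Lowercase the name and resolve known brand aliases."""
    key = name.strip().lower()
    return BRAND_ALIASES.get(key, key)


def check_duplicate(name, vector, existing_name, existing_vector):
    """Return "block", "warn", or "ok" for a candidate against one existing topic."""
    if canonical_name(name) == canonical_name(existing_name):
        return "block"  # alias match (assumed to block)
    similarity = cosine_similarity(vector, existing_vector)
    if similarity > BLOCK_THRESHOLD:
        return "block"
    if similarity >= WARN_THRESHOLD:
        return "warn"
    return "ok"
```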

tip

Duplicate detection also runs during CSV imports and sync operations, automatically enriching existing topics with new metadata rather than creating redundant entries.

Next Steps

Library Management -- Learn how to browse, filter, and manage your classified topics
Campaign Brief Analysis -- Upload campaign briefs for AI-recommended topics
Matrix Generation -- Create combinatorial taxonomies at scale
CSV Import -- Bulk import and classify topics from spreadsheets
