Topic Classification
Topic classification is the core of AudienceGPT. When you describe an audience segment in natural language, the platform runs it through a 7-layer classification engine that produces a structured taxonomy record. This record includes intent type, intensity, awareness stage, segment type, sensitivity flags, buyer journey position, and a composite score -- along with platform-ready segment names for major DSPs.
This guide covers both classification modes, the full conversation flow, each of the 7 layers in detail, DSP segment naming, reclassification, and duplicate detection.
Classification Modes
AudienceGPT offers two classification modes. You can choose between them depending on your needs for accuracy, speed, and cost.
AI-Powered Classification
The primary classification mode uses Claude Sonnet 4.6 with structured outputs to guarantee valid JSON responses. This mode provides:
- Web search verification -- The AI can perform up to 3 web searches per classification to verify brand identity, product details, and company information. This prevents misclassification of ambiguous topics (e.g., "Allbirds" correctly identified as a sustainable footwear brand rather than a bird-watching service).
- Contextual understanding -- The AI interprets nuanced descriptions, industry jargon, and brand names that rule-based systems would miss.
- Higher accuracy -- Especially for novel topics, niche brands, and ambiguous terms.
Web search is optional and can be disabled to reduce classification cost by approximately 50%. When disabled, the AI relies solely on its training data.
Rule-Based Classification
The local fallback mode uses deterministic regex-based classification that runs entirely on the client. This mode:
- Produces results instantly with no API call
- Costs nothing (no token usage)
- Works offline or when the API is unavailable
- Uses the same 7-layer output structure as AI classification
Rule-based classification is best for well-known categories where the topic name clearly indicates its taxonomy (e.g., "Toyota Camry" is unambiguously automotive). For ambiguous or novel topics, AI-powered mode is recommended.
When importing large batches via CSV, you can choose rule-based mode to classify thousands of topics at no cost, then selectively reclassify ambiguous ones with AI afterward.
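The batch workflow above can be sketched as a two-pass split: classify everything with rules, then queue only the ambiguous topics for AI. This is a minimal illustration, not the platform's actual rule set. The patterns, `classify_by_rules`, and `split_batch` are hypothetical, and treating "zero or multiple matches" as ambiguous is an assumption.

```python
import re

# Hypothetical patterns for illustration; the real rule set is not documented here.
CATEGORY_PATTERNS = {
    "Auto": re.compile(r"\b(toyota|camry|honda|sedan|suv)\b", re.IGNORECASE),
    "Financial Services": re.compile(r"\b(bank|mortgage|credit card|loan)\b", re.IGNORECASE),
}


def classify_by_rules(topic: str) -> list:
    """Return every parent category whose pattern matches the topic."""
    return [name for name, pattern in CATEGORY_PATTERNS.items() if pattern.search(topic)]


def split_batch(topics: list) -> tuple:
    """Keep topics with exactly one match; queue the rest for AI reclassification."""
    classified, needs_ai = {}, []
    for topic in topics:
        matches = classify_by_rules(topic)
        if len(matches) == 1:
            classified[topic] = matches[0]
        else:
            # Assumption: no match or several matches both count as ambiguous.
            needs_ai.append(topic)
    return classified, needs_ai


classified, needs_ai = split_batch(["Toyota Camry", "Mercury", "Honda auto loan"])
```

Here "Toyota Camry" is resolved locally, while "Mercury" (no match) and "Honda auto loan" (two matches) would be sent to AI-powered mode.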
Conversation Flow
Classification follows a structured conversation flow with distinct phases. The chat interface guides you through each step.
Flow Phases
input → gathering → confirm → classifying → review → result
| Phase | What Happens |
|---|---|
| Input | You enter a topic name or description in the chat |
| Gathering | The AI asks follow-up questions to collect context (keywords, category hints, segment type) |
| Confirm | The AI presents a summary of the topic details for your approval |
| Classifying | The 7-layer engine processes the topic (a few seconds for AI mode, instant for rule-based) |
| Review | Classification results are displayed for your review |
| Result | You can add the topic to your library, adjust details, or start a new classification |
Gathering Phase Details
During the gathering phase, the AI may ask about:
- Additional keywords -- Synonyms, related terms, or specific product/service names that strengthen the classification signal
- Category context -- Industry vertical, whether the topic is B2B or B2C, the intended audience
- Disambiguation -- If the topic name is ambiguous (e.g., "Mercury" could be automotive, technology, or financial services), the AI will ask for clarification
You can skip gathering by providing detailed context upfront. For example, instead of just typing "Salesforce", you could type "Salesforce CRM platform for enterprise sales teams" and the AI may proceed directly to confirmation.
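One way to picture the flow is as a small state machine. The sketch below uses the six documented phase names; the transition table is an assumption inferred from the text (including the input-to-confirm shortcut when gathering is skipped and the return to input for a new classification), and `advance` is a hypothetical helper.

```python
from enum import Enum


class Phase(Enum):
    INPUT = "input"
    GATHERING = "gathering"
    CONFIRM = "confirm"
    CLASSIFYING = "classifying"
    REVIEW = "review"
    RESULT = "result"


# Assumed transitions. INPUT -> CONFIRM models skipping the gathering phase;
# RESULT -> INPUT models starting a new classification.
TRANSITIONS = {
    Phase.INPUT: {Phase.GATHERING, Phase.CONFIRM},
    Phase.GATHERING: {Phase.CONFIRM},
    Phase.CONFIRM: {Phase.CLASSIFYING},
    Phase.CLASSIFYING: {Phase.REVIEW},
    Phase.REVIEW: {Phase.RESULT},
    Phase.RESULT: {Phase.INPUT},
}


def advance(current: Phase, target: Phase) -> Phase:
    """Move to the target phase, rejecting transitions the table does not allow."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```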
Web Search Verification
When AI-powered mode is active and web search is enabled, the AI can search the web to verify facts about your topic before classifying. This is particularly valuable for:
- Brand identification -- Confirming what a company actually sells
- Product categorization -- Distinguishing between similarly named products in different industries
- Recency -- Catching recent pivots, acquisitions, or product launches
Web search results are used internally by the AI and do not appear directly in the chat. The classification result reflects the verified information. Citation markup from web search results is stripped before the response is returned.
The 7 Classification Layers
Each classified topic receives scores and labels across all 7 layers. Here is a detailed breakdown of each.
Layer 1: Intent Type
Identifies the nature of the audience's interest in the topic. The engine scores the topic name, category, and keywords against weighted regex patterns and returns a ranked list of intent types.
| Intent Type | Description | Example |
|---|---|---|
| Brand | Researching a specific brand or product line | "Nike", "Salesforce" |
| Product | Interest in tangible products or hardware/software | "iPhone 16", "Ring Doorbell" |
| Service | Professional services, agencies, or providers | "Tax Preparation", "HVAC Repair" |
| Solution | Business problems solved by an offering | "CRM Platform", "Supply Chain Optimization" |
| Function | Technical concepts, frameworks, or capabilities | "Machine Learning", "API Integration" |
| Symptom | Problem recognition and pain points | "Slow Website Performance", "High Employee Turnover" |
| Side Effect | Secondary consequences and risks | "Data Breach Impact", "Medication Side Effects" |
| Event | Conferences, summits, or flagship events | "AWS re:Invent", "CES 2026" |
Each topic receives a primary intent type (the strongest signal) and optionally a secondary intent type. Both are scored numerically.
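A rough sketch of weighted regex scoring with a primary and secondary result might look like the following. The patterns and weights are invented for illustration, and `rank_intents` is a hypothetical name; the real engine's patterns are not documented here.

```python
import re

# Hypothetical (pattern, weight) pairs per intent type, for illustration only.
INTENT_PATTERNS = {
    "symptom": [(re.compile(r"\b(?:slow|high|poor|failing)\b", re.IGNORECASE), 3.0)],
    "solution": [(re.compile(r"\b(?:platform|optimization|management)\b", re.IGNORECASE), 2.0)],
    "event": [(re.compile(r"\b(?:conference|summit|expo)\b", re.IGNORECASE), 3.0)],
    "function": [(re.compile(r"\b(?:performance|integration|learning)\b", re.IGNORECASE), 1.0)],
}


def rank_intents(name: str, category: str = "", keywords: tuple = ()) -> list:
    """Score name, category, and keywords against each intent; return ranked (intent, score) pairs."""
    text = " ".join([name, category, *keywords])
    scores = {}
    for intent, rules in INTENT_PATTERNS.items():
        total = sum(weight * len(pattern.findall(text)) for pattern, weight in rules)
        if total > 0:
            scores[intent] = total
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranked = rank_intents("Slow Website Performance")
primary = ranked[0] if ranked else None
secondary = ranked[1] if len(ranked) > 1 else None
```

With these made-up weights, "Slow Website Performance" ranks symptom first and function second, which mirrors the primary and secondary intent described above.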
Layer 2: Intensity
Measures how strong the behavioral signal is for this audience interest. Intensity is determined by keyword-based scoring against patterns associated with each level.
| Level | Score | Description |
|---|---|---|
| Dormant | 0 | No active signals detected |
| Passive | 15 | Background-level interest, minimal engagement |
| Curious | 30 | Light research, casual browsing |
| Active | 50 | Regular engagement with topic-related content |
| Engaged | 70 | Sustained, repeated interaction over time |
| Urgent | 85 | Time-sensitive need or strong buying signals |
| Critical | 95 | Immediate action required, highest priority |
Layer 3: Awareness Stage
Maps the audience to one of 5 stages adapted from Eugene Schwartz's awareness model for digital advertising. This tells you where the audience is in their journey from complete unawareness to post-purchase loyalty.
| Stage | Funnel Position | Description |
|---|---|---|
| Unaware | Pre-Funnel | The audience does not know they have a need |
| Awareness | Top of Funnel (TOFU) | They recognize the problem or category exists |
| Consideration | Middle of Funnel (MOFU) | Actively comparing options and solutions |
| Decision | Bottom of Funnel (BOFU) | Ready to choose, evaluating specific providers |
| Retention | Post-Purchase | Existing customers, loyalty and upsell audiences |
The awareness stage is derived from the intent type classification. For example, "symptom" intents typically map to the Awareness stage, while "brand" intents with purchase keywords map to Decision.
Layer 4: Segment Type
Determines the business model context of the audience. This affects how the topic is categorized and which DSP configurations are appropriate.
| Segment | Description |
|---|---|
| B2C | Business-to-Consumer -- targeting individual consumers |
| B2B | Business-to-Business -- targeting companies or professional buyers |
| B2B2C | Business-to-Business-to-Consumer -- intermediary model |
| B2E | Business-to-Employee -- targeting workforce audiences |
| B2G | Business-to-Government -- targeting government entities |
Segment type is determined by taxonomy lookup first (each of the 41 parent categories has a default segment type), then refined by keyword analysis if needed.
Segment type determination is strict by design. If the system cannot confidently assign a segment type, it will flag the topic for administrator review rather than guess incorrectly.
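The lookup-then-refine order, with a review flag instead of a guess, could be sketched as below. The default segment values, keyword patterns, and the `resolve_segment` helper are all hypothetical; in particular, letting a single unambiguous keyword signal override the category default is an assumption about what "refined" means.

```python
import re
from typing import Optional

# Illustrative defaults for a few parent categories; not the documented values.
DEFAULT_SEGMENT = {
    "Auto": "B2C",
    "Business Technology": "B2B",
    "Government & Public Sector": "B2G",
}

# Hypothetical keyword signals used to refine the default.
KEYWORD_SIGNALS = [
    (re.compile(r"\b(?:enterprise|procurement|fleet)\b", re.IGNORECASE), "B2B"),
    (re.compile(r"\b(?:employee|workforce)\b", re.IGNORECASE), "B2E"),
]


def resolve_segment(parent_category: str, keywords: list) -> tuple:
    """Return (segment_type, needs_review). Never guesses when signals are missing or conflict."""
    segment: Optional[str] = DEFAULT_SEGMENT.get(parent_category)
    text = " ".join(keywords)
    hits = {seg for pattern, seg in KEYWORD_SIGNALS if pattern.search(text)}
    if len(hits) > 1:
        return None, True  # conflicting keyword signals: flag for review
    if len(hits) == 1:
        segment = hits.pop()  # assumption: one clear signal refines the default
    if segment is None:
        return None, True  # no default and no signal: flag for review
    return segment, False
```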
Layer 5: Sensitivity
Flags whether the topic falls under regulated or sensitive categories that require special handling in advertising platforms. Sensitive topics are subject to additional compliance rules on most DSPs.
| Classification | Description | Examples |
|---|---|---|
| Standard | No special restrictions | "Toyota RAV4", "Kitchen Remodeling" |
| Sensitive | Regulated category, may have platform restrictions | Cannabis, Gambling, Alcohol, Pharmaceutical products |
Sensitivity is detected based on the parent category assignment. The following parent categories are flagged as sensitive; Health & Wellness is flagged only when the topic's context calls for it:
- Cannabis -- Dispensaries, CBD, cultivation
- Gambling & Casino -- Online betting, sportsbooks, casinos
- Alcohol & Spirits -- Beer, wine, spirits (age-gated)
- Health & Wellness -- Pharmaceutical products, medical devices (context-dependent)
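As a sketch, category-based sensitivity detection reduces to a set lookup plus a context check for Health & Wellness. The trigger terms and the `sensitivity` function are hypothetical; the source does not say how the context-dependent case is decided.

```python
ALWAYS_SENSITIVE = {"Cannabis", "Gambling & Casino", "Alcohol & Spirits"}
CONTEXT_DEPENDENT = {"Health & Wellness"}

# Hypothetical trigger terms for the context-dependent category.
HEALTH_TRIGGERS = ("pharmaceutical", "prescription", "medical device")


def sensitivity(parent_category: str, topic_name: str) -> str:
    """Return "Sensitive" or "Standard" for a topic, based on its parent category."""
    if parent_category in ALWAYS_SENSITIVE:
        return "Sensitive"
    if parent_category in CONTEXT_DEPENDENT:
        lowered = topic_name.lower()
        if any(term in lowered for term in HEALTH_TRIGGERS):
            return "Sensitive"
    return "Standard"
```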
Layer 6: Buyer Journey
Evaluates the purchase readiness of the audience, providing a more granular view than the awareness stage alone.
| Stage | Description | Score Range |
|---|---|---|
| Purchase Ready | Showing clear buying signals, ready to convert | 70--100 |
| Active Evaluation | Comparing specific products/vendors, requesting demos | 40--69 |
| Research Discovery | Early-stage research, gathering information | 0--39 |
Each buyer journey stage includes a funnel position label and a descriptive action statement (e.g., "Prioritize for retargeting" for Purchase Ready).
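The score ranges in the table translate directly into a lookup. This is a minimal sketch; `buyer_journey_stage` is a hypothetical name, and rejecting out-of-range scores is an assumption.

```python
def buyer_journey_stage(score: int) -> str:
    """Map a 0-100 buyer journey score to its stage, using the documented ranges."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 70:
        return "Purchase Ready"
    if score >= 40:
        return "Active Evaluation"
    return "Research Discovery"
```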
Layer 7: Composite Score
A single 0--100 score that synthesizes signals from all other layers into one actionable number. The composite score drives the interpretation label:
| Score | Label | Recommended Action |
|---|---|---|
| 80--100 | Hot Lead | Prioritize for direct response and retargeting campaigns |
| 60--79 | Warm Prospect | Nurture with consideration-stage content and offers |
| 40--59 | Active Researcher | Engage with educational content and comparisons |
| 20--39 | Early Explorer | Build awareness with top-of-funnel content |
| 0--19 | Cold Audience | Long-term brand awareness, broad reach campaigns |
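If you consume composite scores outside the platform, the table can be encoded as data. The sketch below returns both the label and the recommended action; `interpret` is a hypothetical helper, and how the score itself is computed from the other layers is not covered here.

```python
# (lower bound, label, recommended action), taken from the table above.
COMPOSITE_BANDS = [
    (80, "Hot Lead", "Prioritize for direct response and retargeting campaigns"),
    (60, "Warm Prospect", "Nurture with consideration-stage content and offers"),
    (40, "Active Researcher", "Engage with educational content and comparisons"),
    (20, "Early Explorer", "Build awareness with top-of-funnel content"),
    (0, "Cold Audience", "Long-term brand awareness, broad reach campaigns"),
]


def interpret(score: int) -> tuple:
    """Return (label, recommended_action) for a 0-100 composite score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower_bound, label, action in COMPOSITE_BANDS:
        if score >= lower_bound:
            return label, action
    raise AssertionError("unreachable: the last band starts at 0")
```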
DSP Segment Names
For each classified topic, AudienceGPT generates platform-ready segment names formatted for major DSPs. These names follow the hierarchical naming conventions required by each platform.
Platform Formats
AudienceGPT generates names for three built-in platform formats, plus any custom output templates configured by your administrator:
| Platform | Format Pattern | Character Limit |
|---|---|---|
| The Trade Desk (Koa) | Taxonomy Type > Parent Category > Subcategory > Topic Name | Description: 256 chars |
| LiveRamp | Taxonomy Type > Parent Category > Subcategory > Topic Name | Description: 256 chars |
| Internal | Taxonomy Type > Parent Category > Subcategory > Topic Name | No limit |
Each platform name is generated from the same classification data but formatted according to that platform's conventions. The names include the full taxonomy path and may include additional context like segment type, intensity, or user behavior labels.
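A minimal sketch of the shared path format and the description limits follows. The platform keys, the "Sedans" subcategory, and both helper functions are hypothetical, and truncating an over-long description (rather than rejecting it) is an assumption.

```python
from typing import Optional

SEPARATOR = " > "

# Description character limits from the table; None means no limit.
# The dictionary keys are hypothetical identifiers, not platform API names.
DESCRIPTION_LIMITS = {"trade_desk": 256, "liveramp": 256, "internal": None}


def build_segment_name(taxonomy_type: str, parent_category: str,
                       subcategory: str, topic_name: str) -> str:
    """Join the four taxonomy levels into a hierarchical segment name."""
    return SEPARATOR.join([taxonomy_type, parent_category, subcategory, topic_name])


def fit_description(platform: str, description: str) -> str:
    """Trim a description to the platform's limit (assumed behavior: truncate)."""
    limit: Optional[int] = DESCRIPTION_LIMITS[platform]
    if limit is None or len(description) <= limit:
        return description
    return description[:limit].rstrip()


name = build_segment_name("Automotive & Vehicles", "Auto", "Sedans", "Toyota Camry")
```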
DSP Names Tab
In the Library's topic detail panel, the DSP Names tab shows all generated platform names for a topic. These are the exact strings that will be used when activating segments through platform connections.
If your organization uses custom output templates (configured by an administrator), additional platform name formats will appear alongside the built-in ones. Output templates support {{field}} placeholders for dynamic content.
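To illustrate how `{{field}}` placeholders behave, here is a small rendering sketch. The field names, the tolerance for whitespace inside the braces, and the error on unknown fields are assumptions; the platform's actual template engine may differ.

```python
import re

# Matches {{field}} and, as an assumption, tolerates spaces like {{ field }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_template(template: str, fields: dict) -> str:
    """Replace each {{field}} placeholder with the matching value from fields."""
    def substitute(match):
        key = match.group(1)
        if key not in fields:
            raise KeyError(f"unknown template field: {key}")
        return str(fields[key])

    return PLACEHOLDER.sub(substitute, template)


rendered = render_template(
    "{{parent_category}} | {{topic_name}} ({{segment_type}})",
    {"parent_category": "Auto", "topic_name": "Toyota Camry", "segment_type": "B2C"},
)
```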
Taxonomy Hierarchy
Every classified topic is placed within a 4-level taxonomy hierarchy:
Taxonomy Type (13 groups)
└── Parent Category (41 types)
    └── Subcategory (tree nodes with L0/L1/L2 levels)
        └── Topic
The 13 Taxonomy Types
| Taxonomy Type | Parent Categories |
|---|---|
| Automotive & Vehicles | Auto, Recreational Vehicles |
| Home & Property | Real Estate, Home & Garden / Home Improvement, Home Services |
| Financial & Legal | Financial Services, Insurance, Legal Services |
| Technology & Telecom | Business Technology, Technographics, Telecommunications, Consumer Electronics, Consumer Technology |
| Consumer Goods & Retail | Consumer Goods, Food & Beverage, Apparel & Accessories, Beauty & Personal Care, Babies & Children, Pets & Animals |
| Health | Health & Wellness |
| Education | Education |
| Travel & Hospitality | Travel & Leisure |
| Entertainment & Media | Entertainment, Sports & Fitness, Video Gaming, Gambling & Casino, News & Media |
| Lifestyle & Special Interest | Alcohol & Spirits, Cannabis, Gifting & Occasions, Sustainability & Green Living, Luxury & Premium |
| Civic & Cause | Charities & Nonprofits, Politics, Spiritual & Religion |
| B2B & Industrial | Business & Professional Services, Agriculture & Farming, Transportation & Logistics, Energy & Utilities, Government & Public Sector |
| Cross-Cutting | Life-Stage (Inferred) |
Each of the 41 parent categories has associated metadata including an IAB content taxonomy code, audience type, domain signals, and example topics that aid classification.
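The table can be held as a simple mapping, with a reverse index to resolve a parent category to its taxonomy type. This sketch covers only the two top levels; the `taxonomy_path` helper and the "Sedans" subcategory in the example are hypothetical, and the per-category metadata is omitted.

```python
TAXONOMY = {
    "Automotive & Vehicles": ["Auto", "Recreational Vehicles"],
    "Home & Property": ["Real Estate", "Home & Garden / Home Improvement", "Home Services"],
    "Financial & Legal": ["Financial Services", "Insurance", "Legal Services"],
    "Technology & Telecom": ["Business Technology", "Technographics", "Telecommunications",
                             "Consumer Electronics", "Consumer Technology"],
    "Consumer Goods & Retail": ["Consumer Goods", "Food & Beverage", "Apparel & Accessories",
                                "Beauty & Personal Care", "Babies & Children", "Pets & Animals"],
    "Health": ["Health & Wellness"],
    "Education": ["Education"],
    "Travel & Hospitality": ["Travel & Leisure"],
    "Entertainment & Media": ["Entertainment", "Sports & Fitness", "Video Gaming",
                              "Gambling & Casino", "News & Media"],
    "Lifestyle & Special Interest": ["Alcohol & Spirits", "Cannabis", "Gifting & Occasions",
                                     "Sustainability & Green Living", "Luxury & Premium"],
    "Civic & Cause": ["Charities & Nonprofits", "Politics", "Spiritual & Religion"],
    "B2B & Industrial": ["Business & Professional Services", "Agriculture & Farming",
                         "Transportation & Logistics", "Energy & Utilities",
                         "Government & Public Sector"],
    "Cross-Cutting": ["Life-Stage (Inferred)"],
}

# Reverse index: parent category -> taxonomy type.
PARENT_TO_TYPE = {parent: tax_type for tax_type, parents in TAXONOMY.items() for parent in parents}


def taxonomy_path(parent_category: str, subcategory: str, topic: str) -> list:
    """Return the 4-level path for a topic, resolving the taxonomy type from the parent."""
    return [PARENT_TO_TYPE[parent_category], parent_category, subcategory, topic]
```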
Reclassifying Topics
Topics can be reclassified when the classification engine is updated or when you want to re-evaluate a topic with a different mode.
When to Reclassify
- Engine version update -- When AudienceGPT releases an engine update that changes classification logic, existing topics are marked as "outdated." You can reclassify them to get results from the current engine.
- Mode switch -- A topic originally classified with rule-based mode can be reclassified with AI-powered mode for potentially better accuracy.
- Context changes -- If a brand pivots its business model or a product category evolves, reclassification captures the updated reality.
How to Reclassify
Single topic: Open the topic in the Library detail panel. If the engine version is outdated, a reclassify banner appears. Click "Reclassify" and choose your mode:
- AI-Powered -- Uses Claude Sonnet 4.6 with optional web search. Higher accuracy, consumes API credits.
- Rule-Based -- Instant local classification. No cost, but may be less accurate for ambiguous topics.
Bulk reclassify: Select multiple topics in the Library table using checkboxes, then click "Reclassify Selected" in the bulk action bar. Choose your mode in the modal. Limits:
- Rule-based: up to 500 topics per batch
- AI-powered: up to 100 topics per batch
AI-powered reclassification is quota-checked and usage-tracked. The reclassify modal shows an estimated cost before you confirm.
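If you need to reclassify more topics than one batch allows, you can plan the batches ahead of time. The sketch below splits a selection according to the documented limits; the mode keys and `chunk_for_reclassify` are hypothetical names, and it does not call any AudienceGPT API.

```python
# Per-batch limits from the list above; the mode keys are hypothetical.
BATCH_LIMITS = {"rule_based": 500, "ai_powered": 100}


def chunk_for_reclassify(topic_ids: list, mode: str) -> list:
    """Split topic IDs into consecutive batches no larger than the mode's limit."""
    limit = BATCH_LIMITS[mode]
    return [topic_ids[start:start + limit] for start in range(0, len(topic_ids), limit)]


batches = chunk_for_reclassify(list(range(250)), "ai_powered")
```

For 250 topics in AI-powered mode this yields three batches of 100, 100, and 50.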
Engine Versioning
Every topic is stamped with the engine version at classification time. When classification logic changes (keyword patterns, layer functions, taxonomy definitions, or the AI prompt), the engine version is incremented. Topics classified with older versions are flagged as "outdated" in the Library.
You can filter the Library by engine version status (Current or Outdated) to quickly find topics that need reclassification.
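Conceptually, the Outdated filter compares each topic's stamped version with the current engine version. The sketch below assumes integer version numbers and uses a hypothetical `Topic` record and version value; the platform's actual version format is not documented here.

```python
from dataclasses import dataclass

CURRENT_ENGINE_VERSION = 7  # hypothetical value for illustration


@dataclass
class Topic:
    name: str
    engine_version: int  # version stamped at classification time


def outdated_topics(topics: list, current: int = CURRENT_ENGINE_VERSION) -> list:
    """Return the topics classified with an engine version older than the current one."""
    return [topic for topic in topics if topic.engine_version < current]


library = [Topic("Toyota Camry", 7), Topic("Salesforce", 5), Topic("CES 2026", 6)]
stale = outdated_topics(library)
```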
Duplicate Detection
AudienceGPT uses a dual-layer duplicate detection system to prevent redundant topics in your library:
- Semantic similarity -- Each topic is converted to a 256-dimensional embedding vector. Topics with a cosine similarity above 95% are blocked as duplicates. Topics between 75% and 95% similarity trigger a warning with the option to proceed.
- Brand alias matching -- A dictionary of known brand aliases catches deterministic duplicates that embeddings might miss (e.g., "Chevy" and "Chevrolet").
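The two checks can be sketched together as follows. The thresholds come from the text above; the alias dictionary holds only the documented example, the status names and `duplicate_status` are hypothetical, and checking aliases before embeddings is an assumption. The usage example uses 2-dimensional vectors for brevity instead of 256-dimensional embeddings.

```python
import math

BLOCK_THRESHOLD = 0.95  # above this: blocked as a duplicate
WARN_THRESHOLD = 0.75   # from here up to the block threshold: warning

# Only the documented example; a real alias dictionary would be much larger.
BRAND_ALIASES = {"chevy": "chevrolet"}


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def canonical_brand(name: str) -> str:
    """Lowercase the name and resolve known aliases to one canonical form."""
    key = name.strip().lower()
    return BRAND_ALIASES.get(key, key)


def duplicate_status(name_a: str, vec_a: list, name_b: str, vec_b: list) -> str:
    """Return "alias", "block", "warn", or "ok" for a pair of topics."""
    if canonical_brand(name_a) == canonical_brand(name_b):
        return "alias"
    similarity = cosine_similarity(vec_a, vec_b)
    if similarity > BLOCK_THRESHOLD:
        return "block"
    if similarity >= WARN_THRESHOLD:
        return "warn"
    return "ok"
```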
When a potential duplicate is detected during classification, you will see a warning with the matching topic's name and similarity score. For matches in the 75% to 95% range, you can choose to proceed (creating a near-duplicate) or cancel and use the existing topic instead.
Duplicate detection also runs during CSV imports and sync operations, automatically enriching existing topics with new metadata rather than creating redundant entries.
Next Steps
- Library Management -- Learn how to browse, filter, and manage your classified topics
- Campaign Brief Analysis -- Upload campaign briefs for AI-recommended topics
- Matrix Generation -- Create combinatorial taxonomies at scale
- CSV Import -- Bulk import and classify topics from spreadsheets