Taxonomy Structure
AudienceGPT organizes audience segments into a multi-level hierarchy that flows from broad industry groups down to individual classified topics. This page documents the full hierarchy, lists every taxonomy type and parent category, explains the subcategory tree system, and provides a reference for the constants and lookup functions available in code.
Understanding this hierarchy is essential for working with classification results, building filters, and writing queries against the catalog.
Hierarchy Overview
The taxonomy reads top-down through six levels:
| Level | Name | Count | Source |
|---|---|---|---|
| 1 | Taxonomy Type | 13 | Hardcoded in TAXONOMY_TYPE_ORDER |
| 2 | Parent Category | 41 | Hardcoded in PARENT_CATEGORIES |
| 3 | Subcategory L0 | Variable | taxonomy_tree table (level = 0) |
| 4 | Subcategory L1 | Variable | taxonomy_tree table (level = 1) |
| 5 | Subcategory L2+ | Variable | taxonomy_tree table (level >= 2) |
| 6 | Topic | Unlimited | topics table |
Note that the two names are swapped between TypeScript and the database: what TypeScript calls taxonomy_type (the 13 groups) is stored in the DB column parent_category, and what TypeScript calls parent_category (the 41 categories) is stored in the DB column taxonomy_type. See the Field Mapping page for details.
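The swap is easiest to keep straight when it happens in exactly one place. The following is a minimal sketch of such a boundary function; the `TopicRow` and `TopicFields` shapes and the `fromDbRow` helper are hypothetical names used for illustration and are not taken from the codebase.

```typescript
// Hypothetical DB row shape: property names match the database columns.
interface TopicRow {
  parent_category: string; // DB column: holds one of the 13 group names
  taxonomy_type: string;   // DB column: holds one of the 41 category labels
}

// Hypothetical TypeScript-side shape: property names as used in code.
interface TopicFields {
  taxonomy_type: string;   // one of the 13 groups
  parent_category: string; // one of the 41 categories
}

// Swap the two fields when crossing the DB/TypeScript boundary.
function fromDbRow(row: TopicRow): TopicFields {
  return {
    taxonomy_type: row.parent_category,
    parent_category: row.taxonomy_type,
  };
}

const fields = fromDbRow({
  parent_category: "Automotive & Vehicles",
  taxonomy_type: "Auto",
});
console.log(fields.taxonomy_type);   // "Automotive & Vehicles"
console.log(fields.parent_category); // "Auto"
```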
The 13 Taxonomy Types
Taxonomy types are the broadest grouping level. They are defined in TAXONOMY_TYPE_ORDER in src/lib/constants/taxonomy-types.ts and displayed in this fixed order:
| # | Taxonomy Type | Parent Categories |
|---|---|---|
| 1 | Automotive & Vehicles | Auto, Recreational Vehicles |
| 2 | Home & Property | Real Estate, Home & Garden / Home Improvement, Home Services |
| 3 | Financial & Legal | Financial Services, Insurance, Legal Services |
| 4 | Technology & Telecom | Business Technology, Technographics, Telecommunications, Consumer Electronics, Consumer Technology |
| 5 | Consumer Goods & Retail | Consumer Goods, Food & Beverage, Apparel & Accessories, Beauty & Personal Care, Babies & Children, Pets & Animals |
| 6 | Health | Health & Wellness |
| 7 | Education | Education |
| 8 | Travel & Hospitality | Travel & Leisure |
| 9 | Entertainment & Media | Entertainment, Sports & Fitness, Video Gaming, Gambling & Casino, News & Media |
| 10 | Lifestyle & Special Interest | Alcohol & Spirits, Cannabis, Gifting & Occasions, Sustainability & Green Living, Luxury & Premium |
| 11 | Civic & Cause | Charities & Nonprofits, Politics, Spiritual & Religion |
| 12 | B2B & Industrial | Business & Professional Services, Agriculture & Farming, Transportation & Logistics, Energy & Utilities, Government & Public Sector |
| 13 | Cross-Cutting | Life-Stage (Inferred) |
The 41 Parent Categories
Each parent category is a specialized classification domain with its own IAB code, audience type, domain signals, and example topics. They are defined as the PARENT_CATEGORIES record in src/lib/constants/taxonomy-types.ts.
| ID | Label | Taxonomy Type | Audience Type | IAB Code |
|---|---|---|---|---|
| auto | Auto | Automotive & Vehicles | In-Market Buyers | IAB2 |
| recreational_vehicles | Recreational Vehicles | Automotive & Vehicles | In-Market Buyers | IAB2 |
| real_estate | Real Estate | Home & Property | In-Market Buyers | IAB21 |
| home_garden | Home & Garden / Home Improvement | Home & Property | In-Market Buyers | IAB10 |
| home_services | Home Services | Home & Property | In-Market Buyers | IAB10 |
| financial_services | Financial Services | Financial & Legal | Financial Planners | IAB13 |
| insurance | Insurance | Financial & Legal | Financial Planners | IAB13-7 |
| legal_services | Legal Services | Financial & Legal | Financial Planners | IAB11 |
| technology | Business Technology | Technology & Telecom | Business Decision Makers | IAB19 |
| technographics | Technographics | Technology & Telecom | Technology Infrastructure Buyers | IAB19 |
| telecommunications | Telecommunications | Technology & Telecom | In-Market Buyers | IAB19 |
| consumer_electronics | Consumer Electronics | Technology & Telecom | In-Market Buyers | IAB19-6 |
| consumer_technology | Consumer Technology | Technology & Telecom | In-Market Buyers | IAB19 |
| consumer_goods | Consumer Goods | Consumer Goods & Retail | In-Market Buyers | IAB22 |
| food_beverage | Food & Beverage | Consumer Goods & Retail | In-Market Buyers | IAB8 |
| apparel_accessories | Apparel & Accessories | Consumer Goods & Retail | In-Market Buyers | IAB18 |
| beauty_personal_care | Beauty & Personal Care | Consumer Goods & Retail | In-Market Buyers | IAB18 |
| babies_children | Babies & Children | Consumer Goods & Retail | In-Market Buyers | IAB6 |
| pets_animals | Pets & Animals | Consumer Goods & Retail | In-Market Buyers | IAB16 |
| health_wellness | Health & Wellness | Health | Health-Conscious Consumers | IAB7 |
| education | Education | Education | Learners & Students | IAB5 |
| travel_leisure | Travel & Leisure | Travel & Hospitality | Active Travelers | IAB20 |
| entertainment | Entertainment | Entertainment & Media | Active Enthusiasts | IAB1 |
| sports_fitness | Sports & Fitness | Entertainment & Media | Active Enthusiasts | IAB17 |
| video_gaming | Video Gaming | Entertainment & Media | Active Enthusiasts | IAB9 |
| gambling_casino | Gambling & Casino | Entertainment & Media | Regulated Consumers | IAB9 |
| news_media | News & Media | Entertainment & Media | Active Enthusiasts | IAB12 |
| alcohol_spirits | Alcohol & Spirits | Lifestyle & Special Interest | Regulated Consumers | IAB8 |
| cannabis | Cannabis | Lifestyle & Special Interest | Regulated Consumers | IAB7 |
| gifting_occasions | Gifting & Occasions | Lifestyle & Special Interest | In-Market Buyers | IAB22 |
| sustainability | Sustainability & Green Living | Lifestyle & Special Interest | Sustainability Advocates | IAB10 |
| luxury | Luxury & Premium | Lifestyle & Special Interest | In-Market Buyers | IAB18 |
| charities_nonprofits | Charities & Nonprofits | Civic & Cause | Active Supporters | IAB11 |
| politics | Politics | Civic & Cause | Active Supporters | IAB11 |
| spiritual_religion | Spiritual & Religion | Civic & Cause | Active Supporters | IAB23 |
| business_professional_services | Business & Professional Services | B2B & Industrial | Business Decision Makers | IAB3 |
| agriculture | Agriculture & Farming | B2B & Industrial | Agricultural Operators | IAB3 |
| transportation_logistics | Transportation & Logistics | B2B & Industrial | Business Decision Makers | IAB3 |
| energy_utilities | Energy & Utilities | B2B & Industrial | Energy & Utility Consumers | IAB3 |
| government_public_sector | Government & Public Sector | B2B & Industrial | Business Decision Makers | IAB11 |
| life_stage | Life-Stage (Inferred) | Cross-Cutting | Inferred Audiences | IAB6 |
Each parent category definition also includes:
- description -- One-line summary of the domain
- domain_signals -- Keywords and abbreviations that indicate relevance (used by the classification engine)
- example_topics -- Representative brands and topics for training and testing
Audience Types
The audience_type field is denormalized onto each topic at classification time, derived from the parent category definition. There are 14 distinct audience types:
| Audience Type | Used By |
|---|---|
| In-Market Buyers | Auto, Recreational Vehicles, Real Estate, Home & Garden, Home Services, Telecommunications, Consumer Electronics, Consumer Technology, Consumer Goods, Food & Beverage, Apparel & Accessories, Beauty & Personal Care, Babies & Children, Pets & Animals, Gifting & Occasions, Luxury & Premium |
| Financial Planners | Financial Services, Insurance, Legal Services |
| Business Decision Makers | Business Technology, Business & Professional Services, Transportation & Logistics, Government & Public Sector |
| Technology Infrastructure Buyers | Technographics |
| Health-Conscious Consumers | Health & Wellness |
| Learners & Students | Education |
| Active Travelers | Travel & Leisure |
| Active Enthusiasts | Entertainment, Sports & Fitness, Video Gaming, News & Media |
| Regulated Consumers | Gambling & Casino, Alcohol & Spirits, Cannabis |
| Sustainability Advocates | Sustainability & Green Living |
| Active Supporters | Charities & Nonprofits, Politics, Spiritual & Religion |
| Agricultural Operators | Agriculture & Farming |
| Energy & Utility Consumers | Energy & Utilities |
| Inferred Audiences | Life-Stage (Inferred) |
The library UI supports filtering by audience type. UCP signal generation uses the stored audience type with a runtime lookup fallback for backward compatibility.
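The stored-value-with-fallback behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the actual UCP code: `TopicLike`, `AUDIENCE_BY_CATEGORY`, and `resolveAudienceType` are hypothetical names, and the lookup table is trimmed to two entries copied from the parent category table.

```typescript
// Hypothetical, trimmed-down lookup: parent category label -> audience type.
const AUDIENCE_BY_CATEGORY: Record<string, string> = {
  "Auto": "In-Market Buyers",
  "Cannabis": "Regulated Consumers",
};

// Hypothetical topic shape; rows classified before denormalization
// may have no stored audience_type.
interface TopicLike {
  parent_category: string;       // one of the 41 category labels
  audience_type?: string | null; // denormalized at classification time
}

// Prefer the stored value; fall back to a runtime lookup by parent category.
function resolveAudienceType(topic: TopicLike): string | null {
  if (topic.audience_type) return topic.audience_type;
  return AUDIENCE_BY_CATEGORY[topic.parent_category] ?? null;
}

// A legacy row without a stored value resolves through the lookup table.
console.log(resolveAudienceType({ parent_category: "Cannabis" })); // "Regulated Consumers"
```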
IAB Content Taxonomy Codes
Each parent category maps to an IAB content taxonomy code for programmatic advertising compatibility. The codes follow the IAB Tech Lab Content Taxonomy standard:
| IAB Code | IAB Category | Parent Categories |
|---|---|---|
| IAB1 | Arts & Entertainment | Entertainment |
| IAB2 | Automotive | Auto, Recreational Vehicles |
| IAB3 | Business | Business & Professional Services, Agriculture & Farming, Transportation & Logistics, Energy & Utilities |
| IAB5 | Education | Education |
| IAB6 | Family & Parenting | Babies & Children, Life-Stage (Inferred) |
| IAB7 | Health & Fitness | Health & Wellness, Cannabis |
| IAB8 | Food & Drink | Food & Beverage, Alcohol & Spirits |
| IAB9 | Hobbies & Interests | Video Gaming, Gambling & Casino |
| IAB10 | Home & Garden | Home & Garden, Home Services, Sustainability |
| IAB11 | Law, Government & Politics | Legal Services, Charities & Nonprofits, Politics, Government & Public Sector |
| IAB12 | News | News & Media |
| IAB13 | Personal Finance | Financial Services |
| IAB13-7 | Insurance | Insurance |
| IAB16 | Pets | Pets & Animals |
| IAB17 | Sports | Sports & Fitness |
| IAB18 | Style & Fashion | Apparel & Accessories, Beauty & Personal Care, Luxury & Premium |
| IAB19 | Technology & Computing | Business Technology, Technographics, Telecommunications, Consumer Technology |
| IAB19-6 | Consumer Electronics | Consumer Electronics |
| IAB20 | Travel | Travel & Leisure |
| IAB21 | Real Estate | Real Estate |
| IAB22 | Shopping | Consumer Goods, Gifting & Occasions |
| IAB23 | Religion & Spirituality | Spiritual & Religion |
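Codes such as IAB13-7 and IAB19-6 carry a numeric suffix under a top-level code (IAB13, IAB19). When grouping or filtering by the top-level category, it can help to split a code into its two parts. The `parseIabCode` helper below is a hypothetical sketch and assumes every code matches the `IAB<number>` or `IAB<number>-<number>` pattern seen in the table.

```typescript
// Split an IAB code such as "IAB13-7" into its top-level code and optional suffix.
// Hypothetical helper; assumes the "IAB<n>" / "IAB<n>-<m>" pattern used above.
function parseIabCode(code: string): { tier1: string; sub: number | null } {
  const match = /^(IAB\d+)(?:-(\d+))?$/.exec(code);
  if (!match) throw new Error(`Unrecognized IAB code: ${code}`);
  return { tier1: match[1], sub: match[2] ? Number(match[2]) : null };
}

console.log(parseIabCode("IAB13-7")); // { tier1: "IAB13", sub: 7 }
console.log(parseIabCode("IAB2"));    // { tier1: "IAB2", sub: null }
```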
The Subcategory Tree
Below the 41 parent categories, subcategories are organized in a hierarchical tree stored in the taxonomy_tree database table (migration 0031_taxonomy_tree.sql).
Table Schema
CREATE TABLE taxonomy_tree (
  id            TEXT PRIMARY KEY,
  taxonomy_type TEXT NOT NULL,              -- DB column: stores 41 parent category labels
  parent_id     TEXT REFERENCES taxonomy_tree(id),
  label         TEXT NOT NULL,              -- Display name of this node
  level         SMALLINT NOT NULL,          -- 0 = root, 1 = child, 2 = grandchild, ...
  path          TEXT NOT NULL,              -- Materialized path: "L0 > L1 > L2"
  sort_order    INTEGER NOT NULL,
  source        TEXT NOT NULL DEFAULT 'seed',
  is_active     BOOLEAN NOT NULL DEFAULT true,
  created_at    TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at    TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  UNIQUE(taxonomy_type, path)
);
The taxonomy_type column in the taxonomy_tree table follows the DB naming convention -- it stores the 41 parent category labels (e.g., "Auto", "Business Technology"), not the 13 group names. This is consistent with the topics table but opposite to the TS property name.
Adjacency List + Materialized Path
The tree uses a dual-navigation pattern:
- Adjacency list: Each node has a parent_id pointing to its immediate parent. Root nodes have parent_id = NULL.
- Materialized path: The path column stores the full path from root to the node, using " > " as the delimiter.
Level 0 path: "Electric Vehicles"
Level 1 path: "Electric Vehicles > Battery Technology"
Level 2 path: "Electric Vehicles > Battery Technology > Solid State"
This dual approach enables both recursive tree traversal (via parent_id) and fast path-based filtering (via LIKE on path).
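Both navigation styles can be illustrated in memory. The sketch below is a hypothetical example, not code from the repository: `TreeNode` mirrors a subset of the taxonomy_tree columns, the node IDs are made up, and `ancestors` and `descendants` are illustrative helper names. The prefix check in `descendants` is the in-memory analogue of a `LIKE 'Electric Vehicles > %'` filter on path.

```typescript
// Hypothetical in-memory shape mirroring a subset of taxonomy_tree columns.
interface TreeNode {
  id: string;
  parent_id: string | null;
  label: string;
  level: number;
  path: string;
}

const DELIMITER = " > ";

// Sample nodes built from the path examples above; the IDs are invented.
const nodes: TreeNode[] = [
  { id: "n1", parent_id: null, label: "Electric Vehicles", level: 0, path: "Electric Vehicles" },
  { id: "n2", parent_id: "n1", label: "Battery Technology", level: 1, path: "Electric Vehicles > Battery Technology" },
  { id: "n3", parent_id: "n2", label: "Solid State", level: 2, path: "Electric Vehicles > Battery Technology > Solid State" },
];

// Adjacency-list navigation: follow parent_id links up to the root.
function ancestors(all: TreeNode[], id: string): TreeNode[] {
  const byId = new Map(all.map((n): [string, TreeNode] => [n.id, n]));
  const chain: TreeNode[] = [];
  let current = byId.get(id);
  while (current && current.parent_id !== null) {
    current = byId.get(current.parent_id);
    if (current) chain.push(current);
  }
  return chain; // nearest parent first
}

// Materialized-path navigation: every descendant's path starts with
// the node's own path followed by the delimiter.
function descendants(all: TreeNode[], node: TreeNode): TreeNode[] {
  const prefix = node.path + DELIMITER;
  return all.filter((n) => n.path.startsWith(prefix));
}

console.log(ancestors(nodes, "n3").map((n) => n.label));
// ["Battery Technology", "Electric Vehicles"]
console.log(descendants(nodes, nodes[0]).map((n) => n.label));
// ["Battery Technology", "Solid State"]
```

Including the delimiter in the prefix avoids false matches between sibling roots whose labels share a prefix, such as "Electric" and "Electric Vehicles".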
Linking Topics to Tree Nodes
Topics connect to the tree via two columns on the topics table:
| Column | Type | Purpose |
|---|---|---|
| taxonomy_node_id | TEXT FK | Foreign key to taxonomy_tree.id |
| taxonomy_path | TEXT | Denormalized materialized path for fast filtering |
The taxonomy_path column enables multi-level subcategory filtering in the catalog API without requiring a JOIN:
-- Filter by L0 subcategory
WHERE SPLIT_PART(taxonomy_path, ' > ', 1) = 'Electric Vehicles'
-- Filter by L1 subcategory
WHERE SPLIT_PART(taxonomy_path, ' > ', 2) = 'Battery Technology'
-- Filter by L2 subcategory
WHERE SPLIT_PART(taxonomy_path, ' > ', 3) = 'Solid State'
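When several levels are filtered at once, the conditions combine with AND. The following sketch shows one way to assemble such a fragment with positional parameters instead of inlined strings. `buildPathFilter` is a hypothetical helper, and the `$1`, `$2` placeholder style assumes a PostgreSQL-style driver; the catalog API's actual query builder may look different.

```typescript
// Build a WHERE fragment plus positional parameters for subcategory levels.
// levels[0] filters L0, levels[1] filters L1, and so on; undefined skips a level.
// Hypothetical helper; assumes PostgreSQL-style $n placeholders.
function buildPathFilter(
  levels: Array<string | undefined>,
): { sql: string; params: string[] } {
  const clauses: string[] = [];
  const params: string[] = [];
  levels.forEach((label, index) => {
    if (label === undefined) return;
    params.push(label);
    // SPLIT_PART is 1-indexed, so tree level N is at position N + 1.
    clauses.push(
      `SPLIT_PART(taxonomy_path, ' > ', ${index + 1}) = $${params.length}`,
    );
  });
  return { sql: clauses.join(" AND "), params };
}

const filter = buildPathFilter(["Electric Vehicles", "Battery Technology"]);
console.log(filter.sql);
// SPLIT_PART(taxonomy_path, ' > ', 1) = $1 AND SPLIT_PART(taxonomy_path, ' > ', 2) = $2
console.log(filter.params);
// ["Electric Vehicles", "Battery Technology"]
```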
Seeding and Maintenance
Tree nodes are populated by seed scripts and maintained by batch realignment:
| Script | Command | Purpose |
|---|---|---|
| Seed all trees | bun run scripts/seed-all-trees.sh | Run all seed-*-trees.ts scripts |
| Realign subcategories | bun run scripts/realign-subcategories.ts | LLM batch reclassify topic subcategories to tree node labels |
| Fix subcategory L0 | bun run scripts/fix-subcategory-level0.ts | Update subcategory display values to match tree L0 labels |
| Backfill nodes | bun run scripts/backfill-taxonomy-nodes-llm.ts | Backfill taxonomy_node_id via LLM batches |
The realignment script uses Claude Haiku in batches to match each topic's free-text subcategory to the closest tree node label, then sets the taxonomy_node_id foreign key.
Segment Types
Each classified topic is assigned a segment type indicating the business model orientation:
| Segment Type | Description |
|---|---|
| B2B | Business-to-Business |
| B2C | Business-to-Consumer |
| B2B2C | Business-to-Business-to-Consumer |
| B2E | Business-to-Employee |
| B2G | Business-to-Government |
Segment type is determined first by taxonomy lookup (certain parent categories default to B2B or B2C), then by keyword-based heuristics when the taxonomy lookup is ambiguous.
By design, there is no local fallback for segment_type when the classifier cannot determine it. The system raises an error intentionally so that the super admin can train it on that edge case.
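The lookup-then-heuristics-then-error order can be sketched as below. Everything specific here is an assumption for illustration: the category defaults, the keyword patterns, and the `resolveSegmentType` name are invented, since this page does not list the real mappings. Only the control flow follows the description above.

```typescript
type SegmentType = "B2B" | "B2C" | "B2B2C" | "B2E" | "B2G";

// Illustrative defaults only; the real category-to-segment mapping is not
// documented on this page.
const CATEGORY_DEFAULTS: Record<string, SegmentType> = {
  "Business & Professional Services": "B2B",
  "Food & Beverage": "B2C",
};

// Illustrative keyword heuristics, checked in order.
const KEYWORD_RULES: Array<{ pattern: RegExp; segment: SegmentType }> = [
  { pattern: /\b(government|municipal|federal)\b/i, segment: "B2G" },
  { pattern: /\b(employee|workforce)\b/i, segment: "B2E" },
  { pattern: /\b(enterprise|procurement)\b/i, segment: "B2B" },
];

function resolveSegmentType(parentCategory: string, topicName: string): SegmentType {
  // Step 1: taxonomy lookup.
  const byCategory = CATEGORY_DEFAULTS[parentCategory];
  if (byCategory !== undefined) return byCategory;

  // Step 2: keyword heuristics when the lookup gives no answer.
  for (const rule of KEYWORD_RULES) {
    if (rule.pattern.test(topicName)) return rule.segment;
  }

  // Step 3: no fallback by design; surface the edge case instead of guessing.
  throw new Error(`Cannot determine segment_type for "${topicName}"`);
}

console.log(resolveSegmentType("Food & Beverage", "Craft soda"));   // "B2C"
console.log(resolveSegmentType("Auto", "Municipal fleet vehicles")); // "B2G"
```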
Constants and Lookup Functions Reference
All taxonomy constants live in src/lib/constants/taxonomy-types.ts:
Data Structures
| Export | Type | Description |
|---|---|---|
| PARENT_CATEGORIES | Record<string, TaxonomyType> | All 41 parent category definitions keyed by ID |
| TAXONOMY_TYPE_ORDER | string[] | Ordered list of 13 taxonomy type group labels |
| PARENT_CATEGORY_LIST | TaxonomyType[] | All 41 categories sorted by taxonomy type then label |
| PARENT_CATEGORY_LABELS | string[] | Labels of all 41 categories in sorted order |
| AUDIENCE_TYPE_LABELS | string[] | Distinct audience type labels (14 values) |
| TAXONOMY_TYPE_MAP | Record<string, string> | Maps external names (Data Alliance verticals, technographics categories, etc.) to parent category IDs |
| PARENT_CATEGORY_HASHES | Record<string, string> | Pre-computed content hashes for staleness detection |
Lookup Functions
| Function | Signature | Purpose |
|---|---|---|
| findParentCategoryByLabel | (label: string) => TaxonomyType \| null | Look up a parent category by its human-readable label (case-insensitive) |
| findParentCategoryById | (id: string) => TaxonomyType \| null | Look up a parent category by its string ID (e.g., "auto", "technology") |
| computeParentCategoryHash | (taxonomyId: string) => string | Compute an 8-char hex content hash for a parent category definition |
| detectParentCategory | (topicName: string, learnedOverride?, taxonomyTypeHint?) => TaxonomyType | Detect the best-matching parent category from topic text (in local-fallback.ts) |
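To make the relationship between the keyed record, the sorted list, and the label lookup concrete, here is a minimal sketch of how such derived exports could be built. It is not the actual implementation: `Category`, `GROUP_ORDER`, `CATEGORIES`, `CATEGORY_LIST`, `AUDIENCE_TYPES`, and `findByLabel` are hypothetical stand-ins, trimmed to four entries copied from the tables on this page.

```typescript
// Trimmed-down stand-in for the parent category definition shape.
interface Category {
  id: string;
  label: string;
  taxonomy_type: string; // one of the 13 groups
  audience_type: string;
}

// Stand-in for the ordered group list (two of the 13 groups).
const GROUP_ORDER = ["Automotive & Vehicles", "Financial & Legal"];

// Stand-in for the keyed record, deliberately listed out of order.
const CATEGORIES: Record<string, Category> = {
  insurance: { id: "insurance", label: "Insurance", taxonomy_type: "Financial & Legal", audience_type: "Financial Planners" },
  recreational_vehicles: { id: "recreational_vehicles", label: "Recreational Vehicles", taxonomy_type: "Automotive & Vehicles", audience_type: "In-Market Buyers" },
  auto: { id: "auto", label: "Auto", taxonomy_type: "Automotive & Vehicles", audience_type: "In-Market Buyers" },
  financial_services: { id: "financial_services", label: "Financial Services", taxonomy_type: "Financial & Legal", audience_type: "Financial Planners" },
};

// Sort by group position first, then alphabetically by label.
const CATEGORY_LIST: Category[] = Object.keys(CATEGORIES)
  .map((id) => CATEGORIES[id])
  .sort(
    (a, b) =>
      GROUP_ORDER.indexOf(a.taxonomy_type) - GROUP_ORDER.indexOf(b.taxonomy_type) ||
      a.label.localeCompare(b.label),
  );

// Distinct audience types, in first-seen order.
const AUDIENCE_TYPES: string[] = CATEGORY_LIST
  .map((c) => c.audience_type)
  .filter((value, index, all) => all.indexOf(value) === index);

// Case-insensitive label lookup that returns null when nothing matches.
function findByLabel(label: string): Category | null {
  const wanted = label.toLowerCase();
  return CATEGORY_LIST.find((c) => c.label.toLowerCase() === wanted) ?? null;
}

console.log(CATEGORY_LIST.map((c) => c.label));
// ["Auto", "Recreational Vehicles", "Financial Services", "Insurance"]
console.log(AUDIENCE_TYPES); // ["In-Market Buyers", "Financial Planners"]
console.log(findByLabel("INSURANCE")?.id); // "insurance"
```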
TaxonomyType Interface
Each parent category definition follows this shape:
interface TaxonomyType {
  id: string;               // e.g. "auto", "technology"
  label: string;            // e.g. "Auto", "Business Technology"
  taxonomy_type: string;    // e.g. "Automotive & Vehicles" (one of 13 groups)
  audience_type: string;    // e.g. "In-Market Buyers"
  iab: {
    code: string;           // e.g. "IAB2", "IAB19"
    content: string;        // e.g. "Automotive", "Technology & Computing"
  };
  description: string;      // One-line domain summary
  domain_signals: string[]; // Keywords for classification matching
  example_topics: string[]; // Representative brands/topics
  routing_note?: string;    // Optional note (used by Cross-Cutting)
}
Usage Examples
import {
  PARENT_CATEGORIES,
  TAXONOMY_TYPE_ORDER,
  PARENT_CATEGORY_LIST,
  findParentCategoryByLabel,
  findParentCategoryById,
} from "@/lib/constants/taxonomy-types";

// Look up by label
const auto = findParentCategoryByLabel("Auto");
// => { id: "auto", label: "Auto", taxonomy_type: "Automotive & Vehicles", ... }

// Look up by ID
const tech = findParentCategoryById("technology");
// => { id: "technology", label: "Business Technology", taxonomy_type: "Technology & Telecom", ... }

// Iterate all 13 taxonomy type groups
for (const group of TAXONOMY_TYPE_ORDER) {
  const categories = PARENT_CATEGORY_LIST.filter(c => c.taxonomy_type === group);
  console.log(`${group}: ${categories.map(c => c.label).join(", ")}`);
}

// Get the IAB code for a category
const insurance = findParentCategoryByLabel("Insurance");
console.log(insurance?.iab.code); // "IAB13-7"
Engine Version and Staleness
Each topic is stamped with the ENGINE_VERSION at classification time (currently "2.5"). Each parent category definition also has a content hash (PARENT_CATEGORY_HASHES) computed from its label, description, domain signals, and example topics.
When either the engine version or a category hash changes, topics classified under the old version/hash are considered "stale" or "outdated" and can be reclassified to bring them up to date.
See the Architecture Overview for details on the reclassification workflow.
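The staleness rule above reduces to two comparisons. The sketch below is a minimal illustration: the `ClassifiedTopic` field names and the `isStale` helper are hypothetical, and the hash strings are placeholders rather than real values from PARENT_CATEGORY_HASHES.

```typescript
// Hypothetical view of the fields stamped on a topic at classification time.
interface ClassifiedTopic {
  parent_category_id: string; // e.g. "auto"
  engine_version: string;     // engine version at classification time
  category_hash: string;      // category content hash at classification time
}

// A topic is stale when either the engine version or the content hash of
// its parent category differs from the current value.
function isStale(
  topic: ClassifiedTopic,
  currentEngineVersion: string,
  currentHashes: Record<string, string>,
): boolean {
  if (topic.engine_version !== currentEngineVersion) return true;
  return currentHashes[topic.parent_category_id] !== topic.category_hash;
}

// Placeholder hashes for illustration only.
const currentHashes: Record<string, string> = { auto: "a1b2c3d4" };

const fresh: ClassifiedTopic = { parent_category_id: "auto", engine_version: "2.5", category_hash: "a1b2c3d4" };
const oldEngine: ClassifiedTopic = { parent_category_id: "auto", engine_version: "2.4", category_hash: "a1b2c3d4" };
const oldHash: ClassifiedTopic = { parent_category_id: "auto", engine_version: "2.5", category_hash: "00000000" };

console.log(isStale(fresh, "2.5", currentHashes));     // false
console.log(isStale(oldEngine, "2.5", currentHashes)); // true
console.log(isStale(oldHash, "2.5", currentHashes));   // true
```

In this sketch a topic whose parent category has no entry in the hash table also counts as stale, because the comparison against a missing value fails.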
Next Steps
- Field Mapping -- Understand the DB-to-TS column swap that affects how you query these structures
- Architecture Overview -- See how the taxonomy fits into the classification pipeline
- Quick Start -- Set up your development environment and classify your first topic