Taxonomy Structure

AudienceGPT organizes audience segments into a multi-level hierarchy that flows from broad industry groups down to individual classified topics. This page documents the full hierarchy, lists every taxonomy type and parent category, explains the subcategory tree system, and provides a reference for the constants and lookup functions available in code.

Understanding this hierarchy is essential for working with classification results, building filters, and writing queries against the catalog.

Hierarchy Overview

The taxonomy reads top-down through six levels:

| Level | Name | Count | Source |
| --- | --- | --- | --- |
| 1 | Taxonomy Type | 13 | Hardcoded in TAXONOMY_TYPE_ORDER |
| 2 | Parent Category | 41 | Hardcoded in PARENT_CATEGORIES |
| 3 | Subcategory L0 | Variable | taxonomy_tree table (level = 0) |
| 4 | Subcategory L1 | Variable | taxonomy_tree table (level = 1) |
| 5 | Subcategory L2+ | Variable | taxonomy_tree table (level >= 2) |
| 6 | Topic | Unlimited | topics table |

TS vs DB naming

Remember that what TypeScript calls taxonomy_type (13 groups) is stored in the DB column parent_category, and what TypeScript calls parent_category (41 types) is stored in the DB column taxonomy_type. See the Field Mapping page for details.
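
The swap is easiest to get wrong at the point where a database row becomes a TypeScript object. The following is a minimal sketch of a row mapper, assuming hypothetical TopicRow and TopicModel shapes and a hypothetical rowToModel helper (none of these names come from the codebase; the Field Mapping page remains the authoritative description):

```typescript
// Hypothetical shapes for illustration only; the real types live elsewhere.
interface TopicRow {
  // DB column names
  parent_category: string; // holds one of the 13 group labels
  taxonomy_type: string; // holds one of the 41 parent category labels
}

interface TopicModel {
  // TypeScript property names
  taxonomy_type: string; // one of the 13 group labels
  parent_category: string; // one of the 41 parent category labels
}

// Swap the two fields when crossing the DB/TS boundary.
function rowToModel(row: TopicRow): TopicModel {
  return {
    taxonomy_type: row.parent_category,
    parent_category: row.taxonomy_type,
  };
}

const model = rowToModel({
  parent_category: "Automotive & Vehicles",
  taxonomy_type: "Auto",
});
console.log(model.taxonomy_type); // "Automotive & Vehicles"
console.log(model.parent_category); // "Auto"
```

Keeping the swap in one mapper function means the rest of the TypeScript code only ever sees the TS naming.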

The 13 Taxonomy Types

Taxonomy types are the broadest grouping level. They are defined in TAXONOMY_TYPE_ORDER in src/lib/constants/taxonomy-types.ts and displayed in this fixed order:

| # | Taxonomy Type | Parent Categories |
| --- | --- | --- |
| 1 | Automotive & Vehicles | Auto, Recreational Vehicles |
| 2 | Home & Property | Real Estate, Home & Garden / Home Improvement, Home Services |
| 3 | Financial & Legal | Financial Services, Insurance, Legal Services |
| 4 | Technology & Telecom | Business Technology, Technographics, Telecommunications, Consumer Electronics, Consumer Technology |
| 5 | Consumer Goods & Retail | Consumer Goods, Food & Beverage, Apparel & Accessories, Beauty & Personal Care, Babies & Children, Pets & Animals |
| 6 | Health | Health & Wellness |
| 7 | Education | Education |
| 8 | Travel & Hospitality | Travel & Leisure |
| 9 | Entertainment & Media | Entertainment, Sports & Fitness, Video Gaming, Gambling & Casino, News & Media |
| 10 | Lifestyle & Special Interest | Alcohol & Spirits, Cannabis, Gifting & Occasions, Sustainability & Green Living, Luxury & Premium |
| 11 | Civic & Cause | Charities & Nonprofits, Politics, Spiritual & Religion |
| 12 | B2B & Industrial | Business & Professional Services, Agriculture & Farming, Transportation & Logistics, Energy & Utilities, Government & Public Sector |
| 13 | Cross-Cutting | Life-Stage (Inferred) |

The 41 Parent Categories

Each parent category is a specialized classification domain with its own IAB code, audience type, domain signals, and example topics. They are defined as the PARENT_CATEGORIES record in src/lib/constants/taxonomy-types.ts.

| ID | Label | Taxonomy Type | Audience Type | IAB Code |
| --- | --- | --- | --- | --- |
| auto | Auto | Automotive & Vehicles | In-Market Buyers | IAB2 |
| recreational_vehicles | Recreational Vehicles | Automotive & Vehicles | In-Market Buyers | IAB2 |
| real_estate | Real Estate | Home & Property | In-Market Buyers | IAB21 |
| home_garden | Home & Garden / Home Improvement | Home & Property | In-Market Buyers | IAB10 |
| home_services | Home Services | Home & Property | In-Market Buyers | IAB10 |
| financial_services | Financial Services | Financial & Legal | Financial Planners | IAB13 |
| insurance | Insurance | Financial & Legal | Financial Planners | IAB13-7 |
| legal_services | Legal Services | Financial & Legal | Financial Planners | IAB11 |
| technology | Business Technology | Technology & Telecom | Business Decision Makers | IAB19 |
| technographics | Technographics | Technology & Telecom | Technology Infrastructure Buyers | IAB19 |
| telecommunications | Telecommunications | Technology & Telecom | In-Market Buyers | IAB19 |
| consumer_electronics | Consumer Electronics | Technology & Telecom | In-Market Buyers | IAB19-6 |
| consumer_technology | Consumer Technology | Technology & Telecom | In-Market Buyers | IAB19 |
| consumer_goods | Consumer Goods | Consumer Goods & Retail | In-Market Buyers | IAB22 |
| food_beverage | Food & Beverage | Consumer Goods & Retail | In-Market Buyers | IAB8 |
| apparel_accessories | Apparel & Accessories | Consumer Goods & Retail | In-Market Buyers | IAB18 |
| beauty_personal_care | Beauty & Personal Care | Consumer Goods & Retail | In-Market Buyers | IAB18 |
| babies_children | Babies & Children | Consumer Goods & Retail | In-Market Buyers | IAB6 |
| pets_animals | Pets & Animals | Consumer Goods & Retail | In-Market Buyers | IAB16 |
| health_wellness | Health & Wellness | Health | Health-Conscious Consumers | IAB7 |
| education | Education | Education | Learners & Students | IAB5 |
| travel_leisure | Travel & Leisure | Travel & Hospitality | Active Travelers | IAB20 |
| entertainment | Entertainment | Entertainment & Media | Active Enthusiasts | IAB1 |
| sports_fitness | Sports & Fitness | Entertainment & Media | Active Enthusiasts | IAB17 |
| video_gaming | Video Gaming | Entertainment & Media | Active Enthusiasts | IAB9 |
| gambling_casino | Gambling & Casino | Entertainment & Media | Regulated Consumers | IAB9 |
| news_media | News & Media | Entertainment & Media | Active Enthusiasts | IAB12 |
| alcohol_spirits | Alcohol & Spirits | Lifestyle & Special Interest | Regulated Consumers | IAB8 |
| cannabis | Cannabis | Lifestyle & Special Interest | Regulated Consumers | IAB7 |
| gifting_occasions | Gifting & Occasions | Lifestyle & Special Interest | In-Market Buyers | IAB22 |
| sustainability | Sustainability & Green Living | Lifestyle & Special Interest | Sustainability Advocates | IAB10 |
| luxury | Luxury & Premium | Lifestyle & Special Interest | In-Market Buyers | IAB18 |
| charities_nonprofits | Charities & Nonprofits | Civic & Cause | Active Supporters | IAB11 |
| politics | Politics | Civic & Cause | Active Supporters | IAB11 |
| spiritual_religion | Spiritual & Religion | Civic & Cause | Active Supporters | IAB23 |
| business_professional_services | Business & Professional Services | B2B & Industrial | Business Decision Makers | IAB3 |
| agriculture | Agriculture & Farming | B2B & Industrial | Agricultural Operators | IAB3 |
| transportation_logistics | Transportation & Logistics | B2B & Industrial | Business Decision Makers | IAB3 |
| energy_utilities | Energy & Utilities | B2B & Industrial | Energy & Utility Consumers | IAB3 |
| government_public_sector | Government & Public Sector | B2B & Industrial | Business Decision Makers | IAB11 |
| life_stage | Life-Stage (Inferred) | Cross-Cutting | Inferred Audiences | IAB6 |

Each parent category definition also includes:

  • description -- One-line summary of the domain
  • domain_signals -- Keywords and abbreviations that indicate relevance (used by the classification engine)
  • example_topics -- Representative brands and topics for training and testing
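
To make the role of domain_signals concrete, here is a deliberately naive sketch of keyword scoring. It is not the classification engine's actual logic, which this page does not describe. The SignalDef shape, the sample signal lists, and the countSignalHits and bestMatch helpers are all hypothetical:

```typescript
// Trimmed-down, hypothetical shape; the real definitions carry more fields.
interface SignalDef {
  id: string;
  domain_signals: string[];
}

// Hypothetical signal lists invented for this sketch (not the real constants).
const SAMPLE_DEFS: SignalDef[] = [
  { id: "auto", domain_signals: ["car", "suv", "dealership"] },
  { id: "pets_animals", domain_signals: ["dog", "cat", "veterinary"] },
];

// Count how many of a category's signals appear as whole words in the topic.
function countSignalHits(topicName: string, def: SignalDef): number {
  const words = topicName
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((w) => w.length > 0);
  return def.domain_signals.filter((signal) => words.indexOf(signal) !== -1).length;
}

// Return the ID of the category with the most hits, or null when nothing matches.
function bestMatch(topicName: string): string | null {
  let best: string | null = null;
  let bestHits = 0;
  for (const def of SAMPLE_DEFS) {
    const hits = countSignalHits(topicName, def);
    if (hits > bestHits) {
      best = def.id;
      bestHits = hits;
    }
  }
  return best;
}

console.log(bestMatch("Used SUV dealership financing")); // "auto"
console.log(bestMatch("Quantum computing")); // null
```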

Audience Types

The audience_type field is denormalized onto each topic at classification time, derived from the parent category definition. There are 14 distinct audience types:

| Audience Type | Used By |
| --- | --- |
| In-Market Buyers | Auto, Recreational Vehicles, Real Estate, Home & Garden, Home Services, Telecommunications, Consumer Electronics, Consumer Technology, Consumer Goods, Food & Beverage, Apparel & Accessories, Beauty & Personal Care, Babies & Children, Pets & Animals, Gifting & Occasions, Luxury & Premium |
| Financial Planners | Financial Services, Insurance, Legal Services |
| Business Decision Makers | Business Technology, Business & Professional Services, Transportation & Logistics, Government & Public Sector |
| Technology Infrastructure Buyers | Technographics |
| Health-Conscious Consumers | Health & Wellness |
| Learners & Students | Education |
| Active Travelers | Travel & Leisure |
| Active Enthusiasts | Entertainment, Sports & Fitness, Video Gaming, News & Media |
| Regulated Consumers | Gambling & Casino, Alcohol & Spirits, Cannabis |
| Sustainability Advocates | Sustainability & Green Living |
| Active Supporters | Charities & Nonprofits, Politics, Spiritual & Religion |
| Agricultural Operators | Agriculture & Farming |
| Energy & Utility Consumers | Energy & Utilities |
| Inferred Audiences | Life-Stage (Inferred) |

The library UI supports filtering by audience type. UCP signal generation uses the stored audience type with a runtime lookup fallback for backward compatibility.
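
The "stored value first, runtime lookup second" behavior can be sketched as follows. This is an illustration under stated assumptions rather than the actual UCP code: the CategoryDef and StoredTopic shapes, the SAMPLE_CATEGORIES list, and the resolveAudienceType helper are hypothetical, and only two categories from the table above are included.

```typescript
// Hypothetical, trimmed-down shapes; not the real definitions.
interface CategoryDef {
  label: string;
  audience_type: string;
}

interface StoredTopic {
  parent_category: string; // one of the 41 parent category labels
  audience_type?: string | null; // may be missing on topics classified before denormalization
}

// Two sample entries copied from the parent category table above.
const SAMPLE_CATEGORIES: CategoryDef[] = [
  { label: "Auto", audience_type: "In-Market Buyers" },
  { label: "Insurance", audience_type: "Financial Planners" },
];

// Prefer the stored value; fall back to the category definition when it is absent.
function resolveAudienceType(topic: StoredTopic): string | null {
  if (topic.audience_type) return topic.audience_type;
  const wanted = topic.parent_category.toLowerCase();
  const def = SAMPLE_CATEGORIES.find((c) => c.label.toLowerCase() === wanted);
  return def ? def.audience_type : null;
}

const legacyTopic = resolveAudienceType({ parent_category: "Insurance" });
const freshTopic = resolveAudienceType({
  parent_category: "Auto",
  audience_type: "In-Market Buyers",
});
console.log(legacyTopic); // "Financial Planners"
console.log(freshTopic); // "In-Market Buyers"
```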

IAB Content Taxonomy Codes

Each parent category maps to an IAB content taxonomy code for programmatic advertising compatibility. The codes follow the IAB Tech Lab Content Taxonomy standard:

| IAB Code | IAB Category | Parent Categories |
| --- | --- | --- |
| IAB1 | Arts & Entertainment | Entertainment |
| IAB2 | Automotive | Auto, Recreational Vehicles |
| IAB3 | Business | Business & Professional Services, Agriculture & Farming, Transportation & Logistics, Energy & Utilities |
| IAB5 | Education | Education |
| IAB6 | Family & Parenting | Babies & Children, Life-Stage (Inferred) |
| IAB7 | Health & Fitness | Health & Wellness, Cannabis |
| IAB8 | Food & Drink | Food & Beverage, Alcohol & Spirits |
| IAB9 | Hobbies & Interests | Video Gaming, Gambling & Casino |
| IAB10 | Home & Garden | Home & Garden / Home Improvement, Home Services, Sustainability & Green Living |
| IAB11 | Law, Government & Politics | Legal Services, Charities & Nonprofits, Politics, Government & Public Sector |
| IAB12 | News | News & Media |
| IAB13 | Personal Finance | Financial Services |
| IAB13-7 | Insurance | Insurance |
| IAB16 | Pets | Pets & Animals |
| IAB17 | Sports | Sports & Fitness |
| IAB18 | Style & Fashion | Apparel & Accessories, Beauty & Personal Care, Luxury & Premium |
| IAB19 | Technology & Computing | Business Technology, Technographics, Telecommunications, Consumer Technology |
| IAB19-6 | Consumer Electronics | Consumer Electronics |
| IAB20 | Travel | Travel & Leisure |
| IAB21 | Real Estate | Real Estate |
| IAB22 | Shopping | Consumer Goods, Gifting & Occasions |
| IAB23 | Religion & Spirituality | Spiritual & Religion |
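
The codes in this table take one of two forms: a top-level code such as IAB2, or a top-level code plus a numeric suffix such as IAB13-7. A small parser can separate the two parts, which may help when grouping categories by top-level IAB category. This is a sketch only; the ParsedIabCode shape and the parseIabCode helper are hypothetical and not part of the codebase:

```typescript
interface ParsedIabCode {
  tier1: string; // e.g. "IAB13"
  subcategory: number | null; // e.g. 7 for "IAB13-7", null for a top-level code
}

// Parse codes of the form "IAB<n>" or "IAB<n>-<m>"; return null for anything else.
function parseIabCode(code: string): ParsedIabCode | null {
  const match = /^(IAB\d+)(?:-(\d+))?$/.exec(code);
  if (!match) return null;
  const tier1 = match[1];
  const sub = match[2];
  if (tier1 === undefined) return null;
  return { tier1, subcategory: sub === undefined ? null : Number(sub) };
}

console.log(parseIabCode("IAB13-7")); // { tier1: "IAB13", subcategory: 7 }
console.log(parseIabCode("IAB2")); // { tier1: "IAB2", subcategory: null }
```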

The Subcategory Tree

Below the 41 parent categories, subcategories are organized in a hierarchical tree stored in the taxonomy_tree database table (migration 0031_taxonomy_tree.sql).

Table Schema

CREATE TABLE taxonomy_tree (
  id            TEXT PRIMARY KEY,
  taxonomy_type TEXT NOT NULL,                 -- DB column: stores 41 parent category labels
  parent_id     TEXT REFERENCES taxonomy_tree(id),
  label         TEXT NOT NULL,                 -- Display name of this node
  level         SMALLINT NOT NULL,             -- 0 = root, 1 = child, 2 = grandchild, ...
  path          TEXT NOT NULL,                 -- Materialized path: "L0 > L1 > L2"
  sort_order    INTEGER NOT NULL,
  source        TEXT NOT NULL DEFAULT 'seed',
  is_active     BOOLEAN NOT NULL DEFAULT true,
  created_at    TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at    TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  UNIQUE(taxonomy_type, path)
);

taxonomy_type column in taxonomy_tree

The taxonomy_type column in the taxonomy_tree table follows the DB naming convention -- it stores the 41 parent category labels (e.g., "Auto", "Business Technology"), not the 13 group names. This is consistent with the topics table but opposite to the TS property name.

Adjacency List + Materialized Path

The tree uses a dual-navigation pattern:

  • Adjacency list: Each node has a parent_id pointing to its immediate parent. Root nodes have parent_id = NULL.
  • Materialized path: The path column stores the full path from root to the node, using " > " as the delimiter.

    Level 0 path: "Electric Vehicles"
    Level 1 path: "Electric Vehicles > Battery Technology"
    Level 2 path: "Electric Vehicles > Battery Technology > Solid State"

This dual approach enables both recursive tree traversal (via parent_id) and fast path-based filtering (via LIKE on path).
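
The relationship between the two representations can be sketched in TypeScript: walking parent_id links rebuilds the materialized path, and a prefix check on the path answers "is this node inside that subtree?" without any traversal. The TreeNode shape below is a trimmed version of the table schema, while the node IDs and the buildPath and isInSubtree helpers are hypothetical:

```typescript
// Trimmed-down node shape based on the taxonomy_tree columns.
interface TreeNode {
  id: string;
  parent_id: string | null;
  label: string;
}

const PATH_DELIMITER = " > ";

// Walk the adjacency list upward from a node and join the labels root-first.
function buildPath(nodeId: string, nodesById: Map<string, TreeNode>): string {
  const labels: string[] = [];
  const seen = new Set<string>();
  let current: TreeNode | undefined = nodesById.get(nodeId);
  while (current) {
    if (seen.has(current.id)) throw new Error(`Cycle detected at node ${current.id}`);
    seen.add(current.id);
    labels.unshift(current.label);
    current = current.parent_id === null ? undefined : nodesById.get(current.parent_id);
  }
  return labels.join(PATH_DELIMITER);
}

// Path-based check: a node belongs to a subtree if its path equals the ancestor's
// path or starts with the ancestor's path followed by the delimiter.
function isInSubtree(path: string, ancestorPath: string): boolean {
  return path === ancestorPath || path.startsWith(ancestorPath + PATH_DELIMITER);
}

// Hypothetical node IDs; the labels match the path examples above.
const nodes: TreeNode[] = [
  { id: "ev", parent_id: null, label: "Electric Vehicles" },
  { id: "ev-battery", parent_id: "ev", label: "Battery Technology" },
  { id: "ev-battery-ss", parent_id: "ev-battery", label: "Solid State" },
];
const nodesById = new Map<string, TreeNode>(
  nodes.map((n): [string, TreeNode] => [n.id, n]),
);

const solidStatePath = buildPath("ev-battery-ss", nodesById);
console.log(solidStatePath); // "Electric Vehicles > Battery Technology > Solid State"
console.log(isInSubtree(solidStatePath, "Electric Vehicles")); // true
```

Including the delimiter in the prefix check avoids false matches between sibling labels that share a prefix. The same consideration applies to a LIKE pattern on the path column.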

Linking Topics to Tree Nodes

Topics connect to the tree via two columns on the topics table:

| Column | Type | Purpose |
| --- | --- | --- |
| taxonomy_node_id | TEXT FK | Foreign key to taxonomy_tree.id |
| taxonomy_path | TEXT | Denormalized materialized path for fast filtering |

The taxonomy_path column enables multi-level subcategory filtering in the catalog API without requiring a JOIN:

-- Filter by L0 subcategory
SELECT * FROM topics
WHERE SPLIT_PART(taxonomy_path, ' > ', 1) = 'Electric Vehicles';

-- Filter by L1 subcategory
SELECT * FROM topics
WHERE SPLIT_PART(taxonomy_path, ' > ', 2) = 'Battery Technology';

-- Filter by L2 subcategory
SELECT * FROM topics
WHERE SPLIT_PART(taxonomy_path, ' > ', 3) = 'Solid State';
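
The same level-based filtering can be mirrored on the client once topics are loaded. This is a sketch with a hypothetical pathLevel helper and hypothetical sample rows. Unlike SPLIT_PART, which is 1-based and returns an empty string for a missing field, this helper is 0-based (matching the L0/L1/L2 naming) and returns null:

```typescript
// Return the label at a given subcategory level (0 = L0), or null if absent.
function pathLevel(taxonomyPath: string | null, level: number): string | null {
  if (!taxonomyPath) return null;
  const parts = taxonomyPath.split(" > ");
  const part = parts[level];
  return part === undefined ? null : part;
}

// Hypothetical sample rows; only taxonomy_path matters for the filter.
const sampleTopics: Array<{ name: string; taxonomy_path: string | null }> = [
  { name: "Topic A", taxonomy_path: "Electric Vehicles > Battery Technology > Solid State" },
  { name: "Topic B", taxonomy_path: "Electric Vehicles" },
  { name: "Topic C", taxonomy_path: null },
];

// Equivalent of the L1 filter above.
const batteryTopics = sampleTopics.filter(
  (t) => pathLevel(t.taxonomy_path, 1) === "Battery Technology",
);
console.log(batteryTopics.map((t) => t.name)); // ["Topic A"]
```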

Seeding and Maintenance

Tree nodes are populated by seed scripts and maintained by batch realignment:

| Script | Command | Purpose |
| --- | --- | --- |
| Seed all trees | bun run scripts/seed-all-trees.sh | Run all seed-*-trees.ts scripts |
| Realign subcategories | bun run scripts/realign-subcategories.ts | LLM batch reclassify topic subcategories to tree node labels |
| Fix subcategory L0 | bun run scripts/fix-subcategory-level0.ts | Update subcategory display values to match tree L0 labels |
| Backfill nodes | bun run scripts/backfill-taxonomy-nodes-llm.ts | Backfill taxonomy_node_id via LLM batches |

The realignment script uses Claude Haiku in batches to match each topic's free-text subcategory to the closest tree node label, then sets the taxonomy_node_id foreign key.

Segment Types

Each classified topic is assigned a segment type indicating the business model orientation:

| Segment Type | Description |
| --- | --- |
| B2B | Business-to-Business |
| B2C | Business-to-Consumer |
| B2B2C | Business-to-Business-to-Consumer |
| B2E | Business-to-Employee |
| B2G | Business-to-Government |

Segment type is determined first by taxonomy lookup (certain parent categories default to B2B or B2C), then by keyword-based heuristics when the taxonomy lookup is ambiguous.

No fallback for segment_type

By design, there is no local fallback for segment_type. When the classifier cannot determine the segment type, the system raises an error so that the super admin can train it on that edge case.
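
The resolution order described in this section (taxonomy lookup, then keyword heuristics, then a deliberate error) can be sketched as follows. Everything specific here is hypothetical: the per-category defaults, the keyword patterns, and the resolveSegmentType helper are invented for illustration and do not reflect the real rules. The sketch also treats "no default for this category" as the ambiguous case.

```typescript
type SegmentType = "B2B" | "B2C" | "B2B2C" | "B2E" | "B2G";

// Hypothetical defaults; which categories really default to which type is not documented here.
const CATEGORY_DEFAULTS = new Map<string, SegmentType>([
  ["Business Technology", "B2B"],
  ["Food & Beverage", "B2C"],
]);

// Hypothetical keyword heuristics, checked in order.
const KEYWORD_RULES: Array<[RegExp, SegmentType]> = [
  [/\b(government|federal|municipal)\b/i, "B2G"],
  [/\b(employee|workforce)\b/i, "B2E"],
];

function resolveSegmentType(parentCategoryLabel: string, topicName: string): SegmentType {
  // Step 1: taxonomy lookup.
  const byTaxonomy = CATEGORY_DEFAULTS.get(parentCategoryLabel);
  if (byTaxonomy !== undefined) return byTaxonomy;

  // Step 2: keyword heuristics.
  for (const [pattern, segment] of KEYWORD_RULES) {
    if (pattern.test(topicName)) return segment;
  }

  // Step 3: no fallback by design; surface the gap instead of guessing.
  throw new Error(`Cannot determine segment_type for "${topicName}"`);
}

console.log(resolveSegmentType("Business Technology", "CRM software")); // "B2B"
console.log(resolveSegmentType("Education", "Employee training programs")); // "B2E"
```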

Constants and Lookup Functions Reference

All taxonomy constants live in src/lib/constants/taxonomy-types.ts:

Data Structures

| Export | Type | Description |
| --- | --- | --- |
| PARENT_CATEGORIES | Record<string, TaxonomyType> | All 41 parent category definitions keyed by ID |
| TAXONOMY_TYPE_ORDER | string[] | Ordered list of 13 taxonomy type group labels |
| PARENT_CATEGORY_LIST | TaxonomyType[] | All 41 categories sorted by taxonomy type then label |
| PARENT_CATEGORY_LABELS | string[] | Labels of all 41 categories in sorted order |
| AUDIENCE_TYPE_LABELS | string[] | Distinct audience type labels (14 values) |
| TAXONOMY_TYPE_MAP | Record<string, string> | Maps external names (Data Alliance verticals, technographics categories, etc.) to parent category IDs |
| PARENT_CATEGORY_HASHES | Record<string, string> | Pre-computed content hashes for staleness detection |
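
Several of these exports can be thought of as derived views of PARENT_CATEGORIES. The sketch below shows one plausible derivation on a four-entry subset. It assumes that "sorted by taxonomy type" means the TAXONOMY_TYPE_ORDER position rather than alphabetical order, which this page does not confirm. The MiniCategory shape and the lower-case variable names are hypothetical stand-ins for the real exports.

```typescript
// Trimmed-down, hypothetical shape with only the fields the derivations need.
interface MiniCategory {
  id: string;
  label: string;
  taxonomy_type: string;
  audience_type: string;
}

// Subset of the group order and category record, using values from the tables above.
const groupOrder: string[] = ["Automotive & Vehicles", "Financial & Legal"];
const categories: Record<string, MiniCategory> = {
  insurance: { id: "insurance", label: "Insurance", taxonomy_type: "Financial & Legal", audience_type: "Financial Planners" },
  recreational_vehicles: { id: "recreational_vehicles", label: "Recreational Vehicles", taxonomy_type: "Automotive & Vehicles", audience_type: "In-Market Buyers" },
  auto: { id: "auto", label: "Auto", taxonomy_type: "Automotive & Vehicles", audience_type: "In-Market Buyers" },
  financial_services: { id: "financial_services", label: "Financial Services", taxonomy_type: "Financial & Legal", audience_type: "Financial Planners" },
};

// Sort by group position first (an assumption), then by label.
const categoryList: MiniCategory[] = Object.keys(categories)
  .map((key) => categories[key])
  .sort((a, b) => {
    const byGroup = groupOrder.indexOf(a.taxonomy_type) - groupOrder.indexOf(b.taxonomy_type);
    return byGroup !== 0 ? byGroup : a.label.localeCompare(b.label);
  });

const categoryLabels: string[] = categoryList.map((c) => c.label);

// Distinct audience types in first-seen order.
const audienceTypeLabels: string[] = Array.from(new Set(categoryList.map((c) => c.audience_type)));

console.log(categoryLabels);
// ["Auto", "Recreational Vehicles", "Financial Services", "Insurance"]
console.log(audienceTypeLabels);
// ["In-Market Buyers", "Financial Planners"]
```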

Lookup Functions

| Function | Signature | Purpose |
| --- | --- | --- |
| findParentCategoryByLabel | (label: string) => TaxonomyType \| null | Look up a parent category by its human-readable label (case-insensitive) |
| findParentCategoryById | (id: string) => TaxonomyType \| null | Look up a parent category by its string ID (e.g., "auto", "technology") |
| computeParentCategoryHash | (taxonomyId: string) => string | Compute an 8-char hex content hash for a parent category definition |
| detectParentCategory | (topicName: string, learnedOverride?, taxonomyTypeHint?) => TaxonomyType | Detect the best-matching parent category from topic text (in local-fallback.ts) |

TaxonomyType Interface

Each parent category definition follows this shape:

interface TaxonomyType {
  id: string;               // e.g. "auto", "technology"
  label: string;            // e.g. "Auto", "Business Technology"
  taxonomy_type: string;    // e.g. "Automotive & Vehicles" (one of 13 groups)
  audience_type: string;    // e.g. "In-Market Buyers"
  iab: {
    code: string;           // e.g. "IAB2", "IAB19"
    content: string;        // e.g. "Automotive", "Technology & Computing"
  };
  description: string;      // One-line domain summary
  domain_signals: string[]; // Keywords for classification matching
  example_topics: string[]; // Representative brands/topics
  routing_note?: string;    // Optional note (used by Cross-Cutting)
}

Usage Examples

import {
  PARENT_CATEGORIES,
  TAXONOMY_TYPE_ORDER,
  PARENT_CATEGORY_LIST,
  findParentCategoryByLabel,
  findParentCategoryById,
} from "@/lib/constants/taxonomy-types";

// Look up by label
const auto = findParentCategoryByLabel("Auto");
// => { id: "auto", label: "Auto", taxonomy_type: "Automotive & Vehicles", ... }

// Look up by ID
const tech = findParentCategoryById("technology");
// => { id: "technology", label: "Business Technology", taxonomy_type: "Technology & Telecom", ... }

// Iterate all 13 taxonomy type groups
for (const group of TAXONOMY_TYPE_ORDER) {
  const categories = PARENT_CATEGORY_LIST.filter(c => c.taxonomy_type === group);
  console.log(`${group}: ${categories.map(c => c.label).join(", ")}`);
}

// Get the IAB code for a category
const insurance = findParentCategoryByLabel("Insurance");
console.log(insurance?.iab.code); // "IAB13-7"

Engine Version and Staleness

Each topic is stamped with the ENGINE_VERSION at classification time (currently "2.5"). Each parent category definition also has a content hash (PARENT_CATEGORY_HASHES) computed from its label, description, domain signals, and example topics.

When either the engine version or a category hash changes, topics classified under the old version/hash are considered "stale" or "outdated" and can be reclassified to bring them up to date.
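
A staleness check along these lines can be sketched as a pure function. The ClassifiedTopic field names, the isStale helper, and the hash values below are hypothetical placeholders; only the rule itself (a version mismatch or a hash mismatch marks the topic stale) comes from the description above.

```typescript
// Hypothetical field names for the values stamped on a topic at classification time.
interface ClassifiedTopic {
  parent_category_id: string;
  engine_version: string;
  category_hash: string;
}

// A topic is stale when its engine version or its category hash no longer matches.
function isStale(
  topic: ClassifiedTopic,
  currentEngineVersion: string,
  currentHashes: Record<string, string>,
): boolean {
  if (topic.engine_version !== currentEngineVersion) return true;
  return currentHashes[topic.parent_category_id] !== topic.category_hash;
}

// Placeholder hashes, not real values from PARENT_CATEGORY_HASHES.
const hashes: Record<string, string> = { auto: "a1b2c3d4", insurance: "0f9e8d7c" };

const upToDate: ClassifiedTopic = { parent_category_id: "auto", engine_version: "2.5", category_hash: "a1b2c3d4" };
const oldEngine: ClassifiedTopic = { parent_category_id: "auto", engine_version: "2.4", category_hash: "a1b2c3d4" };
const oldHash: ClassifiedTopic = { parent_category_id: "insurance", engine_version: "2.5", category_hash: "deadbeef" };

console.log(isStale(upToDate, "2.5", hashes)); // false
console.log(isStale(oldEngine, "2.5", hashes)); // true
console.log(isStale(oldHash, "2.5", hashes)); // true
```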

See the Architecture Overview for details on the reclassification workflow.

Next Steps

  • Field Mapping -- Understand the DB-to-TS column swap that affects how you query these structures
  • Architecture Overview -- See how the taxonomy fits into the classification pipeline
  • Quick Start -- Set up your development environment and classify your first topic