Quick Start (Developer)

This guide walks you through setting up a local AudienceGPT development environment from scratch. By the end, you will have a running dev server, a connected database, and your first audience segment classified.

AudienceGPT is a Next.js 16 application that uses Bun as its JavaScript runtime, Neon PostgreSQL for storage, Clerk for authentication, and the Anthropic Claude API for AI-powered classification.

Prerequisites

Install the following before proceeding:

| Tool | Version | Purpose |
| --- | --- | --- |
| Bun | 1.x+ | JavaScript runtime and package manager |
| Git | 2.x+ | Version control |
| Neon account | -- | Serverless PostgreSQL with pgvector |
| Clerk account | -- | Authentication and multi-tenant organizations |
| Anthropic API key | -- | Claude AI for classification |

Bun, not Node

AudienceGPT uses Bun as its runtime. Do not use npm, yarn, or pnpm. All scripts, tests, and the migration runner expect Bun.

Clone and Install

git clone https://github.com/your-org/taxonomy_advisor.git
cd taxonomy_advisor
bun install

Bun reads package.json and installs all dependencies into node_modules/. The lockfile is bun.lock.

Environment Setup

Create a .env.local file in the project root with the following required variables:

# Anthropic — powers the classification API and chat
ANTHROPIC_API_KEY=sk-ant-api03-...

# Neon PostgreSQL — connection string from your Neon dashboard
DATABASE_URL=postgresql://user:pass@ep-xxx.us-east-2.aws.neon.tech/neondb?sslmode=require

# Clerk — from your Clerk dashboard > API Keys
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=pk_test_...
CLERK_SECRET_KEY=sk_test_...

# Encryption — passphrase for pgcrypto credential encryption (32+ characters)
CREDENTIALS_ENCRYPTION_KEY=your-long-random-passphrase-at-least-32-chars
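
A startup check can catch a missing or too-short variable before the first request fails. The sketch below shows one way to do that. `findEnvProblems` is a hypothetical helper, not part of the AudienceGPT codebase; only the five variable names and the 32-character minimum come from this guide.

```typescript
// Hypothetical helper: reports problems with the required variables.
// The variable names come from this guide; everything else is illustrative.
const REQUIRED_VARS = [
  "ANTHROPIC_API_KEY",
  "DATABASE_URL",
  "NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY",
  "CLERK_SECRET_KEY",
  "CREDENTIALS_ENCRYPTION_KEY",
] as const;

type Env = Record<string, string | undefined>;

function findEnvProblems(env: Env): string[] {
  const problems: string[] = [];

  // Flag every required variable that is unset or blank.
  for (const name of REQUIRED_VARS) {
    const value = env[name];
    if (value === undefined || value.trim() === "") {
      problems.push(`${name} is missing`);
    }
  }

  // The guide asks for a passphrase of 32 or more characters.
  const passphrase = env["CREDENTIALS_ENCRYPTION_KEY"];
  if (passphrase !== undefined && passphrase.trim() !== "" && passphrase.length < 32) {
    problems.push("CREDENTIALS_ENCRYPTION_KEY is shorter than 32 characters");
  }

  return problems;
}
```

Calling `findEnvProblems(process.env)` at the top of a script and exiting when the list is non-empty gives a clear message instead of a downstream connection or authentication error.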

Optional Variables

# Override the classification model (default: claude-sonnet-4-6)
CLASSIFICATION_MODEL=claude-sonnet-4-6

# Stripe billing integration
STRIPE_SECRET_KEY=sk_test_...
STRIPE_WEBHOOK_SECRET=whsec_...

# Brandfetch for org logo lookups
NEXT_PUBLIC_BRANDFETCH_CLIENT_ID=...

Never commit secrets

.env.local is listed in .gitignore. Never commit this file or hardcode API keys in source code. The pre-commit hook will reject commits containing secret patterns.

Database Setup

Create a Neon Project

  1. Sign in to the Neon Console
  2. Create a new project (any region)
  3. Copy the connection string -- it looks like postgresql://user:pass@ep-xxx.region.aws.neon.tech/neondb?sslmode=require
  4. Paste it as DATABASE_URL in your .env.local
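
A pasted connection string is easy to truncate, so a quick structural check can help. The sketch below uses the standard `URL` class to confirm the scheme, host, database name, and `sslmode=require` parameter shown in step 3. `checkDatabaseUrl` is a hypothetical helper, and it does not attempt a real connection.

```typescript
// Hypothetical check for a Neon-style connection string.
// It inspects the URL shape only; it never opens a database connection.
interface DbUrlCheck {
  ok: boolean;
  reason?: string;
}

function checkDatabaseUrl(raw: string): DbUrlCheck {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return { ok: false, reason: "not a valid URL" };
  }

  if (url.protocol !== "postgresql:" && url.protocol !== "postgres:") {
    return { ok: false, reason: `unexpected scheme ${url.protocol}` };
  }
  // pathname is "/" plus the database name, so length 1 means no name.
  if (url.hostname === "" || url.pathname.length <= 1) {
    return { ok: false, reason: "missing host or database name" };
  }
  if (url.searchParams.get("sslmode") !== "require") {
    return { ok: false, reason: "sslmode=require is not set" };
  }
  return { ok: true };
}
```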

The pgvector extension is enabled automatically by the first migration.

Run Migrations

AudienceGPT uses numbered SQL migration files (0001 through 0034) in the migrations/ directory. The migration runner tracks applied migrations in a _migrations table.

# Apply all pending migrations
bun run migrate

# Check which migrations have been applied
bun run migrate:status

Each migration is idempotent (uses IF NOT EXISTS guards), so re-running is safe. On failure, the migration is not recorded and can be retried.
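
The "pending" set that a runner like this works from can be pictured as a simple set difference. The sketch below is an illustration only: `pendingMigrations` is a hypothetical function, and it assumes the _migrations table records each applied file by name, which this guide does not confirm.

```typescript
// Hypothetical sketch: which migration files still need to run.
// Assumes applied migrations are identified by file name.
function pendingMigrations(files: string[], applied: Iterable<string>): string[] {
  const done = new Set(applied);
  return files
    .filter((name) => name.endsWith(".sql") && !done.has(name))
    .sort(); // zero-padded NNNN prefixes sort correctly as plain strings
}
```

Because a failed migration is never added to the applied set, it stays in the pending list and is picked up again on the next run.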

Migration numbering

Migrations are named NNNN_description.sql (e.g., 0001_initial.sql through 0034_user_behavior_column.sql). Some numbers have two files (e.g., 0015, 0017, 0018, 0019) due to concurrent feature branches -- this is expected and harmless.
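
To make the naming convention concrete, the sketch below parses a file name into its number and description and lists the numbers shared by more than one file. Both functions are hypothetical, and the regular expression is an assumption about which characters a description may contain.

```typescript
// Hypothetical parser for the NNNN_description.sql convention.
interface MigrationName {
  number: number;
  description: string;
}

function parseMigrationName(file: string): MigrationName | null {
  // Four digits, an underscore, a description, then ".sql".
  const match = /^(\d{4})_([A-Za-z0-9_-]+)\.sql$/.exec(file);
  if (match === null) return null;
  const [, digits = "", description = ""] = match;
  return { number: Number(digits), description };
}

// Returns the migration numbers used by two or more files, in ascending order.
function sharedNumbers(files: string[]): number[] {
  const counts = new Map<number, number>();
  for (const file of files) {
    const parsed = parseMigrationName(file);
    if (parsed === null) continue; // ignore files outside the convention
    counts.set(parsed.number, (counts.get(parsed.number) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .filter(([, count]) => count > 1)
    .map(([num]) => num)
    .sort((a, b) => a - b);
}
```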

Running the Dev Server

bun run dev

The development server starts at http://localhost:3000. Hot module replacement is active, so most code changes appear in the browser without a full reload.

Kill dev before building

If you plan to run bun run build, kill the dev server first to avoid resource contention. Builds use Turbopack and are resource-intensive (~20 minutes on constrained hardware).

First Classification Test

  1. Open http://localhost:3000/classify in your browser
  2. Sign in through Clerk (create your first user and organization)
  3. Type a topic name in the chat, for example: Salesforce CRM
  4. The chatbot will gather context, then classify the topic through the 7-layer engine
  5. Review the result -- you should see:
    • Parent Category: Business Technology
    • Taxonomy Type: Technology & Telecom
    • Segment Type: B2B
    • Audience Type: Business Decision Makers
    • Platform-specific segment names (Trade Desk, LiveRamp, Internal)
    • 7-layer classification breakdown (intent, intensity, awareness, segment, sensitivity, buyer journey, composite score)

If the Anthropic API is unavailable, the system falls back to a deterministic local classifier that uses regex-based pattern matching.
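
The fallback idea can be sketched as a rule table that is tried only when the remote call fails. Everything in the block below is hypothetical: the rule patterns, the "Financial Services" and "Uncategorized" labels, and the function names are illustrative and are not taken from the project's classifier. The remote call is modeled as a synchronous function to keep the sketch short; a real API call would be asynchronous.

```typescript
// Hypothetical rule table: patterns and most categories are illustrative only.
interface LocalRule {
  pattern: RegExp;
  parentCategory: string;
}

const RULES: LocalRule[] = [
  { pattern: /\b(crm|saas|erp|cloud)\b/i, parentCategory: "Business Technology" },
  { pattern: /\b(mortgage|loan|credit card)\b/i, parentCategory: "Financial Services" },
];

// Deterministic: the same topic always matches the same first rule.
function classifyLocally(topic: string): string {
  for (const rule of RULES) {
    if (rule.pattern.test(topic)) return rule.parentCategory;
  }
  return "Uncategorized";
}

// Try the remote classifier first; use the local rules if it throws.
function classify(topic: string, remote: (topic: string) => string): string {
  try {
    return remote(topic);
  } catch {
    return classifyLocally(topic);
  }
}
```

The trade-off in this design is coverage versus availability: the local rules only recognize what their patterns anticipate, but they never depend on the network.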

Running Tests

AudienceGPT uses Bun's built-in test runner with happy-dom for DOM simulation:

# Run the full test suite
bun test

# Watch mode for development
bun test --watch

# Generate coverage report
bun test --coverage

Tests use TaxonomyMemoryStore (an in-memory implementation of ITaxonomyStore) so they run without a database connection.
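
The pattern behind this is an interface with interchangeable implementations, one backed by the database and one backed by a plain map. The sketch below shows the shape of the idea with a deliberately tiny, hypothetical interface; `TopicStore`, `MemoryTopicStore`, and their methods are assumptions for illustration and do not mirror the real ITaxonomyStore API.

```typescript
// Hypothetical, much smaller than the project's real ITaxonomyStore.
interface Topic {
  id: string;
  name: string;
  parentCategory: string;
}

interface TopicStore {
  save(topic: Topic): void;
  findByName(name: string): Topic | undefined;
  count(): number;
}

// In-memory implementation: no connection string, no network, no cleanup.
class MemoryTopicStore implements TopicStore {
  private readonly topics = new Map<string, Topic>();

  save(topic: Topic): void {
    // Store a copy so later mutation by the caller does not leak in.
    this.topics.set(topic.name.toLowerCase(), { ...topic });
  }

  findByName(name: string): Topic | undefined {
    return this.topics.get(name.toLowerCase());
  }

  count(): number {
    return this.topics.size;
  }
}
```

Code under test that accepts the interface rather than a concrete class can be handed a fresh in-memory store in each test, which keeps tests isolated from one another.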

Linting and Type Checking

# ESLint
bun run lint

# TypeScript type check (no emit)
bun run typecheck

# Detect unused deps, exports, and files
bun run knip

All four checks (lint, typecheck, knip, bun test) run automatically via the pre-commit hook. Every check must pass before a commit is accepted.

Do not skip the pre-commit hook

The project uses Husky with lint-staged. The pre-commit hook runs ESLint on changed *.{ts,tsx} files, then tsc --noEmit, knip, and bun test. Do not use --no-verify to bypass it.

Building for Production

# Production build (Next.js 16 + Turbopack)
bun run build

# Start the production server
bun run start

The build compiles all pages, API routes, and server components. The output is in .next/.

Useful Scripts Reference

| Script | Command | Description |
| --- | --- | --- |
| Dev server | bun run dev | Start development server on port 3000 |
| Build | bun run build | Production build with Turbopack |
| Start | bun run start | Run production server |
| Lint | bun run lint | ESLint check |
| Typecheck | bun run typecheck | TypeScript strict mode check |
| Knip | bun run knip | Detect unused code |
| Test | bun test | Run test suite |
| Migrate | bun run migrate | Apply pending DB migrations |
| Migrate status | bun run migrate:status | Show applied/pending migrations |
| Reclassify | bun run reclassify-global | Reclassify outdated global topics |
| Encrypt creds | bun run encrypt-credentials | Encrypt plaintext sync credentials |
| Refresh activations | bun run refresh-activations | Refresh stale LiveRamp activations |

Project Structure Overview

taxonomy_advisor/
  src/
    app/                # Next.js App Router pages and API routes
      api/              # Server-side API routes
      (app)/            # Application pages (classify, library, sync, etc.)
    components/         # React components (advisor-shell, library, admin, etc.)
    config/             # Configuration (models, pricing, dimensions)
    hooks/              # React hooks (use-classification, use-import, etc.)
    lib/
      auth/             # Auth guards, API keys, SDK keys
      classification/   # 7-layer engine, reclassify, batch-classifier
      connections/      # Platform connections, activation, sync runs
      constants/        # Taxonomy types, engine version, classification constants
      import/           # Import pipeline constants and review scoring
      naming/           # DSP names, template engine
      signals/          # UCP signal generation
      storage/          # ITaxonomyStore interface, Neon + Memory implementations
      sync/             # Sync pipeline (fetch, process, types)
      admin/            # Admin job handlers, global topic management
    proxy.ts            # Clerk auth middleware
  migrations/           # Numbered SQL migration files
  scripts/              # CLI scripts (reclassify, seed trees, etc.)
  public/               # Static assets

Next Steps

  • Architecture Overview -- Understand the system design, classification pipeline, and multi-tenant architecture
  • Field Mapping -- Learn the critical DB-to-TS field swap convention before writing any SQL
  • Taxonomy Structure -- Explore the full taxonomy hierarchy, parent categories, and tree structure