Quick Start (Developer)

This guide walks you through setting up a local AudienceGPT development environment from scratch. By the end, you will have a running dev server, a connected database, and your first audience segment classified.

AudienceGPT is a Next.js 16 application that uses Bun as its JavaScript runtime, Neon PostgreSQL for storage, Clerk for authentication, and the Anthropic Claude API for AI-powered classification.

Prerequisites

Install the following before proceeding:

Tool	Version	Purpose
Bun	1.x+	JavaScript runtime and package manager
Git	2.x+	Version control
Neon account	--	Serverless PostgreSQL with pgvector
Clerk account	--	Authentication and multi-tenant organizations
Anthropic API key	--	Claude AI for classification

Bun, not Node

AudienceGPT uses Bun as its runtime. Do not use npm, yarn, or pnpm. All scripts, tests, and the migration runner expect Bun.

Clone and Install

git clone https://github.com/your-org/taxonomy_advisor.git
cd taxonomy_advisor
bun install

Bun reads package.json and installs all dependencies into node_modules/. The lockfile is bun.lock.

Environment Setup

Create a .env.local file in the project root with the following required variables:

# Anthropic — powers the classification API and chat
ANTHROPIC_API_KEY=sk-ant-api03-...

# Neon PostgreSQL — connection string from your Neon dashboard
DATABASE_URL=postgresql://user:pass@ep-xxx.us-east-2.aws.neon.tech/neondb?sslmode=require

# Clerk — from your Clerk dashboard > API Keys
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=pk_test_...
CLERK_SECRET_KEY=sk_test_...

# Encryption — passphrase for pgcrypto credential encryption (32+ characters)
CREDENTIALS_ENCRYPTION_KEY=your-long-random-passphrase-at-least-32-chars

Optional Variables

# Override the classification model (default: claude-sonnet-4-6)
CLASSIFICATION_MODEL=claude-sonnet-4-6

# Stripe billing integration
STRIPE_SECRET_KEY=sk_test_...
STRIPE_WEBHOOK_SECRET=whsec_...

# Brandfetch for org logo lookups
NEXT_PUBLIC_BRANDFETCH_CLIENT_ID=...
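
One way to catch a missing or malformed variable early is to validate the environment once at startup. The sketch below is illustrative and not taken from the AudienceGPT codebase: `loadEnv`, `AppEnv`, and `REQUIRED_VARS` are hypothetical names. The only facts it relies on are the variable names, the 32-character minimum, and the default model listed above.

```typescript
// Hypothetical startup check; names are illustrative, not project code.
const REQUIRED_VARS = [
  "ANTHROPIC_API_KEY",
  "DATABASE_URL",
  "NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY",
  "CLERK_SECRET_KEY",
  "CREDENTIALS_ENCRYPTION_KEY",
] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];
type AppEnv = Record<RequiredVar, string> & { CLASSIFICATION_MODEL: string };

function loadEnv(env: Record<string, string | undefined>): AppEnv {
  // Collect every missing variable so the error reports them all at once.
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required variables: ${missing.join(", ")}`);
  }

  const passphrase = env.CREDENTIALS_ENCRYPTION_KEY as string;
  if (passphrase.length < 32) {
    throw new Error("CREDENTIALS_ENCRYPTION_KEY must be at least 32 characters");
  }

  return {
    ANTHROPIC_API_KEY: env.ANTHROPIC_API_KEY as string,
    DATABASE_URL: env.DATABASE_URL as string,
    NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY: env.NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY as string,
    CLERK_SECRET_KEY: env.CLERK_SECRET_KEY as string,
    CREDENTIALS_ENCRYPTION_KEY: passphrase,
    // Optional override; falls back to the documented default model.
    CLASSIFICATION_MODEL: env.CLASSIFICATION_MODEL || "claude-sonnet-4-6",
  };
}
```

A caller would typically pass `process.env` to `loadEnv` once, early in startup, so that a misconfigured `.env.local` fails with a single readable error rather than a later runtime exception.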

Never commit secrets

.env.local is listed in .gitignore. Never commit this file or hardcode API keys in source code. The pre-commit hook will reject commits containing secret patterns.

Database Setup

Create a Neon Project

1. Sign in to Neon Console
2. Create a new project (any region)
3. Copy the connection string -- it looks like postgresql://user:pass@ep-xxx.region.aws.neon.tech/neondb?sslmode=require
4. Paste it as DATABASE_URL in your .env.local
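
A quick sanity check on the pasted connection string can save a confusing first migration run. The sketch below uses the standard `URL` class to confirm the string has the shape shown in the steps above. `checkDatabaseUrl` is a hypothetical helper, not part of the project.

```typescript
// Hypothetical helper: reports shape problems in a Neon connection string.
function checkDatabaseUrl(raw: string): string[] {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return ["not a valid URL"];
  }

  const problems: string[] = [];
  if (url.protocol !== "postgresql:" && url.protocol !== "postgres:") {
    problems.push(`unexpected protocol ${url.protocol}`);
  }
  if (!url.username || !url.password) {
    problems.push("missing user or password");
  }
  // pathname is "/neondb" for a complete string and "" or "/" when the database is missing.
  if (url.pathname.length <= 1) {
    problems.push("missing database name");
  }
  if (url.searchParams.get("sslmode") !== "require") {
    problems.push("sslmode=require is not set");
  }
  return problems;
}
```

An empty array means the string matches the expected shape. It does not prove that the credentials are valid or that the database is reachable.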

The pgvector extension is enabled automatically by the first migration.

Run Migrations

AudienceGPT keeps its schema in numbered SQL migration files (0001 through 0034) in the migrations/ directory. The migration runner tracks applied migrations in a _migrations table.

# Apply all pending migrations
bun run migrate

# Check which migrations have been applied
bun run migrate:status

Each migration is idempotent (uses IF NOT EXISTS guards), so re-running is safe. On failure, the migration is not recorded and can be retried.

Migration numbering

Migrations are named NNNN_description.sql (e.g., 0001_initial.sql through 0034_user_behavior_column.sql). Some numbers have two files (e.g., 0015, 0017, 0018, 0019) due to concurrent feature branches -- this is expected and harmless.
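
The tracking behaviour described above can be modelled in a few lines. This is an illustration under stated assumptions, not the project's runner: `pendingMigrations` and `applyAll` are hypothetical names, the `_migrations` table is represented by an in-memory `Set`, and the runner is shown synchronously for brevity. Sorting by full filename is one way to keep files that share a number in a stable order.

```typescript
// Hypothetical model of a migration runner; not the project's implementation.
const MIGRATION_NAME = /^\d{4}_[a-z0-9_]+\.sql$/;

function pendingMigrations(files: string[], applied: Set<string>): string[] {
  return files
    .filter((file) => MIGRATION_NAME.test(file) && !applied.has(file))
    .sort(); // zero-padded prefixes make lexicographic order match numeric order
}

function applyAll(
  files: string[],
  applied: Set<string>,
  run: (file: string) => void,
): string[] {
  const done: string[] = [];
  for (const file of pendingMigrations(files, applied)) {
    run(file); // throws on failure, so the two lines below are skipped
    applied.add(file); // record only after the migration succeeded
    done.push(file);
  }
  return done;
}
```

Because a failed file is never added to `applied`, calling `applyAll` again resumes from that same file, which mirrors the retry behaviour described above.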

Running the Dev Server

bun run dev

The development server starts at http://localhost:3000. Hot module replacement is active, so code changes appear in the browser without restarting the server.

Kill dev before building

If you plan to run bun run build, kill the dev server first to avoid resource contention. Builds use Turbopack and are resource-intensive (~20 minutes on constrained hardware).

First Classification Test

1. Open http://localhost:3000/classify in your browser
2. Sign in through Clerk (create your first user and organization)
3. Type a topic name in the chat, for example: Salesforce CRM
4. The chatbot will gather context, then classify the topic through the 7-layer engine
5. Review the result -- you should see:
- Parent Category: Business Technology
- Taxonomy Type: Technology & Telecom
- Segment Type: B2B
- Audience Type: Business Decision Makers
- Platform-specific segment names (Trade Desk, LiveRamp, Internal)
- 7-layer classification breakdown (intent, intensity, awareness, segment, sensitivity, buyer journey, composite score)

If the Anthropic API is unavailable, the system falls back to a deterministic local classifier that uses regex-based pattern matching.
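
The fallback behaviour can be pictured as a try/catch around the remote call. The sketch below is purely illustrative: the rule table, `classifyLocally`, `classifyWithFallback`, the injected `remote` function, and the "Uncategorized" default are all hypothetical. The real classifier's patterns and return shape are not documented in this guide.

```typescript
// Hypothetical sketch of a remote-first classifier with a regex fallback.
type Classification = { parentCategory: string; source: "remote" | "local" };

// Illustrative rules only; the project's actual patterns are not shown here.
const LOCAL_RULES: Array<[RegExp, string]> = [
  [/\b(crm|saas|erp|salesforce)\b/i, "Business Technology"],
];

function classifyLocally(topic: string): Classification {
  // The first matching pattern wins, so the same input always gives the same result.
  for (const [pattern, parentCategory] of LOCAL_RULES) {
    if (pattern.test(topic)) {
      return { parentCategory, source: "local" };
    }
  }
  return { parentCategory: "Uncategorized", source: "local" };
}

async function classifyWithFallback(
  topic: string,
  remote: (topic: string) => Promise<string>,
): Promise<Classification> {
  try {
    return { parentCategory: await remote(topic), source: "remote" };
  } catch {
    // Remote API unavailable: fall back to the deterministic local rules.
    return classifyLocally(topic);
  }
}
```

Passing the remote call in as a parameter keeps the sketch testable without network access; a test can supply a function that always rejects to exercise the fallback path.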

Running Tests

AudienceGPT uses Bun's built-in test runner with happy-dom for DOM simulation:

# Run the full test suite
bun test

# Watch mode for development
bun test --watch

# Generate coverage report
bun test --coverage

Tests use TaxonomyMemoryStore (an in-memory implementation of ITaxonomyStore) so they run without a database connection.
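
Coding against a storage interface is what lets tests swap in an in-memory implementation. The sketch below shows the pattern only: `TopicStore`, `MemoryTopicStore`, `recategorize`, and their methods are hypothetical stand-ins, since the actual `ITaxonomyStore` methods are not listed in this guide.

```typescript
// Hypothetical stand-ins for the interface/in-memory pattern; not project code.
type Topic = { id: string; name: string; parentCategory: string };

interface TopicStore {
  save(topic: Topic): Promise<void>;
  findByName(name: string): Promise<Topic | undefined>;
}

class MemoryTopicStore implements TopicStore {
  // Keyed by lower-cased name so lookups are case-insensitive.
  private topics = new Map<string, Topic>();

  async save(topic: Topic): Promise<void> {
    this.topics.set(topic.name.toLowerCase(), { ...topic });
  }

  async findByName(name: string): Promise<Topic | undefined> {
    return this.topics.get(name.toLowerCase());
  }
}

// Code under test depends only on the interface, never on a concrete database.
async function recategorize(
  store: TopicStore,
  name: string,
  parentCategory: string,
): Promise<boolean> {
  const topic = await store.findByName(name);
  if (!topic) return false;
  await store.save({ ...topic, parentCategory });
  return true;
}
```

In a test, `recategorize` would receive a fresh `MemoryTopicStore`, while production code would pass a database-backed implementation of the same interface.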

Linting and Type Checking

# ESLint
bun run lint

# TypeScript type check (no emit)
bun run typecheck

# Detect unused deps, exports, and files
bun run knip

All four checks (lint, typecheck, knip, bun test) run automatically via the pre-commit hook. Every check must pass before a commit is accepted.

Do not skip the pre-commit hook

The project uses Husky with lint-staged. The pre-commit hook runs ESLint on changed *.{ts,tsx} files, then tsc --noEmit, knip, and bun test. Do not use --no-verify to bypass it.

Building for Production

# Production build (Next.js 16 + Turbopack)
bun run build

# Start the production server
bun run start

The build compiles all pages, API routes, and server components. The output is in .next/.

Useful Scripts Reference

Script	Command	Description
Dev server	`bun run dev`	Start development server on port 3000
Build	`bun run build`	Production build with Turbopack
Start	`bun run start`	Run production server
Lint	`bun run lint`	ESLint check
Typecheck	`bun run typecheck`	TypeScript strict mode check
Knip	`bun run knip`	Detect unused code
Test	`bun test`	Run test suite
Migrate	`bun run migrate`	Apply pending DB migrations
Migrate status	`bun run migrate:status`	Show applied/pending migrations
Reclassify	`bun run reclassify-global`	Reclassify outdated global topics
Encrypt creds	`bun run encrypt-credentials`	Encrypt plaintext sync credentials
Refresh activations	`bun run refresh-activations`	Refresh stale LiveRamp activations

Project Structure Overview

taxonomy_advisor/
  src/
    app/              # Next.js App Router pages and API routes
      api/            # Server-side API routes
      (app)/          # Application pages (classify, library, sync, etc.)
    components/       # React components (advisor-shell, library, admin, etc.)
    config/           # Configuration (models, pricing, dimensions)
    hooks/            # React hooks (use-classification, use-import, etc.)
    lib/
      auth/           # Auth guards, API keys, SDK keys
      classification/ # 7-layer engine, reclassify, batch-classifier
      connections/    # Platform connections, activation, sync runs
      constants/      # Taxonomy types, engine version, classification constants
      import/         # Import pipeline constants and review scoring
      naming/         # DSP names, template engine
      signals/        # UCP signal generation
      storage/        # ITaxonomyStore interface, Neon + Memory implementations
      sync/           # Sync pipeline (fetch, process, types)
      admin/          # Admin job handlers, global topic management
    proxy.ts          # Clerk auth middleware
  migrations/         # Numbered SQL migration files
  scripts/            # CLI scripts (reclassify, seed trees, etc.)
  public/             # Static assets

Next Steps

Architecture Overview -- Understand the system design, classification pipeline, and multi-tenant architecture
Field Mapping -- Learn the critical DB-to-TS field swap convention before writing any SQL
Taxonomy Structure -- Explore the full taxonomy hierarchy, parent categories, and tree structure
