Quick Start (Developer)
This guide walks you through setting up a local AudienceGPT development environment from scratch. By the end, you will have a running dev server, a connected database, and your first audience segment classified.
AudienceGPT is a Next.js 16 application that uses Bun as its JavaScript runtime, Neon PostgreSQL for storage, Clerk for authentication, and the Anthropic Claude API for AI-powered classification.
Prerequisites
Install the following before proceeding:
| Tool | Version | Purpose |
|---|---|---|
| Bun | 1.x+ | JavaScript runtime and package manager |
| Git | 2.x+ | Version control |
| Neon account | -- | Serverless PostgreSQL with pgvector |
| Clerk account | -- | Authentication and multi-tenant organizations |
| Anthropic API key | -- | Claude AI for classification |
AudienceGPT uses Bun as its runtime. Do not use npm, yarn, or pnpm. All scripts, tests, and the migration runner expect Bun.
Clone and Install
git clone https://github.com/your-org/taxonomy_advisor.git
cd taxonomy_advisor
bun install
Bun reads package.json and installs all dependencies into node_modules/. The lockfile is bun.lock.
Environment Setup
Create a .env.local file in the project root with the following required variables:
# Anthropic — powers the classification API and chat
ANTHROPIC_API_KEY=sk-ant-api03-...
# Neon PostgreSQL — connection string from your Neon dashboard
DATABASE_URL=postgresql://user:pass@ep-xxx.us-east-2.aws.neon.tech/neondb?sslmode=require
# Clerk — from your Clerk dashboard > API Keys
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=pk_test_...
CLERK_SECRET_KEY=sk_test_...
# Encryption — passphrase for pgcrypto credential encryption (32+ characters)
CREDENTIALS_ENCRYPTION_KEY=your-long-random-passphrase-at-least-32-chars
Optional Variables
# Override the classification model (default: claude-sonnet-4-6)
CLASSIFICATION_MODEL=claude-sonnet-4-6
# Stripe billing integration
STRIPE_SECRET_KEY=sk_test_...
STRIPE_WEBHOOK_SECRET=whsec_...
# Brandfetch for org logo lookups
NEXT_PUBLIC_BRANDFETCH_CLIENT_ID=...
.env.local is listed in .gitignore. Never commit this file or hardcode API keys in source code. The pre-commit hook will reject commits containing secret patterns.
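A small startup check can catch a missing or malformed variable before the dev server boots. The following is a minimal sketch, not code from the repository: `checkEnv`, `EnvReport`, and `REQUIRED_VARS` are hypothetical names. The variable names, the 32-character minimum, and the `claude-sonnet-4-6` default are taken from the lists above.

```typescript
// Hypothetical helper for illustration; not part of the AudienceGPT codebase.
type Env = Record<string, string | undefined>;

// Required variables as listed in this guide.
const REQUIRED_VARS = [
  "ANTHROPIC_API_KEY",
  "DATABASE_URL",
  "NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY",
  "CLERK_SECRET_KEY",
  "CREDENTIALS_ENCRYPTION_KEY",
] as const;

interface EnvReport {
  missing: string[];           // required variables that are unset or blank
  problems: string[];          // variables that are set but look wrong
  classificationModel: string; // override, or the documented default
}

function checkEnv(env: Env): EnvReport {
  const missing = REQUIRED_VARS.filter((name) => !env[name]?.trim());
  const problems: string[] = [];

  const key = env.CREDENTIALS_ENCRYPTION_KEY ?? "";
  if (key && key.length < 32) {
    problems.push("CREDENTIALS_ENCRYPTION_KEY should be at least 32 characters");
  }

  return {
    missing: [...missing],
    problems,
    classificationModel: env.CLASSIFICATION_MODEL?.trim() || "claude-sonnet-4-6",
  };
}

// Demo with placeholder values only; one variable is omitted on purpose.
const report = checkEnv({
  ANTHROPIC_API_KEY: "placeholder",
  DATABASE_URL: "postgresql://user:pass@host/neondb?sslmode=require",
  CLERK_SECRET_KEY: "placeholder",
  CREDENTIALS_ENCRYPTION_KEY: "too-short",
});
console.log(report);
```

In a real project the argument would be `process.env`. Taking the environment as a parameter keeps the function easy to test.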
Database Setup
Create a Neon Project
- Sign in to Neon Console
- Create a new project (any region)
- Copy the connection string -- it looks like postgresql://user:pass@ep-xxx.region.aws.neon.tech/neondb?sslmode=require
- Paste it as DATABASE_URL in your .env.local
The pgvector extension is enabled automatically by the first migration.
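To confirm that a pasted connection string has the expected shape, it can be parsed with the standard `URL` class. This is a sketch under assumptions: `describeConnectionString` and `ConnectionInfo` are hypothetical names, and the sample string is the placeholder from the Environment Setup section, not a real endpoint.

```typescript
// Hypothetical helper for illustration; not part of the AudienceGPT codebase.
interface ConnectionInfo {
  host: string;
  database: string;
  sslRequired: boolean;
}

function describeConnectionString(raw: string): ConnectionInfo {
  const url = new URL(raw); // throws a TypeError if the string is not a valid URL
  if (url.protocol !== "postgresql:" && url.protocol !== "postgres:") {
    throw new Error(`Unexpected protocol: ${url.protocol}`);
  }
  return {
    host: url.hostname,
    database: url.pathname.replace(/^\//, ""), // drop the leading slash
    sslRequired: url.searchParams.get("sslmode") === "require",
  };
}

// The password is deliberately left out of the returned object so it is never logged.
const info = describeConnectionString(
  "postgresql://user:pass@ep-xxx.us-east-2.aws.neon.tech/neondb?sslmode=require",
);
console.log(info);
```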
Run Migrations
AudienceGPT uses 34 numbered SQL migration files in the migrations/ directory. The migration runner tracks applied migrations in a _migrations table.
# Apply all pending migrations
bun run migrate
# Check which migrations have been applied
bun run migrate:status
Each migration is idempotent (uses IF NOT EXISTS guards), so re-running is safe. On failure, the migration is not recorded and can be retried.
Migrations are named NNNN_description.sql (e.g., 0001_initial.sql through 0034_user_behavior_column.sql). Some numbers have two files (e.g., 0015, 0017, 0018, 0019) due to concurrent feature branches -- this is expected and harmless.
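The tracking behavior described above can be sketched in memory. This is not the runner behind `bun run migrate`, whose implementation is not shown in this guide. `pendingMigrations` and `runMigrations` are hypothetical names, tracking by file name rather than by number is an assumption, and every file name in the demo other than `0001_initial.sql` is made up.

```typescript
// Hypothetical sketch; the real migration runner may work differently.
type ApplyFn = (fileName: string) => void;

function pendingMigrations(files: string[], applied: Set<string>): string[] {
  return files
    .filter((f) => /^\d{4}_.+\.sql$/.test(f)) // keep only NNNN_description.sql
    .filter((f) => !applied.has(f))           // skip anything already recorded
    .sort();                                  // zero-padded prefix sorts lexicographically
}

function runMigrations(files: string[], applied: Set<string>, apply: ApplyFn): string[] {
  const ran: string[] = [];
  for (const file of pendingMigrations(files, applied)) {
    apply(file);       // if this throws, the file is never recorded below
    applied.add(file); // stands in for an INSERT into the _migrations table
    ran.push(file);
  }
  return ran;
}

// Demo: two files share the number 0015, and one of them fails.
const files = ["0002_add_topics.sql", "0001_initial.sql", "0015_a.sql", "0015_b.sql", "README.md"];
const applied = new Set<string>(["0001_initial.sql"]);
const failing: ApplyFn = (f) => {
  if (f === "0015_b.sql") throw new Error("simulated failure");
};

let failureMessage = "";
try {
  runMigrations(files, applied, failing);
} catch (e) {
  failureMessage = (e as Error).message;
}
console.log([...applied].sort(), failureMessage);
```

Because the failed file was never recorded, a second call to `runMigrations` with a working `apply` function picks up only `0015_b.sql`.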
Running the Dev Server
bun run dev
The development server starts at http://localhost:3000. Hot module replacement is active, so code changes appear in the browser without a manual reload.
If you plan to run bun run build, stop the dev server first to avoid resource contention. Builds use Turbopack and are resource-intensive (about 20 minutes on constrained hardware).
First Classification Test
- Open http://localhost:3000/classify in your browser
- Sign in through Clerk (create your first user and organization)
- Type a topic name in the chat, for example: Salesforce CRM
- The chatbot will gather context, then classify the topic through the 7-layer engine
- Review the result -- you should see:
  - Parent Category: Business Technology
  - Taxonomy Type: Technology & Telecom
  - Segment Type: B2B
  - Audience Type: Business Decision Makers
  - Platform-specific segment names (Trade Desk, LiveRamp, Internal)
  - 7-layer classification breakdown (intent, intensity, awareness, segment, sensitivity, buyer journey, composite score)
If the Anthropic API is unavailable, the system falls back to a deterministic local classifier that uses regex-based pattern matching.
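The fallback pattern can be illustrated with a short sketch. Everything here is hypothetical: the `Classification` shape, the single regex rule, the `"Uncategorized"` and `"Unknown"` defaults, and the function names are invented for illustration and do not reflect the actual local classifier or its rule set.

```typescript
// Hypothetical types and rules for illustration only.
interface Classification {
  parentCategory: string;
  segmentType: string;
  source: "api" | "local-fallback";
}

type RemoteClassifier = (topic: string) => Promise<Classification>;

// One made-up rule; a real rule set would be far larger.
const FALLBACK_RULES: Array<{ pattern: RegExp; parentCategory: string; segmentType: string }> = [
  { pattern: /\b(crm|erp|saas)\b/i, parentCategory: "Business Technology", segmentType: "B2B" },
];

// Deterministic: the same topic always yields the same result.
function classifyLocally(topic: string): Classification {
  const rule = FALLBACK_RULES.find((r) => r.pattern.test(topic));
  return {
    parentCategory: rule?.parentCategory ?? "Uncategorized",
    segmentType: rule?.segmentType ?? "Unknown",
    source: "local-fallback",
  };
}

async function classifyWithFallback(
  topic: string,
  remote: RemoteClassifier,
): Promise<Classification> {
  try {
    return await remote(topic);
  } catch {
    // Any remote failure (network error, outage) falls through to the local rules.
    return classifyLocally(topic);
  }
}

// Demo: simulate an unavailable API.
const unavailable: RemoteClassifier = async () => {
  throw new Error("API unavailable");
};
const resultPromise = classifyWithFallback("Salesforce CRM", unavailable);
resultPromise.then((r) => console.log(r));
```

Passing the remote classifier in as a parameter makes the fallback path easy to exercise in a test without network access.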
Running Tests
AudienceGPT uses Bun's built-in test runner with happy-dom for DOM simulation:
# Run the full test suite
bun test
# Watch mode for development
bun test --watch
# Generate coverage report
bun test --coverage
Tests use TaxonomyMemoryStore (an in-memory implementation of ITaxonomyStore) so they run without a database connection.
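The idea behind an in-memory store is that application code depends on an interface, and tests swap in a `Map`-backed implementation. The sketch below shows that pattern with a deliberately simplified, hypothetical interface: `TopicStore`, `MemoryTopicStore`, `Topic`, and their methods are invented here and are not the real `ITaxonomyStore` or `TaxonomyMemoryStore` API.

```typescript
// Hypothetical, simplified stand-ins; the real ITaxonomyStore has its own methods.
interface Topic {
  id: string;
  name: string;
  parentCategory: string;
}

interface TopicStore {
  save(topic: Topic): Promise<void>;
  findByName(name: string): Promise<Topic | undefined>;
}

// Map-backed implementation: no database connection required.
class MemoryTopicStore implements TopicStore {
  private readonly topics = new Map<string, Topic>();

  async save(topic: Topic): Promise<void> {
    // Store a copy so later mutation by the caller does not leak into the store.
    this.topics.set(topic.name.toLowerCase(), { ...topic });
  }

  async findByName(name: string): Promise<Topic | undefined> {
    return this.topics.get(name.toLowerCase());
  }
}

async function demo(): Promise<Topic | undefined> {
  const store: TopicStore = new MemoryTopicStore();
  await store.save({ id: "t1", name: "Salesforce CRM", parentCategory: "Business Technology" });
  return store.findByName("salesforce crm"); // lookup is case-insensitive
}

const found = demo();
found.then((t) => console.log(t?.parentCategory));
```

The methods are async even though the `Map` is synchronous, so the in-memory version matches the shape a database-backed implementation would need.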
Linting and Type Checking
# ESLint
bun run lint
# TypeScript type check (no emit)
bun run typecheck
# Detect unused deps, exports, and files
bun run knip
All four checks (lint, typecheck, knip, bun test) run automatically via the pre-commit hook. Every check must pass before a commit is accepted.
The project uses Husky with lint-staged. The pre-commit hook runs ESLint on changed *.{ts,tsx} files, then tsc --noEmit, knip, and bun test. Do not use --no-verify to bypass it.
Building for Production
# Production build (Next.js 16 + Turbopack)
bun run build
# Start the production server
bun run start
The build compiles all pages, API routes, and server components. The output is in .next/.
Useful Scripts Reference
| Script | Command | Description |
|---|---|---|
| Dev server | bun run dev | Start development server on port 3000 |
| Build | bun run build | Production build with Turbopack |
| Start | bun run start | Run production server |
| Lint | bun run lint | ESLint check |
| Typecheck | bun run typecheck | TypeScript strict mode check |
| Knip | bun run knip | Detect unused code |
| Test | bun test | Run test suite |
| Migrate | bun run migrate | Apply pending DB migrations |
| Migrate status | bun run migrate:status | Show applied/pending migrations |
| Reclassify | bun run reclassify-global | Reclassify outdated global topics |
| Encrypt creds | bun run encrypt-credentials | Encrypt plaintext sync credentials |
| Refresh activations | bun run refresh-activations | Refresh stale LiveRamp activations |
Project Structure Overview
taxonomy_advisor/
  src/
    app/                # Next.js App Router pages and API routes
      api/              # Server-side API routes
      (app)/            # Application pages (classify, library, sync, etc.)
    components/         # React components (advisor-shell, library, admin, etc.)
    config/             # Configuration (models, pricing, dimensions)
    hooks/              # React hooks (use-classification, use-import, etc.)
    lib/
      auth/             # Auth guards, API keys, SDK keys
      classification/   # 7-layer engine, reclassify, batch-classifier
      connections/      # Platform connections, activation, sync runs
      constants/        # Taxonomy types, engine version, classification constants
      import/           # Import pipeline constants and review scoring
      naming/           # DSP names, template engine
      signals/          # UCP signal generation
      storage/          # ITaxonomyStore interface, Neon + Memory implementations
      sync/             # Sync pipeline (fetch, process, types)
      admin/            # Admin job handlers, global topic management
    proxy.ts            # Clerk auth middleware
  migrations/           # Numbered SQL migration files
  scripts/              # CLI scripts (reclassify, seed trees, etc.)
  public/               # Static assets
Next Steps
- Architecture Overview -- Understand the system design, classification pipeline, and multi-tenant architecture
- Field Mapping -- Learn the critical DB-to-TS field swap convention before writing any SQL
- Taxonomy Structure -- Explore the full taxonomy hierarchy, parent categories, and tree structure