Rate Limits & Quotas

AudienceGPT enforces per-organization quotas to ensure fair usage and protect shared infrastructure. Quotas are tracked monthly and apply across all authentication methods (Clerk session, API key, and SDK key). When a quota is exceeded, the API returns a 429 status code with details about the limit and reset timing.

Plan Tiers

Each organization is assigned a plan tier that determines its monthly quotas. The following tiers are available:

| Metric | Free | Pro | Enterprise |
| --- | --- | --- | --- |
| Monthly classifications | 100 | 10,000 | Unlimited |
| Monthly tokens (input + output) | 1,000,000 | 50,000,000 | Unlimited |
| Monthly import operations | 5 | 100 | Unlimited |
| Monthly exports | 10 | Unlimited | Unlimited |
> Info: All organizations currently default to the Enterprise tier with unlimited quotas. Plan assignment and billing integration are planned for a future release.

What Counts as a Classification

The following operations each consume one classification credit:

  • `POST /api/classify` -- Single topic classification
  • `POST /api/sdk/classify` -- SDK classification
  • Each row processed in an import chunk (`POST /api/import/:batchId/chunk`)
  • Each item processed in a sync page (`POST /api/connections/:id/sync/run/:runId/page`)
  • Each topic in a reclassify request (`POST /api/topics/:id/reclassify`, `POST /api/topics/reclassify`)
  • Each combination in matrix generation (`POST /api/topics/generate-matrix`)
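
Because each row, item, topic, and combination costs one credit, you can estimate a batch operation's credit consumption before running it. The helpers below are an illustrative sketch, not part of the API or SDK:

```typescript
// Hypothetical helpers for estimating classification credits up front.
// Per the list above: one credit per import row, and one credit per
// combination in a generated matrix.

/** Credits for an import: one per row. */
function creditsForImport(rowCount: number): number {
  return rowCount;
}

/** Credits for matrix generation: the cartesian product of dimension sizes. */
function creditsForMatrix(dimensionSizes: number[]): number {
  return dimensionSizes.reduce((product, size) => product * size, 1);
}

// A 3 x 4 x 10 matrix generates 120 combinations, so 120 credits.
console.log(creditsForMatrix([3, 4, 10]));
```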

Per-Endpoint Limits

Beyond monthly quotas, individual endpoints enforce their own operational limits to protect system stability.

Classification Limits

| Limit | Value | Endpoint |
| --- | --- | --- |
| Max messages per request | 50 | `POST /api/classify`, `POST /api/sdk/classify` |
| Max file size (brief upload) | 10 MB | `POST /api/analyze-brief`, `POST /api/sdk/analyze-brief` |

Import Limits

| Limit | Value | Description |
| --- | --- | --- |
| Max rows per import | 50,000 | Total rows in a single import batch |
| Chunk size | 500 | Rows processed per chunk request |
| Max LLM classifications per chunk | 10 | AI-powered classifications capped to avoid route timeouts |
| Max retries per chunk | 3 | Client-side retry with exponential backoff |
| Chunk processing timeout | 60 seconds | Server-side max duration per chunk |
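
A client that honors these limits splits rows into chunks of 500 and retries each chunk up to 3 times with exponential backoff, processing chunks sequentially. This is a minimal sketch; `uploadChunk` is a hypothetical wrapper around `POST /api/import/:batchId/chunk`, not an SDK function:

```typescript
// Split rows into chunks matching the 500-row chunk size limit.
function splitIntoChunks<T>(rows: T[], chunkSize = 500): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < rows.length; i += chunkSize) {
    chunks.push(rows.slice(i, i + chunkSize));
  }
  return chunks;
}

// Upload chunks sequentially, retrying each up to maxRetries times
// with exponential backoff (1 s, 2 s, 4 s, ...).
async function importWithRetries<T>(
  rows: T[],
  uploadChunk: (chunk: T[]) => Promise<void>,
  maxRetries = 3
): Promise<void> {
  for (const chunk of splitIntoChunks(rows)) {
    for (let attempt = 0; ; attempt++) {
      try {
        await uploadChunk(chunk);
        break; // chunk accepted; move to the next one sequentially
      } catch (err) {
        if (attempt + 1 >= maxRetries) throw err;
        await new Promise((r) => setTimeout(r, Math.pow(2, attempt) * 1000));
      }
    }
  }
}
```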

Sync Limits

| Limit | Value | Description |
| --- | --- | --- |
| Default page size | 500 | Items fetched per sync page |
| Max pages per sync run | 300 | Maximum pages (150,000 items total) |
| Max LLM classifications per page | 15 | AI-powered classifications per sync page |
| Fetch timeout | 15 seconds | Timeout for external API page fetch |
| Max retries per page | 3 | Client-side retry with exponential backoff |

Connection Limits

| Limit | Value | Description |
| --- | --- | --- |
| Max connections per org | 20 | Across all platforms and directions |
| Connection test timeout | 30 seconds | Max duration for `POST /api/connections/:id/test` |
| LiveRamp API timeout | 30 seconds | Per-request timeout for LiveRamp API calls |
| LiveRamp auth timeout | 10 seconds | OAuth token request timeout |
| Trade Desk API timeout | 30 seconds | Per-request timeout for Trade Desk API calls |
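
Because the platform enforces fixed timeouts on these calls (30 seconds for most, 10 seconds for LiveRamp auth), a client can mirror the behavior with its own timeout wrapper. The `withTimeout` helper below is an illustrative sketch, not part of the SDK:

```typescript
// Race a promise against a deadline; rejects with a descriptive error
// when the deadline elapses first. The timer is always cleared.
function withTimeout<T>(
  promise: Promise<T>,
  ms: number,
  label: string
): Promise<T> {
  let timer: ReturnType<typeof setTimeout>;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer!));
}
```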

Activation Limits

| Limit | Value | Description |
| --- | --- | --- |
| Push batch size | 25 | Segments pushed per batch request |
| Max push retries per segment | 3 | Retry attempts for failed platform pushes |
| Max errors per activation run | 500 | Error records stored per push run |
| Single deactivate timeout | 30 seconds | Max duration for single deactivation |
| Single refresh timeout | 30 seconds | Max duration for single refresh |
| Bulk deactivate timeout | 60 seconds | Max duration for bulk deactivation |
| Bulk refresh timeout | 60 seconds | Max duration for bulk refresh |
| Max bulk operations per request | 500 | Activation IDs in bulk deactivate/refresh |

Topic Limits

| Limit | Value | Description |
| --- | --- | --- |
| Bulk reclassify (local) | 500 | Max topic IDs per bulk reclassify request |
| Bulk reclassify (LLM) | 100 | Max topic IDs when using AI-powered reclassify |
| Bulk delete | No hard limit | Array of topic IDs in DELETE request body |
| Catalog add by IDs | 500 | Max topic IDs per catalog add request |
| Matrix combinations | 5,000 | Max cartesian product size |
| Matrix minimum dimensions | 2 | At least 2 dimensions required |
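
Validating matrix dimensions client-side avoids a rejected request: at least two dimensions, and a cartesian product of at most 5,000. A minimal sketch, with `validateMatrixDimensions` as a hypothetical helper rather than an SDK function:

```typescript
const MAX_MATRIX_COMBINATIONS = 5000;
const MIN_MATRIX_DIMENSIONS = 2;

// Returns null when the dimensions are valid, or an error message
// describing which matrix limit would be violated.
function validateMatrixDimensions(dimensionSizes: number[]): string | null {
  if (dimensionSizes.length < MIN_MATRIX_DIMENSIONS) {
    return "At least 2 dimensions are required";
  }
  const combinations = dimensionSizes.reduce((p, n) => p * n, 1);
  if (combinations > MAX_MATRIX_COMBINATIONS) {
    return `Matrix would produce ${combinations} combinations (max ${MAX_MATRIX_COMBINATIONS})`;
  }
  return null; // within limits
}
```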

Key Management Limits

| Limit | Value | Description |
| --- | --- | --- |
| API keys per org | 25 | Maximum active API keys |
| SDK keys per org | No hard limit | Publishable keys for client-side use |

Export Limits

| Limit | Value | Description |
| --- | --- | --- |
| Export timeout | 55 seconds | Safety timeout for streaming exports |
| Page size (keyset pagination) | 500 | Rows fetched per internal page |
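
Keyset pagination fetches each page starting after the last row of the previous one, 500 rows at a time, instead of using numeric offsets. The generic loop below sketches the pattern; `fetchPage` is a stand-in for whatever data source is being paged, not an actual export API:

```typescript
// Stream all rows from a keyset-paginated source. Each fetch asks for
// rows with id greater than the last id seen, up to pageSize rows.
async function* keysetPaginate<T extends { id: number }>(
  fetchPage: (afterId: number, limit: number) => Promise<T[]>,
  pageSize = 500
): AsyncGenerator<T> {
  let cursor = 0;
  while (true) {
    const page = await fetchPage(cursor, pageSize);
    if (page.length === 0) return; // no rows left
    yield* page;
    cursor = page[page.length - 1].id; // keyset cursor: last seen id
  }
}
```

Compared to offset pagination, the keyset cursor keeps each page query cheap even deep into a large export, which is why it pairs well with the 55-second streaming timeout.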

Quota Exceeded Response

When a monthly quota is exceeded, the API returns:

```http
HTTP/1.1 429 Too Many Requests
Content-Type: application/json

{
  "error": "Monthly classifications quota exceeded",
  "limit": 10000,
  "used": 10000,
  "remaining": 0,
  "resetsAt": "2026-03-01T00:00:00.000Z"
}
```
| Field | Type | Description |
| --- | --- | --- |
| `error` | string | Human-readable error message |
| `limit` | number | Monthly limit for this metric |
| `used` | number | Current month's usage |
| `remaining` | number | Remaining quota (always 0 when exceeded) |
| `resetsAt` | string | ISO 8601 timestamp for when the quota resets (first of next month, UTC) |
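
Since `resetsAt` is always the first of the next month at midnight UTC, a client can compute the same instant locally from any reference date. A minimal sketch (`quotaResetDate` is an illustrative helper, not an SDK function):

```typescript
// First of the month following `now`, at 00:00:00 UTC, as an ISO 8601 string.
// Date.UTC rolls month 12 over into January of the next year automatically.
function quotaResetDate(now: Date): string {
  const reset = new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + 1, 1)
  );
  return reset.toISOString();
}
```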

Usage Tracking

AudienceGPT tracks usage per organization in the `usage_daily_summary` table with the following metrics:

| Metric | Description |
| --- | --- |
| `total_classifications` | Classification operations (classify, import, sync, reclassify) |
| `total_tokens` | Anthropic API tokens consumed (input + output) |
| `total_exports` | Export operations |
| `total_web_searches` | Web search tool uses during classification |
| `total_requests` | Total API requests |
| `total_cost` | Internal cost (Anthropic API charges) |

Usage data is aggregated daily and available through the admin dashboard for organization owners and super admins.

Checking Your Usage

Currently, usage is visible through the admin dashboard. A dedicated API endpoint for programmatic usage queries is planned.

Handling Rate Limits

Retry Strategy

When you receive a 429 response:

  1. Check the error message to determine which quota was exceeded
  2. For monthly quotas -- wait until the reset date or upgrade your plan
  3. For operational limits -- reduce batch sizes or add delays between requests

TypeScript Example

```typescript
// Read the API key from the environment; substitute your own key management.
const API_KEY = process.env.AUDIENCEGPT_API_KEY;

async function classifyWithRetry(
  messages: Array<{ role: string; content: string }>,
  maxRetries = 3
) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch("https://app.audiencegpt.com/api/classify", {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ messages }),
    });

    if (response.status === 429) {
      const error = await response.json();
      if (error.resetsAt) {
        // Monthly quota exceeded -- no point retrying
        throw new Error(
          `Monthly quota exceeded. Resets at ${error.resetsAt}`
        );
      }
      // Transient rate limit -- retry with exponential backoff
      const delay = Math.pow(2, attempt) * 1000;
      await new Promise((r) => setTimeout(r, delay));
      continue;
    }

    if (!response.ok) {
      throw new Error((await response.json()).error);
    }

    return response.json();
  }

  throw new Error("Max retries exceeded");
}
```

Best Practices

  • Use bulk endpoints -- Prefer `POST /api/topics/reclassify` (bulk) over individual reclassify calls
  • Use local classification when possible -- Matrix generation and import with `useLLM: false` use the local engine and are faster
  • Use filter-based catalog add -- `POST /api/topics/catalog` with filters is more efficient than adding topics by ID one at a time
  • Monitor your usage -- Check the admin dashboard regularly to avoid unexpected quota exhaustion
  • Process imports in order -- The chunked import pipeline is designed for sequential chunk processing; parallel chunk requests may cause conflicts

Next Steps