Rate Limits & Quotas

AudienceGPT enforces per-organization quotas to ensure fair usage and protect shared infrastructure. Quotas are tracked monthly and apply across all authentication methods (Clerk session, API key, and SDK key). When a quota is exceeded, the API returns a 429 status code with details about the limit and reset timing.

Plan Tiers

Each organization is assigned a plan tier that determines its monthly quotas. The following tiers are available:

| Metric | Free | Pro | Enterprise |
| --- | --- | --- | --- |
| Monthly classifications | 100 | 10,000 | Unlimited |
| Monthly tokens (input + output) | 1,000,000 | 50,000,000 | Unlimited |
| Monthly import operations | 5 | 100 | Unlimited |
| Monthly exports | 10 | Unlimited | Unlimited |
> Info: All organizations currently default to the Enterprise tier with unlimited quotas. Plan assignment and billing integration are planned for a future release.

What Counts as a Classification

The following operations each consume one classification credit:

  • `POST /api/classify` -- Single topic classification
  • `POST /api/sdk/classify` -- SDK classification
  • Each row processed in an import chunk (`POST /api/import/:batchId/chunk`)
  • Each item processed in a sync page (`POST /api/connections/:id/sync/run/:runId/page`)
  • Each topic in a reclassify request (`POST /api/topics/:id/reclassify`, `POST /api/topics/reclassify`)
  • Each combination in matrix generation (`POST /api/topics/generate-matrix`)
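
Because each row, item, topic, and combination costs one credit, you can estimate a batch operation's credit consumption before running it. The helpers below are an illustrative sketch, not part of the API or SDK:

```typescript
// Hypothetical helpers for estimating classification credits up front.
// Per the list above: one credit per import row, and one credit per
// combination in a generated matrix.

/** Credits for an import: one per row. */
function creditsForImport(rowCount: number): number {
  return rowCount;
}

/** Credits for matrix generation: the cartesian product of dimension sizes. */
function creditsForMatrix(dimensionSizes: number[]): number {
  return dimensionSizes.reduce((product, size) => product * size, 1);
}

// A 3 x 4 x 10 matrix generates 120 combinations, so 120 credits.
console.log(creditsForMatrix([3, 4, 10]));
```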

Per-Endpoint Limits

Beyond monthly quotas, individual endpoints enforce their own operational limits to protect system stability.

Classification Limits

| Limit | Value | Endpoint |
| --- | --- | --- |
| Max messages per request | 50 | `POST /api/classify`, `POST /api/sdk/classify` |
| Max file size (brief upload) | 10 MB | `POST /api/analyze-brief`, `POST /api/sdk/analyze-brief` |

Import Limits

| Limit | Value | Description |
| --- | --- | --- |
| Max rows per import | 50,000 | Total rows in a single import batch |
| Chunk size | 500 | Rows processed per chunk request |
| Max LLM classifications per chunk | 10 | AI-powered classifications capped to avoid route timeouts |
| Max retries per chunk | 3 | Client-side retry with exponential backoff |
| Chunk processing timeout | 60 seconds | Server-side max duration per chunk |
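
A client that honors these limits splits rows into chunks of 500 and retries each chunk up to 3 times with exponential backoff, processing chunks sequentially. This is a minimal sketch; `uploadChunk` is a hypothetical wrapper around `POST /api/import/:batchId/chunk`, not an SDK function:

```typescript
// Split rows into chunks matching the 500-row chunk size limit.
function splitIntoChunks<T>(rows: T[], chunkSize = 500): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < rows.length; i += chunkSize) {
    chunks.push(rows.slice(i, i + chunkSize));
  }
  return chunks;
}

// Upload chunks sequentially, retrying each up to maxRetries times
// with exponential backoff (1 s, 2 s, 4 s, ...).
async function importWithRetries<T>(
  rows: T[],
  uploadChunk: (chunk: T[]) => Promise<void>,
  maxRetries = 3
): Promise<void> {
  for (const chunk of splitIntoChunks(rows)) {
    for (let attempt = 0; ; attempt++) {
      try {
        await uploadChunk(chunk);
        break; // chunk accepted; move to the next one sequentially
      } catch (err) {
        if (attempt + 1 >= maxRetries) throw err;
        await new Promise((r) => setTimeout(r, Math.pow(2, attempt) * 1000));
      }
    }
  }
}
```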

Sync Limits

| Limit | Value | Description |
| --- | --- | --- |
| Default page size | 500 | Items fetched per sync page |
| Max pages per sync run | 300 | Maximum pages (150,000 items total) |
| Max LLM classifications per page | 15 | AI-powered classifications per sync page |
| Fetch timeout | 15 seconds | Timeout for external API page fetch |
| Max retries per page | 3 | Client-side retry with exponential backoff |

Connection Limits

| Limit | Value | Description |
| --- | --- | --- |
| Max connections per org | 20 | Across all platforms and directions |
| Connection test timeout | 30 seconds | Max duration for `POST /api/connections/:id/test` |
| LiveRamp API timeout | 30 seconds | Per-request timeout for LiveRamp API calls |
| LiveRamp auth timeout | 10 seconds | OAuth token request timeout |
| Trade Desk API timeout | 30 seconds | Per-request timeout for Trade Desk API calls |
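
Because the platform enforces fixed timeouts on these calls (30 seconds for most, 10 seconds for LiveRamp auth), a client can mirror the behavior with its own timeout wrapper. The `withTimeout` helper below is an illustrative sketch, not part of the SDK:

```typescript
// Race a promise against a deadline; rejects with a descriptive error
// when the deadline elapses first. The timer is always cleared.
function withTimeout<T>(
  promise: Promise<T>,
  ms: number,
  label: string
): Promise<T> {
  let timer: ReturnType<typeof setTimeout>;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer!));
}
```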

Activation Limits

| Limit | Value | Description |
| --- | --- | --- |
| Push batch size | 25 | Segments pushed per batch request |
| Max push retries per segment | 3 | Retry attempts for failed platform pushes |
| Max errors per activation run | 500 | Error records stored per push run |
| Single deactivate timeout | 30 seconds | Max duration for single deactivation |
| Single refresh timeout | 30 seconds | Max duration for single refresh |
| Bulk deactivate timeout | 60 seconds | Max duration for bulk deactivation |
| Bulk refresh timeout | 60 seconds | Max duration for bulk refresh |
| Max bulk operations per request | 500 | Activation IDs in bulk deactivate/refresh |

Topic Limits

| Limit | Value | Description |
| --- | --- | --- |
| Bulk reclassify (local) | 500 | Max topic IDs per bulk reclassify request |
| Bulk reclassify (LLM) | 100 | Max topic IDs when using AI-powered reclassify |
| Bulk delete | No hard limit | Array of topic IDs in DELETE request body |
| Catalog add by IDs | 500 | Max topic IDs per catalog add request |
| Matrix combinations | 5,000 | Max cartesian product size |
| Matrix minimum dimensions | 2 | At least 2 dimensions required |
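
Validating matrix dimensions client-side avoids a rejected request: at least two dimensions, and a cartesian product of at most 5,000. A minimal sketch, with `validateMatrixDimensions` as a hypothetical helper rather than an SDK function:

```typescript
const MAX_MATRIX_COMBINATIONS = 5000;
const MIN_MATRIX_DIMENSIONS = 2;

// Returns null when the dimensions are valid, or an error message
// describing which matrix limit would be violated.
function validateMatrixDimensions(dimensionSizes: number[]): string | null {
  if (dimensionSizes.length < MIN_MATRIX_DIMENSIONS) {
    return "At least 2 dimensions are required";
  }
  const combinations = dimensionSizes.reduce((p, n) => p * n, 1);
  if (combinations > MAX_MATRIX_COMBINATIONS) {
    return `Matrix would produce ${combinations} combinations (max ${MAX_MATRIX_COMBINATIONS})`;
  }
  return null; // within limits
}
```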

Key Management Limits

| Limit | Value | Description |
| --- | --- | --- |
| API keys per org | 25 | Maximum active API keys |
| SDK keys per org | No hard limit | Publishable keys for client-side use |

Export Limits

| Limit | Value | Description |
| --- | --- | --- |
| Export timeout | 55 seconds | Safety timeout for streaming exports |
| Page size (keyset pagination) | 500 | Rows fetched per internal page |
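
Keyset pagination fetches each page starting after the last row of the previous one, 500 rows at a time, instead of using numeric offsets. The generic loop below sketches the pattern; `fetchPage` is a stand-in for whatever data source is being paged, not an actual export API:

```typescript
// Stream all rows from a keyset-paginated source. Each fetch asks for
// rows with id greater than the last id seen, up to pageSize rows.
async function* keysetPaginate<T extends { id: number }>(
  fetchPage: (afterId: number, limit: number) => Promise<T[]>,
  pageSize = 500
): AsyncGenerator<T> {
  let cursor = 0;
  while (true) {
    const page = await fetchPage(cursor, pageSize);
    if (page.length === 0) return; // no rows left
    yield* page;
    cursor = page[page.length - 1].id; // keyset cursor: last seen id
  }
}
```

Compared to offset pagination, the keyset cursor keeps each page query cheap even deep into a large export, which is why it pairs well with the 55-second streaming timeout.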

Quota Exceeded Response

When a monthly quota is exceeded, the API returns:

```http
HTTP/1.1 429 Too Many Requests
Content-Type: application/json

{
  "error": "Monthly classifications quota exceeded",
  "limit": 10000,
  "used": 10000,
  "remaining": 0,
  "resetsAt": "2026-03-01T00:00:00.000Z"
}
```
| Field | Type | Description |
| --- | --- | --- |
| `error` | string | Human-readable error message |
| `limit` | number | Monthly limit for this metric |
| `used` | number | Current month's usage |
| `remaining` | number | Remaining quota (always 0 when exceeded) |
| `resetsAt` | string | ISO 8601 timestamp for when the quota resets (first of next month, UTC) |
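
Since `resetsAt` is always the first of the next month at midnight UTC, a client can compute the same instant locally from any reference date. A minimal sketch (`quotaResetDate` is an illustrative helper, not an SDK function):

```typescript
// First of the month following `now`, at 00:00:00 UTC, as an ISO 8601 string.
// Date.UTC rolls month 12 over into January of the next year automatically.
function quotaResetDate(now: Date): string {
  const reset = new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + 1, 1)
  );
  return reset.toISOString();
}
```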

Usage Tracking

AudienceGPT tracks usage per organization in the `usage_daily_summary` table with the following metrics:

| Metric | Description |
| --- | --- |
| `total_classifications` | Classification operations (classify, import, sync, reclassify) |
| `total_tokens` | Anthropic API tokens consumed (input + output) |
| `total_exports` | Export operations |
| `total_web_searches` | Web search tool uses during classification |
| `total_requests` | Total API requests |
| `total_cost` | Internal cost (Anthropic API charges) |

Usage data is aggregated daily and available through the admin dashboard for organization owners and super admins.

Checking Your Usage

Currently, usage is visible through the admin dashboard. A dedicated API endpoint for programmatic usage queries is planned.

Handling Rate Limits

Retry Strategy

When you receive a 429 response:

  1. Check the error message to determine which quota was exceeded
  2. For monthly quotas -- wait until the reset date or upgrade your plan
  3. For operational limits -- reduce batch sizes or add delays between requests

TypeScript Example

```typescript
// Read the API key from the environment; substitute your own key management.
const API_KEY = process.env.AUDIENCEGPT_API_KEY;

async function classifyWithRetry(
  messages: Array<{ role: string; content: string }>,
  maxRetries = 3
) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch("https://app.audiencegpt.com/api/classify", {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ messages }),
    });

    if (response.status === 429) {
      const error = await response.json();
      if (error.resetsAt) {
        // Monthly quota exceeded -- no point retrying
        throw new Error(
          `Monthly quota exceeded. Resets at ${error.resetsAt}`
        );
      }
      // Transient rate limit -- retry with exponential backoff
      const delay = Math.pow(2, attempt) * 1000;
      await new Promise((r) => setTimeout(r, delay));
      continue;
    }

    if (!response.ok) {
      throw new Error((await response.json()).error);
    }

    return response.json();
  }

  throw new Error("Max retries exceeded");
}
```

Best Practices

  • Use bulk endpoints -- Prefer `POST /api/topics/reclassify` (bulk) over individual reclassify calls
  • Use local classification when possible -- Matrix generation and import with `useLLM: false` use the local engine and are faster
  • Use filter-based catalog add -- `POST /api/topics/catalog` with filters is more efficient than adding topics by ID one at a time
  • Monitor your usage -- Check the admin dashboard regularly to avoid unexpected quota exhaustion
  • Process imports in order -- The chunked import pipeline is designed for sequential chunk processing; parallel chunk requests may cause conflicts

Next Steps