# Rate Limits & Quotas
AudienceGPT enforces per-organization quotas to ensure fair usage and protect shared infrastructure. Quotas are tracked monthly and apply across all authentication methods (Clerk session, API key, and SDK key). When a quota is exceeded, the API returns a 429 status code with details about the limit and reset timing.
## Plan Tiers
Each organization is assigned a plan tier that determines its monthly quotas. The following tiers are available:
| Metric | Free | Pro | Enterprise |
|---|---|---|---|
| Monthly classifications | 100 | 10,000 | Unlimited |
| Monthly tokens (input + output) | 1,000,000 | 50,000,000 | Unlimited |
| Monthly import operations | 5 | 100 | Unlimited |
| Monthly exports | 10 | Unlimited | Unlimited |
All organizations currently default to the Enterprise tier with unlimited quotas. Plan assignment and billing integration are planned for a future release.
## What Counts as a Classification

The following operations each consume one classification credit:

- `POST /api/classify` -- Single topic classification
- `POST /api/sdk/classify` -- SDK classification
- Each row processed in an import chunk (`POST /api/import/:batchId/chunk`)
- Each item processed in a sync page (`POST /api/connections/:id/sync/run/:runId/page`)
- Each topic in a reclassify request (`POST /api/topics/:id/reclassify`, `POST /api/topics/reclassify`)
- Each combination in matrix generation (`POST /api/topics/generate-matrix`)
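The rules above make credit consumption easy to budget up front. The following sketch (a hypothetical helper, not part of any official SDK) estimates the credits a planned workload will consume:

```typescript
// Hypothetical helper: estimate classification credits for a planned
// workload. Per the rules above, each classify call, import row, sync
// item, reclassified topic, and matrix combination costs one credit.
interface Workload {
  singleClassifications: number; // POST /api/classify calls
  importRows: number;            // rows across all import batches
  syncItems: number;             // items across all sync pages
  reclassifiedTopics: number;    // topics in reclassify requests
  matrixCombinations: number;    // combinations in matrix generation
}

function estimateCredits(w: Workload): number {
  return (
    w.singleClassifications +
    w.importRows +
    w.syncItems +
    w.reclassifiedTopics +
    w.matrixCombinations
  );
}
```

For example, a Pro-tier organization (10,000 credits/month) that plans a 5,000-row import plus 2,000 sync items and 100 single classifications would consume 7,100 credits, leaving 2,900 for everything else that month.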
## Per-Endpoint Limits

Beyond monthly quotas, individual endpoints enforce their own operational limits to protect system stability.

### Classification Limits
| Limit | Value | Endpoint |
|---|---|---|
| Max messages per request | 50 | POST /api/classify, POST /api/sdk/classify |
| Max file size (brief upload) | 10 MB | POST /api/analyze-brief, POST /api/sdk/analyze-brief |
### Import Limits
| Limit | Value | Description |
|---|---|---|
| Max rows per import | 50,000 | Total rows in a single import batch |
| Chunk size | 500 | Rows processed per chunk request |
| Max LLM classifications per chunk | 10 | AI-powered classifications capped to avoid route timeouts |
| Max retries per chunk | 3 | Client-side retry with exponential backoff |
| Chunk processing timeout | 60 seconds | Server-side max duration per chunk |
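A client can plan chunk boundaries before uploading, enforcing the row cap and chunk size from the table above. This is an illustrative sketch (the function name is hypothetical, not an SDK API):

```typescript
// Illustrative sketch: split an import into 500-row chunks and enforce
// the 50,000-row cap client-side before any chunk requests are made.
const MAX_ROWS = 50_000;
const CHUNK_SIZE = 500;

function planChunks(totalRows: number): Array<{ start: number; end: number }> {
  if (totalRows > MAX_ROWS) {
    throw new Error(`Import of ${totalRows} rows exceeds the ${MAX_ROWS}-row limit`);
  }
  const chunks: Array<{ start: number; end: number }> = [];
  for (let start = 0; start < totalRows; start += CHUNK_SIZE) {
    chunks.push({ start, end: Math.min(start + CHUNK_SIZE, totalRows) });
  }
  return chunks;
}
```

Each planned chunk then maps to one `POST /api/import/:batchId/chunk` request, processed sequentially as recommended under Best Practices.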
### Sync Limits
| Limit | Value | Description |
|---|---|---|
| Default page size | 500 | Items fetched per sync page |
| Max pages per sync run | 300 | Maximum pages (150,000 items total) |
| Max LLM classifications per page | 15 | AI-powered classifications per sync page |
| Fetch timeout | 15 seconds | Timeout for external API page fetch |
| Max retries per page | 3 | Client-side retry with exponential backoff |
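Given the page size and per-run page cap above, you can check whether a source will sync in a single run. A minimal sketch (hypothetical helper, not an SDK API):

```typescript
// Sketch: how many 500-item pages a sync needs, and whether that fits
// within the 300-page-per-run cap (150,000 items) documented above.
const PAGE_SIZE = 500;
const MAX_PAGES = 300;

function syncPlan(itemCount: number): { pages: number; fitsInOneRun: boolean } {
  const pages = Math.ceil(itemCount / PAGE_SIZE);
  return { pages, fitsInOneRun: pages <= MAX_PAGES };
}
```

Sources larger than 150,000 items would need more than one sync run to cover fully.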
### Connection Limits
| Limit | Value | Description |
|---|---|---|
| Max connections per org | 20 | Across all platforms and directions |
| Connection test timeout | 30 seconds | Max duration for POST /api/connections/:id/test |
| LiveRamp API timeout | 30 seconds | Per-request timeout for LiveRamp API calls |
| LiveRamp auth timeout | 10 seconds | OAuth token request timeout |
| Trade Desk API timeout | 30 seconds | Per-request timeout for Trade Desk API calls |
### Activation Limits
| Limit | Value | Description |
|---|---|---|
| Push batch size | 25 | Segments pushed per batch request |
| Max push retries per segment | 3 | Retry attempts for failed platform pushes |
| Max errors per activation run | 500 | Error records stored per push run |
| Single deactivate timeout | 30 seconds | Max duration for single deactivation |
| Single refresh timeout | 30 seconds | Max duration for single refresh |
| Bulk deactivate timeout | 60 seconds | Max duration for bulk deactivation |
| Bulk refresh timeout | 60 seconds | Max duration for bulk refresh |
| Max bulk operations per request | 500 | Activation IDs in bulk deactivate/refresh |
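Since pushes are processed 25 segments at a time, a client submitting a large activation should group segment IDs accordingly. A minimal generic sketch (hypothetical helper):

```typescript
// Sketch: group segment IDs into push batches of 25, matching the
// documented push batch size, before submitting batch requests.
const PUSH_BATCH_SIZE = 25;

function toBatches<T>(items: T[], size: number = PUSH_BATCH_SIZE): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

The same helper works for bulk deactivate/refresh, where the cap per request is 500 activation IDs.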
### Topic Limits
| Limit | Value | Description |
|---|---|---|
| Bulk reclassify (local) | 500 | Max topic IDs per bulk reclassify request |
| Bulk reclassify (LLM) | 100 | Max topic IDs when using AI-powered reclassify |
| Bulk delete | No hard limit | Array of topic IDs in DELETE request body |
| Catalog add by IDs | 500 | Max topic IDs per catalog add request |
| Matrix combinations | 5,000 | Max cartesian product size |
| Matrix minimum dimensions | 2 | At least 2 dimensions required |
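The matrix limits can be validated client-side before calling `POST /api/topics/generate-matrix`, avoiding a wasted request. A sketch, assuming dimensions are passed as arrays of values (the function name is hypothetical):

```typescript
// Sketch: enforce the matrix limits above before sending the request --
// at least 2 dimensions, and a cartesian product of at most 5,000.
function validateMatrix(dimensions: string[][]): number {
  if (dimensions.length < 2) {
    throw new Error("Matrix generation requires at least 2 dimensions");
  }
  const combinations = dimensions.reduce((n, d) => n * d.length, 1);
  if (combinations > 5_000) {
    throw new Error(`Cartesian product of ${combinations} exceeds the 5,000-combination limit`);
  }
  return combinations;
}
```

Remember that each generated combination also consumes one classification credit, so a 5,000-combination matrix uses 5,000 credits.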
### Key Management Limits
| Limit | Value | Description |
|---|---|---|
| API keys per org | 25 | Maximum active API keys |
| SDK keys per org | No hard limit | Publishable keys for client-side use |
### Export Limits
| Limit | Value | Description |
|---|---|---|
| Export timeout | 55 seconds | Safety timeout for streaming exports |
| Page size (keyset pagination) | 500 | Rows fetched per internal page |
## Quota Exceeded Response

When a monthly quota is exceeded, the API returns:

```http
HTTP/1.1 429 Too Many Requests
Content-Type: application/json

{
  "error": "Monthly classifications quota exceeded",
  "limit": 10000,
  "used": 10000,
  "remaining": 0,
  "resetsAt": "2026-03-01T00:00:00.000Z"
}
```
| Field | Type | Description |
|---|---|---|
| `error` | string | Human-readable error message |
| `limit` | number | Monthly limit for this metric |
| `used` | number | Current month's usage |
| `remaining` | number | Remaining quota (always 0 when exceeded) |
| `resetsAt` | string | ISO 8601 timestamp for when the quota resets (first of next month, UTC) |
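For typed clients, the payload above can be modeled directly. A sketch (the type and guard names are illustrative, not part of an official SDK):

```typescript
// Sketch: a type for the 429 quota payload documented above, plus a
// guard to distinguish it from other error bodies at runtime.
interface QuotaExceededError {
  error: string;
  limit: number;
  used: number;
  remaining: number;
  resetsAt: string; // ISO 8601; first of next month, UTC
}

function isQuotaExceeded(body: unknown): body is QuotaExceededError {
  const b = body as Partial<QuotaExceededError>;
  return (
    typeof b?.error === "string" &&
    typeof b?.limit === "number" &&
    typeof b?.used === "number" &&
    typeof b?.resetsAt === "string"
  );
}
```

The presence of `resetsAt` is what distinguishes a monthly-quota 429 (not worth retrying) from a transient rate limit, as the retry example below relies on.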
## Usage Tracking

AudienceGPT tracks usage per organization in the `usage_daily_summary` table with the following metrics:

| Metric | Description |
|---|---|
| `total_classifications` | Classification operations (classify, import, sync, reclassify) |
| `total_tokens` | Anthropic API tokens consumed (input + output) |
| `total_exports` | Export operations |
| `total_web_searches` | Web search tool uses during classification |
| `total_requests` | Total API requests |
| `total_cost` | Internal cost (Anthropic API charges) |
Usage data is aggregated daily and available through the admin dashboard for organization owners and super admins.
## Checking Your Usage
Currently, usage is visible through the admin dashboard. A dedicated API endpoint for programmatic usage queries is planned.
## Handling Rate Limits

### Retry Strategy
When you receive a 429 response:
- Check the error message to determine which quota was exceeded
- For monthly quotas -- wait until the reset date or upgrade your plan
- For operational limits -- reduce batch sizes or add delays between requests
### TypeScript Example

```typescript
// Assumes an API key is available in the environment; see Authentication.
const API_KEY = process.env.AUDIENCEGPT_API_KEY;

async function classifyWithRetry(
  messages: Array<{ role: string; content: string }>,
  maxRetries = 3
) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch("https://app.audiencegpt.com/api/classify", {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ messages }),
    });

    if (response.status === 429) {
      const error = await response.json();
      if (error.resetsAt) {
        // Monthly quota exceeded -- no point retrying
        throw new Error(
          `Monthly quota exceeded. Resets at ${error.resetsAt}`
        );
      }
      // Transient rate limit -- retry with exponential backoff
      const delay = Math.pow(2, attempt) * 1000;
      await new Promise((r) => setTimeout(r, delay));
      continue;
    }

    if (!response.ok) {
      throw new Error((await response.json()).error);
    }

    return response.json();
  }

  throw new Error("Max retries exceeded");
}
```
## Best Practices

- Use bulk endpoints -- Prefer `POST /api/topics/reclassify` (bulk) over individual reclassify calls
- Use local classification when possible -- Matrix generation and import with `useLLM: false` use the local engine and are faster
- Use filter-based catalog add -- `POST /api/topics/catalog` with `filters` is more efficient than adding topics by ID one at a time
- Monitor your usage -- Check the admin dashboard regularly to avoid unexpected quota exhaustion
- Process imports in order -- The chunked import pipeline is designed for sequential chunk processing; parallel chunk requests may cause conflicts
## Next Steps
- Authentication -- API key and SDK key management
- Error Codes -- Complete error reference
- Import API -- Import pipeline limits and chunking
- Activations API -- Activation batch limits