Rate limits and quotas
Every cap that applies to API traffic, the headers we send back, and how to back off.
The API has several different limits, each protecting a different thing. Most requests will only ever bump into the per-minute rate limit. The other caps mainly matter when you submit a lot of jobs at once.
Using the Python SDK?
A 429 from any rate-limit hit surfaces as RateLimitError. Read exc.retry_after for the wait the server advises.
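A minimal sketch of honoring exc.retry_after. The SDK's import path is not given on this page, so the RateLimitError class below is a local stand-in that only mimics the retry_after attribute; call_with_retry, max_attempts, and the injectable sleep are hypothetical names. It also assumes that a missing retry_after means waiting will not help.

```python
import time


class RateLimitError(Exception):
    """Local stand-in for the SDK's RateLimitError (assumption: same attribute)."""

    def __init__(self, retry_after=None):
        super().__init__(f"rate limited, retry_after={retry_after}")
        self.retry_after = retry_after


def call_with_retry(fn, max_attempts=5, sleep=time.sleep):
    """Call fn(), sleeping for exc.retry_after between rate-limited attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RateLimitError as exc:
            wait = getattr(exc, "retry_after", None)
            # Assumption: no advised wait means retrying will not help.
            if attempt == max_attempts or wait is None:
                raise
            sleep(wait)
```

Passing sleep as a parameter keeps the helper easy to test; in normal use the default time.sleep applies.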
Per-minute rate limit
How many requests you can make in any rolling 60-second window.
- Default: 60 requests per minute, per API key. Platform-wide; not tied to your subscription tier.
- A second cap of 2× the per-key limit applies per user, so creating extra keys does not multiply your quota.
- Unauthenticated requests are limited per IP using the same default.
- Per-key overrides can be raised on request. Email support with the key prefix and the workload.
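To stay under the default of 60 requests per rolling 60 seconds without relying on 429s, a client can keep its own sliding window. The sketch below is a client-side approximation only; RollingWindowLimiter and its methods are hypothetical names, and the server's exact window accounting may differ.

```python
import time
from collections import deque


class RollingWindowLimiter:
    """Client-side guard: at most `limit` calls in any rolling `window` seconds."""

    def __init__(self, limit=60, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._sent = deque()  # timestamps of recent requests, oldest first

    def wait_time(self):
        """Seconds to wait before the next request fits in the window."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) < self.limit:
            return 0.0
        return self.window - (now - self._sent[0])

    def record(self):
        """Note that a request was just sent."""
        self._sent.append(self.clock())
```

Call wait_time() before each request, sleep for the returned number of seconds, then call record() once the request is sent.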
Concurrent submissions cap
How many of your submissions can be running at the same time.
- Default: 2 in-flight submissions per user, across all sources (web and API).
- The count goes up when a submission starts and back down when it finishes.
- This cap is independent of your tier's daily count: even on Premium, only this many runs can be active at once.
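One way to stay within the default of 2 running submissions is to push jobs through a worker pool of that size. This is a sketch under assumptions: run_job is a hypothetical callable that submits one job and blocks until it finishes, and the pool cannot see submissions started from the web, which count toward the same cap.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_RUNNING = 2  # default concurrent submissions cap from this page


def run_all(jobs, run_job, max_running=MAX_RUNNING):
    """Run run_job(job) for every job, never more than max_running at once."""
    with ThreadPoolExecutor(max_workers=max_running) as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(run_job, jobs))
```

Because at most two jobs are ever outstanding, this also keeps the client under the default in-flight API cap of 5.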
In-flight API requests cap
A separate gate on how many API submissions can be queued or running at once for a single user.
- Default: 5 in-flight API submissions per user (per-tier override possible).
- Counts only submissions created via the API (not web submissions).
- Mainly a safety valve so a runaway script cannot fill the queue.
Daily submissions quota
How many new submissions you can create in a rolling 24-hour window.
- Free tier: 5 submissions per day.
- Trial: 200 per day.
- Premium: 200 per day by default; individual accounts may be set higher.
The window is rolling, not calendar-based: if you submit one job at 9:00 today, that one frees up at 9:00 tomorrow.
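The rolling window can be worked through in code. The sketch below estimates when the next submission slot opens, given the timestamps of your recent submissions. The function name and arguments are hypothetical, and whether the server frees a slot at exactly 24 hours or just after is an assumption.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(hours=24)


def next_free_slot(submitted_at, daily_limit, now):
    """Return when a new submission becomes possible (now if a slot is free)."""
    # Only submissions younger than 24 hours still count against the quota.
    recent = sorted(t for t in submitted_at if now - t < WINDOW)
    if len(recent) < daily_limit:
        return now
    # Quota is full: a slot opens once enough of the oldest entries age out.
    return recent[len(recent) - daily_limit] + WINDOW


now = datetime(2024, 1, 2, 8, 0, tzinfo=timezone.utc)
first = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
subs = [first + timedelta(minutes=i) for i in range(5)]
print(next_free_slot(subs, daily_limit=5, now=now))  # 2024-01-02 09:00:00+00:00
```

With five Free-tier submissions made from 9:00 to 9:04 yesterday, the first slot reopens at 9:00 today, matching the example above.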
Per-tier parameter caps
Some submission parameters have tier-based ceilings checked at submission time. Values vary by tier; see Plans and tiers for the full comparison table.
- num_generations defaults: Free 100, Trial 500, Premium 10000.
- num_genes defaults: Free 10, Trial 100, Premium 1000.
These caps can be raised for a specific account on request. See Submission parameters for the full list of supported parameters.
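A client can check parameters against the default ceilings before submitting, which avoids a round trip that would end in tier_limit_exceeded. This sketch hardcodes the defaults listed above; TIER_CAPS and over_cap are hypothetical names, and an account with raised caps would need different values.

```python
# Default ceilings from this page; per-account overrides are not reflected.
TIER_CAPS = {
    "free": {"num_generations": 100, "num_genes": 10},
    "trial": {"num_generations": 500, "num_genes": 100},
    "premium": {"num_generations": 10000, "num_genes": 1000},
}


def over_cap(tier, params):
    """Return {name: (requested, cap)} for every capped parameter that is too high."""
    caps = TIER_CAPS[tier]
    return {
        name: (value, caps[name])
        for name, value in params.items()
        if name in caps and value > caps[name]
    }


print(over_cap("free", {"num_generations": 500, "num_genes": 10}))
# {'num_generations': (500, 100)}
```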
Credit balance
Submissions also cost credits. Once your free + paid balance falls below the negative-balance cap (default $1 below zero, with tier-specific overrides), the API stops accepting new submissions until you top up.
See Credits for how credits work and how to top up.
Platform circuit breaker
If the heavy worker queue is full across the whole platform, new submissions are rejected with a "platform at capacity" message. This is rare and resolves on its own once the queue drains.
Headers on every response
Every response includes:
- X-RateLimit-Limit, your per-minute cap.
- X-RateLimit-Remaining, how many requests you have left this window.
- X-RateLimit-Reset, Unix timestamp when the window resets.
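These headers let a client slow down before it is throttled. The sketch below reads them from a plain mapping; the function names and the reserve parameter are hypothetical. A plain dict is case-sensitive, so the keys must match exactly, whereas the headers object on a requests response is case-insensitive.

```python
import time


def seconds_until_reset(headers, now=None):
    """Seconds until the window resets, per X-RateLimit-Reset (a Unix timestamp)."""
    now = time.time() if now is None else now
    return max(0.0, int(headers["X-RateLimit-Reset"]) - now)


def should_pause(headers, reserve=1):
    """True when fewer than `reserve` requests remain in the current window."""
    return int(headers["X-RateLimit-Remaining"]) < reserve
```

When should_pause returns True, sleeping for seconds_until_reset(headers) before the next call avoids the 429 altogether.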
What an exceeded limit looks like
Each cap has its own error code so your client can react differently.
| Limit hit | Status | Error code |
|---|---|---|
| Per-minute rate limit | 429 | rate_limited |
| Concurrent submissions cap | 429 | capacity_exceeded (reason concurrent_submissions) |
| In-flight API requests cap | 429 | too_many_active_api_requests |
| Daily submissions quota | 429 | capacity_exceeded (reason daily_quota) |
| Tier parameter cap (generations / genes) | 403 | tier_limit_exceeded |
| Credit balance below cap | 402 | insufficient_credits |
| Platform queue at capacity | 429 | capacity_exceeded (reason platform_at_capacity) |
A 429 rate_limited response from the per-minute throttle includes Retry-After and retry_after in the body:
HTTP/1.1 429 Too Many Requests
Retry-After: 17
Content-Type: application/json

{
  "error": {
    "code": "rate_limited",
    "message": "Rate limit exceeded. Try again in 17 seconds.",
    "details": {"retry_after": 17, "bucket": "key"}
  }
}
Other 429 / 402 / 403 responses do not have Retry-After, because waiting will not help. Read the error code and act accordingly: top up credits for insufficient_credits, wait for an in-flight job to finish for capacity_exceeded, or reduce the request size for tier_limit_exceeded.
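Because each cap has its own error code, a client can branch on the code rather than on the status alone. The sketch below maps a parsed error body to a coarse action. The function name and action labels are hypothetical, and grouping too_many_active_api_requests with capacity_exceeded is inferred from that cap's description, not stated on this page.

```python
def next_step(status, body):
    """Map an error response to a coarse client action (hypothetical labels)."""
    error = body.get("error", {})
    code = error.get("code")
    if status == 429 and code == "rate_limited":
        # Only the per-minute throttle carries retry_after.
        return ("retry_after", error.get("details", {}).get("retry_after"))
    if code in ("capacity_exceeded", "too_many_active_api_requests"):
        # Sleeping a fixed time will not help; wait for work to finish.
        return ("wait_for_job", None)
    if code == "insufficient_credits":
        return ("top_up", None)
    if code == "tier_limit_exceeded":
        return ("reduce_request", None)
    return ("give_up", None)
```

For the sample response shown earlier, next_step(429, body) yields ("retry_after", 17).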
How to back off the per-minute rate limit
Wait at least retry_after seconds, then retry. If you keep hitting the limit, switch to exponential backoff:
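import time

import requests

# url and headers are your request URL and auth headers.
delay = 1  # seconds; the server's retry_after takes over on the first 429
while True:
    response = requests.get(url, headers=headers)
    if response.status_code != 429:
        break
    details = response.json().get("error", {}).get("details", {})
    # Never wait less than the server advises, and never more than five minutes.
    wait = min(max(delay, details.get("retry_after", delay)), 300)
    time.sleep(wait)
    delay = wait * 2  # double the wait for the next attempt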
Cap the delay at five minutes so you do not silently stall.
Reads vs writes
Reads (GET /api/v1/submissions, GET /api/v1/results) only hit the per-minute rate limit. They do not count against the daily quota, the concurrent cap, or the in-flight cap. Writes (POST /api/v1/submissions, POST /api/v1/results/{id}/continue) go through every gate above in that order.
Asking for a higher limit
Email support with the key prefix and the workload you are trying to run. We will look at your usage history and adjust if it makes sense.