API Rate Limit Comparison — Every Major AI API's Limits
Complete comparison of rate limits, quotas, and throttling policies for OpenAI, Anthropic, Google, Mistral, Cohere, and more across free, paid, and enterprise tiers. Based on official documentation and developer community data (250K+ Stack Overflow views on rate limiting topics).
By Michael Lip · Updated April 2026
Methodology
Rate limit data sourced directly from official API documentation for each provider (Anthropic docs, OpenAI platform docs, Google AI Studio, Mistral API docs, Cohere docs, Amazon Bedrock docs). Cross-referenced with Stack Overflow developer discussions (20 threads, 250K+ combined views on API rate limiting). Enterprise tier limits are approximate ranges based on published case studies and developer reports. All data verified against live API responses as of April 2026. RPM = requests per minute, TPM = tokens per minute, RPD = requests per day.
| Provider | Model | Tier | RPM | TPM | RPD | HTTP on Limit | Retry Header |
|---|---|---|---|---|---|---|---|
| Anthropic | Claude Opus | Free | 5 | 25,000 | — | 429 | Retry-After |
| Anthropic | Claude Opus | Build | 50 | 50,000 | — | 429 | Retry-After |
| Anthropic | Claude Opus | Scale | 4,000 | 400,000 | — | 429 | Retry-After |
| Anthropic | Claude Sonnet | Free | 5 | 25,000 | — | 429 | Retry-After |
| Anthropic | Claude Sonnet | Build | 1,000 | 100,000 | — | 429 | Retry-After |
| Anthropic | Claude Sonnet | Scale | 4,000 | 400,000 | — | 429 | Retry-After |
| Anthropic | Claude Haiku | Free | 5 | 25,000 | — | 429 | Retry-After |
| Anthropic | Claude Haiku | Build | 2,000 | 200,000 | — | 429 | Retry-After |
| Anthropic | Claude Haiku | Scale | 4,000 | 400,000 | — | 429 | Retry-After |
| OpenAI | GPT-4o | Tier 1 ($5+) | 500 | 30,000 | — | 429 | Retry-After |
| OpenAI | GPT-4o | Tier 2 ($50+) | 5,000 | 450,000 | — | 429 | Retry-After |
| OpenAI | GPT-4o | Tier 5 ($1K+) | 10,000 | 30,000,000 | — | 429 | Retry-After |
| OpenAI | GPT-4o-mini | Tier 1 | 500 | 200,000 | 10,000 | 429 | Retry-After |
| OpenAI | GPT-4o-mini | Tier 5 | 30,000 | 150,000,000 | — | 429 | Retry-After |
| OpenAI | o1 | Tier 1 | 500 | 30,000 | — | 429 | Retry-After |
| Google | Gemini Pro | Free | 15 | 32,000 | 1,500 | 429 | Retry-After |
| Google | Gemini Pro | Pay-as-you-go | 360 | 120,000 | 30,000 | 429 | Retry-After |
| Google | Gemini Ultra | Pay-as-you-go | 60 | 60,000 | — | 429 | Retry-After |
| Mistral | Mistral Large | Free | 1 | 500,000 | — | 429 | Retry-After |
| Mistral | Mistral Large | Paid | 5 | 2,000,000 | — | 429 | Retry-After |
| Mistral | Mistral Small | Paid | 5 | 2,000,000 | — | 429 | Retry-After |
| Cohere | Command R+ | Trial | 20 | — | 1,000 | 429 | X-RateLimit-Reset |
| Cohere | Command R+ | Production | 10,000 | — | — | 429 | X-RateLimit-Reset |
| Cohere | Command R | Trial | 20 | — | 1,000 | 429 | X-RateLimit-Reset |
| Amazon | Bedrock (Claude) | On-Demand | Region-based | Region-based | — | 429 | Retry-After |
| Amazon | Bedrock (Claude) | Provisioned | Custom | Custom | — | 429 | Retry-After |
| Azure | OpenAI Service | Standard | Deployment-based | Deployment-based | — | 429 | Retry-After |
| Azure | OpenAI Service | Provisioned | PTU-based | PTU-based | — | 429 | Retry-After |
Key Findings
OpenAI's tier system offers the most aggressive scaling — Tier 5 users get 150M TPM for GPT-4o-mini (30M TPM for GPT-4o), dwarfing all other providers. Anthropic's Scale tier provides strong throughput (4K RPM, 400K TPM) with a simpler tier progression. Google Gemini's free tier is the most generous for prototyping at 15 RPM and 1,500 RPD. Cohere's production tier stands out for its extremely high RPM (10K) with no token-per-minute caps. Stack Overflow data shows rate limiting is a top developer concern across all APIs, with GitHub alone generating 94K+ views on rate limit questions.
Rate Limit Strategies from Developer Community
Analysis of 20 Stack Overflow threads (250K+ combined views) reveals the most common rate limit challenges: GitHub API limits reset timing (8.5K views), queuing requests with Retrofit (18.9K views), and OkHttp interceptor-based throttling (7.8K views). The most upvoted solutions consistently recommend exponential backoff with jitter, client-side token buckets, and response caching as the three essential strategies for handling rate limits in production.
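The first of those three strategies — exponential backoff with jitter — can be sketched in a few lines. This is an illustrative helper, not any provider's SDK; the function name, defaults, and "full jitter" variant (delay drawn uniformly from zero up to the exponential bound) are choices made here for the example:

```python
import random

def backoff_delays(max_retries=5, base=1.0, cap=30.0):
    """Yield retry delays using exponential backoff with full jitter.

    The delay for attempt n is drawn uniformly from
    [0, min(cap, base * 2**n)], so retries spread out over time
    instead of arriving in synchronized waves (thundering herd).
    """
    for attempt in range(max_retries):
        yield random.uniform(0.0, min(cap, base * 2 ** attempt))
```

A caller would sleep for each yielded delay between retry attempts; the `cap` keeps worst-case waits bounded even after many failures.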
Frequently Asked Questions
What are the Anthropic Claude API rate limits?
Anthropic's Claude API rate limits vary by tier. Free tier: 5 RPM, 25K tokens per minute. Build tier (credit-based): 50 RPM and 50K TPM for Opus, up to 2,000 RPM and 200K TPM for Haiku. Scale tier: 4,000 RPM, 400K TPM across all models. Enterprise limits are custom and negotiable based on your usage needs.
How do OpenAI rate limits compare to Anthropic?
OpenAI uses a spending-based tiered system. Tier 1 ($5+ spent): 500 RPM for GPT-4o, 30K TPM. Tier 5 ($1,000+ spent): 10,000 RPM, 30M TPM. Anthropic's Build tier is comparable to OpenAI's Tier 1-2, while Anthropic Scale matches OpenAI's Tier 4-5. OpenAI scales more aggressively at the top tiers for smaller models.
What happens when you hit an API rate limit?
Most AI APIs return HTTP 429 (Too Many Requests) with a Retry-After header indicating when to retry. OpenAI and Anthropic both use this standard pattern. Best practice is to implement exponential backoff: wait 1s, then 2s, then 4s between retries, adding random jitter to prevent thundering herd. Some APIs also return X-RateLimit-Remaining headers so you can proactively throttle.
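The 429-plus-Retry-After pattern above translates into a small retry wrapper. This is a generic sketch: `send_request` is a hypothetical callable returning an object with `.status_code` and `.headers` (the shape of a `requests.Response`), not a specific provider's client:

```python
import time

def call_with_retry(send_request, max_retries=5):
    """Retry a request on HTTP 429, honoring the Retry-After header.

    `send_request` is any zero-argument callable returning a
    response-like object with `.status_code` and `.headers`.
    """
    for attempt in range(max_retries):
        response = send_request()
        if response.status_code != 429:
            return response
        # Prefer the server's Retry-After hint; otherwise fall back
        # to exponential backoff (1s, 2s, 4s, ...).
        retry_after = response.headers.get("Retry-After")
        delay = float(retry_after) if retry_after else float(2 ** attempt)
        time.sleep(delay)
    raise RuntimeError(f"still rate limited after {max_retries} attempts")
```

In production you would add jitter to the fallback delay, as noted above, and log each retry for observability.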
Which AI API has the most generous free tier?
Google Gemini offers the most generous free tier at 15 RPM for Gemini Pro and 1,500 requests per day with a 32K TPM limit. Anthropic's free tier offers 5 RPM with 25K TPM. Mistral offers 1 RPM but with a very high 500K TPM limit. For hobby projects and prototyping, Google Gemini's free tier provides the most practical headroom for testing and development.
How can I handle rate limiting in production applications?
Implement these strategies: (1) Token bucket or leaky bucket rate limiters in your client code, (2) Exponential backoff with jitter on 429 responses, (3) Request queuing to smooth burst traffic, (4) Caching responses for identical or near-identical requests, (5) Load balancing across multiple API keys or providers for redundancy. For critical applications, consider provisioned throughput options from AWS Bedrock or Azure OpenAI.
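Strategy (1) above, a client-side token bucket, can be sketched as follows. This is a minimal single-threaded illustration (the class name and parameters are choices made for this example; a production version would need locking for concurrent use):

```python
import time

class TokenBucket:
    """Client-side token bucket: allow `rate` requests per second,
    with bursts of up to `capacity` requests."""

    def __init__(self, rate, capacity):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start full
        self.last = time.monotonic()

    def try_acquire(self, tokens=1):
        """Refill based on elapsed time, then try to spend tokens.

        Returns True if the request may proceed, False if the
        caller should throttle (queue, sleep, or drop).
        """
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= tokens:
            self.tokens -= tokens
            return True
        return False
```

Sizing the bucket just below your tier's RPM limit lets you absorb bursts locally instead of burning retries against the provider's 429 responses.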