Prompt Caching & Token Pricing
Some models support prompt caching, which can reduce costs when you reuse the same content.
How it works:
- You can mark parts of your prompt to be cached
- Cached content is stored for 5 minutes (or 1 hour with a paid option)
- When you reuse cached content, you pay less (cache reads are cheaper than regular input)
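To make the idea concrete, here is a rough sketch of what marking a prompt segment for caching might look like. The field names below ("cache", "ttl") are hypothetical, not a documented API; check your model provider's reference for the actual syntax.

```python
# Hypothetical request body: the "cache" field is illustrative, not a real API.
request = {
    "model": "example-model",
    "messages": [
        {
            "role": "system",
            # Long, reusable content (instructions, reference documents) is the
            # part worth caching, since it is identical across requests.
            "content": "Long, reusable system instructions and reference text...",
            "cache": {"ttl": "5m"},  # hypothetical marker: cache this block for 5 minutes
        },
        {
            "role": "user",
            "content": "The short part that changes on every request.",
        },
    ],
}
```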
Token Pricing Structure
- Base input tokens: Regular price per million tokens
- Cache write (5 min): 1.25x base input price (25% more expensive)
- Cache write (1 hour): 2x base input price (100% more expensive)
- Cache read: 0.1x base input price (90% cheaper!)
- Output tokens: Usually more expensive than input tokens
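As a minimal sketch of how these multipliers interact, assuming the ratios listed above (1.25x or 2x for cache writes, 0.1x for cache reads), the following helper estimates the cost of caching a prompt segment once and reusing it several times:

```python
# Assumes the multipliers listed above: cache write 1.25x (5 min) or 2x (1 hour),
# cache read 0.1x of the base input price. Prices are per million tokens (MTok).
def cached_prompt_cost(base_input_price: float,
                       cached_tokens: int,
                       reuses: int,
                       one_hour_cache: bool = False) -> float:
    """Cost of writing a segment to the cache once, then reading it `reuses` times."""
    write_multiplier = 2.0 if one_hour_cache else 1.25
    write_cost = cached_tokens / 1_000_000 * base_input_price * write_multiplier
    read_cost = reuses * cached_tokens / 1_000_000 * base_input_price * 0.1
    return write_cost + read_cost

# Example: 10,000 cached tokens at $3/MTok, reused 20 times within 5 minutes.
# Uncached: 21 requests × 10,000/1e6 × $3       = $0.63
# Cached:   (1.25 + 20 × 0.1) × 10,000/1e6 × $3 = $0.0975
print(cached_prompt_cost(3.0, 10_000, 20))
```

In this example caching cuts the input-token cost by roughly 85%, but the savings only apply when the cached segment is actually reused within its lifetime.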
Rate Limit Scaling
Your rate limits can scale based on your total purchased credits (permanent credits spending). The scaling applies a multiplier to your base tier rate limit.
Scaling Tiers
| Total Spent | Scale Tier | Multiplier | Example (Pro Tier) |
|---|---|---|---|
| $0-$99 | Base | 1x (no scaling) | 30 requests/min |
| $100+ | Tier 1 | 2x | 60 requests/min (30 × 2) |
| $250+ | Tier 2 | 3x | 90 requests/min (30 × 3) |
| $500+ | Tier 3 | 5x | 150 requests/min (30 × 5) |
| $1,000+ | Tier 4 | 7x | 210 requests/min (30 × 7) |
| $2,500+ | Tier 5 | 10x | 300 requests/min (30 × 10) |
How it works:
- Your base rate limit depends on your subscription tier (e.g., Pro tier = 30 requests/min)
- The multiplier is applied to your base limit based on your total permanent credits spending
- Example: If you’re on the Pro tier (30 requests/min) and have spent $500+ on permanent credits, your rate limit becomes 150 requests/min (30 × 5); see the sketch below
Base Rate Limits by Tier
- Free: 5 requests/min
- Starter: 10 requests/min
- Plus: 15 requests/min
- Core: 20 requests/min
- Pro: 30 requests/min
- Business: 60 requests/min
- Enterprise: 100 requests/min
Note: Rate limit scaling applies to all subscription tiers based on your total permanent credits spending. Spending accumulates over time, and you receive the single highest multiplier you qualify for.
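Putting the base limits and the spend multipliers together, a minimal sketch of the scaling rule described above might look like this:

```python
# Sketch of the scaling rule described above: base limit per subscription tier,
# multiplied by the highest spend tier your total permanent credits reach.
BASE_LIMITS = {  # requests per minute
    "Free": 5, "Starter": 10, "Plus": 15, "Core": 20,
    "Pro": 30, "Business": 60, "Enterprise": 100,
}

# (minimum total spent in $, multiplier); the highest threshold reached wins.
SPEND_TIERS = [(2500, 10), (1000, 7), (500, 5), (250, 3), (100, 2), (0, 1)]

def rate_limit(subscription_tier: str, total_spent: float) -> int:
    base = BASE_LIMITS[subscription_tier]
    multiplier = next(m for threshold, m in SPEND_TIERS if total_spent >= threshold)
    return base * multiplier

print(rate_limit("Pro", 500))    # 150 requests/min (30 × 5)
print(rate_limit("Free", 1200))  # 35 requests/min  (5 × 7)
```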
Calculating Credit Costs
Cost Calculation
Credits are calculated based on:
- Token-based models: Price per million tokens × tokens used
- Fixed-price models: Fixed cost per request (like image generation)
Example Calculation
- Model costs $3 per million input tokens and $15 per million output tokens
- You use 1,000 input tokens and 500 output tokens
- Cost = (1,000 / 1,000,000 × $3) + (500 / 1,000,000 × $15) = $0.003 + $0.0075 = $0.0105 credits
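The same calculation expressed as a small helper (using the example prices above):

```python
# Reproduces the worked example above. Prices are per million tokens (MTok).
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_mtok: float, output_price_per_mtok: float) -> float:
    return (input_tokens / 1_000_000 * input_price_per_mtok
            + output_tokens / 1_000_000 * output_price_per_mtok)

print(request_cost(1_000, 500, 3.0, 15.0))  # ≈ 0.0105 credits
```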
Understanding Tokens
What are Tokens?
Tokens are the units that AI models use to process text. When you send a message to an AI model:
- Input tokens = the text you send (your prompt/question)
- Output tokens = the text the AI generates (the response)
For example, if you ask “What is the weather?” and the AI responds “The weather is sunny today”, both your question and the response are counted as tokens. The cost of using a model depends on how many tokens you use.
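Token counts depend on the specific model's tokenizer, so any numbers here are only illustrative. The sketch below assumes a tiktoken-compatible tokenizer, which is an assumption for illustration rather than a statement about any particular model:

```python
# Illustration only: real counts depend on the model's own tokenizer.
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")
prompt = "What is the weather?"
response = "The weather is sunny today"

print(len(enc.encode(prompt)))    # input tokens counted for the question
print(len(enc.encode(response)))  # output tokens counted for the response
```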
Key Points
- Tokens are priced per million tokens (MTok)
- Different models have different token prices
- Input tokens and output tokens are priced separately (output is usually more expensive)
- Some models support prompt caching, which can reduce costs for repeated content
