This section covers advanced billing features and optimizations.

Prompt Caching & Token Pricing

Some models support prompt caching, which can reduce costs when you reuse the same content.
How it works:
  • You can mark parts of your prompt to be cached
  • Cached content is stored for 5 minutes by default, or for 1 hour with the higher-priced cache write option
  • When you reuse cached content, you pay less (cache reads are cheaper than regular input)

Token Pricing Structure

  • Base input tokens: Regular price per million tokens
  • Cache write (5 min): 1.25x base input price (25% more expensive)
  • Cache write (1 hour): 2x base input price (100% more expensive)
  • Cache read: 0.1x base input price (90% cheaper!)
  • Output tokens: Usually more expensive than input tokens
When to use it:
  • When you have large, repeated context (like a long document)
  • When you’re asking multiple questions about the same content
  • When you want to reduce costs for repeated prompts
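To make the trade-off concrete, here is a minimal sketch that compares the input cost of sending the same context several times with and without caching. It uses the multipliers listed above. The function name, the $3 per million base price, and the assumption that the cache never expires between requests (one write, then only reads) are illustrative and do not describe a specific model or API.

```python
# Multipliers from the pricing list above, relative to the base input price.
CACHE_WRITE_5MIN = 1.25
CACHE_WRITE_1HOUR = 2.0
CACHE_READ = 0.1


def repeated_context_cost(context_tokens, requests, base_price_per_mtok,
                          write_multiplier=CACHE_WRITE_5MIN, use_cache=True):
    """Input cost of sending the same context `requests` times.

    Assumption: with caching, the first request pays the cache write price
    and every later request pays the cache read price (no expiry in between).
    """
    # Cost of sending the context once at the regular input price.
    per_send = context_tokens / 1_000_000 * base_price_per_mtok
    if not use_cache or requests == 0:
        return per_send * requests
    return per_send * write_multiplier + per_send * CACHE_READ * (requests - 1)


# Hypothetical scenario: a 100,000-token document queried 10 times
# on a model with a base input price of $3 per million tokens.
uncached = repeated_context_cost(100_000, 10, 3.0, use_cache=False)
cached = repeated_context_cost(100_000, 10, 3.0)
print(f"uncached: ${uncached:.4f}, cached: ${cached:.4f}")
```

Under these assumptions the cached run costs about $0.65 instead of $3.00. A single request is more expensive with caching than without, because the write surcharge is only recovered by later reads.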

Rate Limit Scaling

Your rate limits scale with the total amount you have spent on permanent credits. A multiplier based on that total is applied to the base rate limit of your subscription tier:

Scaling Tiers

Total Spent | Scale Tier | Multiplier | Example (Pro Tier)
$0-$99 | Base | 1x (no scaling) | 30 requests/min
$100+ | Tier 1 | 2x | 60 requests/min (30 × 2)
$250+ | Tier 2 | 3x | 90 requests/min (30 × 3)
$500+ | Tier 3 | 5x | 150 requests/min (30 × 5)
$1,000+ | Tier 4 | 7x | 210 requests/min (30 × 7)
$2,500+ | Tier 5 | 10x | 300 requests/min (30 × 10)
How it works:
  • Your base rate limit depends on your subscription tier (e.g., Pro tier = 30 requests/min)
  • The multiplier applied to your base limit is determined by your total spending on permanent credits
  • Example: If you’re on Pro tier (30 req/min) and have spent $500+ on permanent credits, your rate limit becomes 150 requests/min (30 × 5)

Base Rate Limits by Tier

  • Free: 5 requests/min
  • Starter: 10 requests/min
  • Plus: 15 requests/min
  • Core: 20 requests/min
  • Pro: 30 requests/min
  • Business: 60 requests/min
  • Enterprise: 100 requests/min
Note: Rate limit scaling applies to all tiers and is based on your cumulative spending on permanent credits. Multipliers do not stack: only the highest multiplier you qualify for is applied.
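The lookup described above can be sketched as a small function. The base limits and spending thresholds are copied from the lists in this section. The names `BASE_LIMITS`, `SCALING_TIERS`, and `effective_rate_limit` are made up for illustration and are not part of any official API.

```python
# Base requests/min per subscription tier, from the list above.
BASE_LIMITS = {
    "free": 5,
    "starter": 10,
    "plus": 15,
    "core": 20,
    "pro": 30,
    "business": 60,
    "enterprise": 100,
}

# (minimum total spent in dollars, multiplier), highest threshold first.
SCALING_TIERS = [
    (2500, 10),
    (1000, 7),
    (500, 5),
    (250, 3),
    (100, 2),
]


def effective_rate_limit(tier, total_spent):
    """Return requests/min for a tier after applying the spending multiplier."""
    base = BASE_LIMITS[tier.lower()]
    # Only the highest qualifying multiplier applies; multipliers do not stack.
    for threshold, multiplier in SCALING_TIERS:
        if total_spent >= threshold:
            return base * multiplier
    return base  # under $100 spent: no scaling


print(effective_rate_limit("pro", 500))   # 150, matching the Pro example above
print(effective_rate_limit("free", 120))  # 10 (5 × 2)
```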

Calculating Credit Costs

Cost Calculation

Credits are calculated based on:
  • Token-based models: Price per million tokens × tokens used
  • Fixed-price models: Fixed cost per request (like image generation)
Check the model pricing in the dashboard or API documentation for specific costs.

Example Calculation

  • Model costs $3 per million input tokens and $15 per million output tokens
  • You use 1,000 input tokens and 500 output tokens
  • Cost = (1,000 / 1,000,000 × $3) + (500 / 1,000,000 × $15) = $0.003 + $0.0075 = $0.0105 in credits
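The same calculation can be written as a short helper, which is handy for estimating costs before sending a request. This is a sketch only: the function name and parameters are illustrative, and the prices are the example values above, not the prices of a particular model.

```python
def token_cost(input_tokens, output_tokens,
               input_price_per_mtok, output_price_per_mtok):
    """Cost of one request for a token-based model, priced per million tokens."""
    input_cost = input_tokens / 1_000_000 * input_price_per_mtok
    output_cost = output_tokens / 1_000_000 * output_price_per_mtok
    return input_cost + output_cost


# Example values from above: $3 / $15 per million tokens,
# 1,000 input tokens and 500 output tokens.
cost = token_cost(1_000, 500, 3.0, 15.0)
print(f"${cost:.4f}")  # $0.0105
```

For fixed-price models no such calculation is needed, since the cost is the listed price per request.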

Understanding Tokens

What are Tokens?

Tokens are the units that AI models use to process text. When you send a message to an AI model:
  • Input tokens = the text you send (your prompt/question)
  • Output tokens = the text the AI generates (the response)
For example, if you ask “What is the weather?” and the AI responds “The weather is sunny today”, both your question and the response are counted as tokens. The cost of using a model depends on how many tokens you use.

Key Points

  • Tokens are priced per million tokens (MTok)
  • Different models have different token prices
  • Input tokens and output tokens are priced separately (output is usually more expensive)
  • Some models support prompt caching, which can reduce costs for repeated content