Cost Tracking

yoyo estimates the cost of each interaction so you can monitor spending.

Per-turn costs

After each response, you'll see a compact token summary:

  ↳ 3.2s · 1523→842 tokens · $0.0234

With --verbose (or -v), you get the full breakdown:

  tokens: 1523 in / 842 out  [cache: 1000 read, 500 write]  (session: 4200 in / 2100 out)  cost: $0.0234  total: $0.0567  ⏱ 3.2s

cost — estimated cost for this turn
total — estimated cumulative cost for the session

Quick cost check

Use /cost for a quick overview with a breakdown by cost category:

  Session cost: $0.0567
    4.2k in / 2.1k out
    cache: 1.0k read / 500 write

    Breakdown:
      input:       $0.0126
      output:      $0.0315
      cache write: $0.0031
      cache read:  $0.0005

Detailed breakdown

Use /tokens to see a full breakdown including cache usage:

  Session totals:
    input:       120.5k tokens
    output:      45.2k tokens
    cache read:  30.0k tokens
    cache write: 15.0k tokens
    est. cost:   $0.892

Supported models

Costs are estimated based on published pricing for all major providers:

Anthropic

Model	Input	Cache Write	Cache Read	Output
Opus 4.5/4.6	$5/MTok	$6.25/MTok	$0.50/MTok	$25/MTok
Opus 4/4.1	$15/MTok	$18.75/MTok	$1.50/MTok	$75/MTok
Sonnet	$3/MTok	$3.75/MTok	$0.30/MTok	$15/MTok
Haiku 4.5	$1/MTok	$1.25/MTok	$0.10/MTok	$5/MTok
Haiku 3.5	$0.80/MTok	$1/MTok	$0.08/MTok	$4/MTok

OpenAI

Model	Input	Output
GPT-4.1	$2/MTok	$8/MTok
GPT-4.1 Mini	$0.40/MTok	$1.60/MTok
GPT-4.1 Nano	$0.10/MTok	$0.40/MTok
GPT-4o	$2.50/MTok	$10/MTok
GPT-4o Mini	$0.15/MTok	$0.60/MTok
o3	$2/MTok	$8/MTok
o3-mini	$1.10/MTok	$4.40/MTok
o4-mini	$1.10/MTok	$4.40/MTok

Google

Model	Input	Output
Gemini 2.5 Pro	$1.25/MTok	$10/MTok
Gemini 2.5 Flash	$0.15/MTok	$0.60/MTok
Gemini 2.0 Flash	$0.10/MTok	$0.40/MTok

DeepSeek

Model	Input	Output
DeepSeek Chat/V3	$0.27/MTok	$1.10/MTok
DeepSeek Reasoner/R1	$0.55/MTok	$2.19/MTok

Mistral

Model	Input	Output
Mistral Large	$2/MTok	$6/MTok
Mistral Small	$0.10/MTok	$0.30/MTok
Codestral	$0.30/MTok	$0.90/MTok

xAI (Grok)

Model	Input	Output
Grok 3	$3/MTok	$15/MTok
Grok 3 Mini	$0.30/MTok	$0.50/MTok
Grok 2	$2/MTok	$10/MTok

Groq (hosted models)

Model	Input	Output
Llama 3.3 70B	$0.59/MTok	$0.79/MTok
Llama 3.1 8B	$0.05/MTok	$0.08/MTok
Mixtral 8x7B	$0.24/MTok	$0.24/MTok
Gemma2 9B	$0.20/MTok	$0.20/MTok

MTok = million tokens.

OpenRouter

Models accessed through OpenRouter (e.g., anthropic/claude-sonnet-4-20250514) are automatically recognized — the provider prefix is stripped before matching.

Limitations

Cost estimates are approximate — actual billing may differ slightly
For unrecognized models, no cost estimate is shown
Cache read/write costs only apply to Anthropic models; other providers show zero cache costs
Pricing may change — check your provider's pricing page for the latest rates

Keeping costs down

Use smaller models (Haiku, Sonnet, GPT-4.1 Mini, Gemini Flash) for simple tasks
Use /compact to reduce context size (fewer input tokens per turn)
Use single-prompt mode (-p) for quick questions to avoid accumulating context
Turn off extended thinking for routine tasks

yoyo documentation