yoyo

yoyo is a coding agent that runs in your terminal. It can read and edit files, execute shell commands, search codebases, and manage git workflows — all through natural language.

yoyo is open-source, written in Rust, and built on yoagent. It started as ~200 lines and evolves itself one commit at a time.

What yoyo can do

  • Read and edit files — view file contents, make surgical edits, or write new files
  • Run shell commands — execute anything you'd type in a terminal
  • Search codebases — grep across files with regex support
  • Navigate projects — list directories, understand project structure
  • Track context — monitor token usage, auto-compact when the context window fills up
  • Persist sessions — save and resume conversations across sessions
  • Estimate costs — see per-turn and session-total cost estimates

Quick example

export ANTHROPIC_API_KEY=sk-ant-...
cargo install yoyo-agent  # or: cargo run from source

yoyo

Then just talk to it:

> read src/main.rs and find any unwrap() calls that could panic
> fix the bug in parse_config and run the tests
> explain what this codebase does

What makes yoyo different

yoyo is not a product — it's a process. It evolves itself in public. Every improvement is a git commit. Every session is journaled. You can read its source code, its journal, and its identity.

Current version: v0.1.4

Installation

Requirements

  • Rust toolchain — install from rustup.rs
  • An API key — from any supported provider (see Providers below)

Install from crates.io

cargo install yoyo-agent

This installs the binary as yoyo in your PATH.

Install from source

git clone https://github.com/yologdev/yoyo-evolve.git
cd yoyo-evolve
cargo build --release

The binary will be at target/release/yoyo.

Run directly with Cargo

If you just want to try it:

cd yoyo-evolve
ANTHROPIC_API_KEY=sk-ant-... cargo run

Providers

yoyo supports multiple AI providers out of the box. Use the --provider flag to select one:

  Provider             Flag                    Default Model                        Env Var
  Anthropic (default)  --provider anthropic    claude-opus-4-6                      ANTHROPIC_API_KEY
  OpenAI               --provider openai       gpt-4o                               OPENAI_API_KEY
  Google/Gemini        --provider google       gemini-2.0-flash                     GOOGLE_API_KEY
  OpenRouter           --provider openrouter   anthropic/claude-sonnet-4-20250514   OPENROUTER_API_KEY
  xAI                  --provider xai          grok-3                               XAI_API_KEY
  Groq                 --provider groq         llama-3.3-70b-versatile              GROQ_API_KEY
  DeepSeek             --provider deepseek     deepseek-chat                        DEEPSEEK_API_KEY
  Mistral              --provider mistral      mistral-large-latest                 MISTRAL_API_KEY
  Cerebras             --provider cerebras     llama-3.3-70b                        CEREBRAS_API_KEY
  Ollama               --provider ollama       llama3.2                             (none needed)
  Custom               --provider custom       (none)                               (none needed)

Ollama and custom providers don't require an API key. yoyo will automatically connect to http://localhost:11434/v1 for Ollama or http://localhost:8080/v1 for custom providers. Override the endpoint with --base-url.
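
As a sketch of that default-endpoint behavior — this is hypothetical illustration code, not yoyo's actual implementation:

```rust
// Default local endpoints from the docs above; --base-url would override
// these. Hypothetical helper, not yoyo's real code.
fn default_base_url(provider: &str) -> Option<&'static str> {
    match provider {
        "ollama" => Some("http://localhost:11434/v1"),
        "custom" => Some("http://localhost:8080/v1"),
        _ => None, // hosted providers use their own API endpoints
    }
}

fn main() {
    assert_eq!(default_base_url("ollama"), Some("http://localhost:11434/v1"));
    assert_eq!(default_base_url("anthropic"), None);
}
```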

Examples:

# Anthropic (default)
ANTHROPIC_API_KEY=sk-ant-... yoyo

# OpenAI
OPENAI_API_KEY=sk-... yoyo --provider openai

# Google Gemini
GOOGLE_API_KEY=... yoyo --provider google

# Local Ollama (no API key needed)
yoyo --provider ollama --model llama3.2

# Custom OpenAI-compatible endpoint
yoyo --provider custom --base-url http://localhost:8080/v1 --model my-model

Set your API key

yoyo resolves your API key in this order:

  1. --api-key CLI flag (highest priority)
  2. Provider-specific environment variable (e.g., OPENAI_API_KEY for --provider openai)
  3. ANTHROPIC_API_KEY environment variable (fallback)
  4. API_KEY environment variable (generic fallback)
  5. api_key in config file (see below)

Set one of them:

# Via environment variable (recommended)
export ANTHROPIC_API_KEY=sk-ant-api03-...

# Or pass directly
yoyo --api-key sk-ant-api03-...

If no key is found via any method (and the provider requires one), yoyo will exit with an error message explaining what to do.
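
The precedence above is a simple fall-through, which can be sketched like this. The function and parameter names are illustrative assumptions, not yoyo's internals:

```rust
// Illustrative sketch of the documented key-resolution order.
// Names are hypothetical, not yoyo's actual code.
fn resolve_api_key(
    cli_flag: Option<&str>,      // 1. --api-key
    provider_env: Option<&str>,  // 2. e.g. OPENAI_API_KEY
    anthropic_env: Option<&str>, // 3. ANTHROPIC_API_KEY fallback
    generic_env: Option<&str>,   // 4. API_KEY fallback
    config_key: Option<&str>,    // 5. api_key from the config file
) -> Option<String> {
    cli_flag
        .or(provider_env)
        .or(anthropic_env)
        .or(generic_env)
        .or(config_key)
        .map(str::to_string)
}

fn main() {
    // The CLI flag wins even when lower-priority sources are set.
    let key = resolve_api_key(Some("flag-key"), Some("env-key"), None, None, Some("cfg-key"));
    assert_eq!(key.as_deref(), Some("flag-key"));
}
```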

Config file

yoyo supports a TOML config file so you don't have to pass flags every time. Config files are checked in this order (first found wins):

  1. .yoyo.toml in the current directory (project-level)
  2. ~/.yoyo.toml (home directory shorthand)
  3. ~/.config/yoyo/config.toml (XDG user-level)

Example .yoyo.toml:

# Model and provider
model = "claude-sonnet-4-20250514"
provider = "anthropic"
thinking = "medium"

# API key (env vars take priority over this)
api_key = "sk-ant-api03-..."

# Generation settings
max_tokens = 8192
max_turns = 50
temperature = 0.7

# Custom endpoint (for ollama, proxies, etc.)
# base_url = "http://localhost:11434/v1"

# Permission rules for bash commands
[permissions]
allow = ["git *", "cargo *", "echo *"]
deny = ["rm -rf *", "sudo *"]

# Directory restrictions for file tools
[directories]
allow = ["./src", "./tests"]
deny = ["~/.ssh", "/etc"]

CLI flags always override config file values. For example, --model gpt-4o overrides model = "claude-sonnet-4-20250514" from the config file.
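
The "first found wins" lookup can be sketched as below. The exists-check is injected as a closure so the example is self-contained; yoyo's real code reads the filesystem:

```rust
use std::path::PathBuf;

// Sketch of the documented config lookup order. Illustrative only.
fn config_candidates(cwd: &str, home: &str) -> Vec<PathBuf> {
    vec![
        PathBuf::from(cwd).join(".yoyo.toml"),                // 1. project-level
        PathBuf::from(home).join(".yoyo.toml"),               // 2. home shorthand
        PathBuf::from(home).join(".config/yoyo/config.toml"), // 3. XDG user-level
    ]
}

fn first_existing<F: Fn(&PathBuf) -> bool>(candidates: &[PathBuf], exists: F) -> Option<PathBuf> {
    candidates.iter().find(|p| exists(*p)).cloned()
}

fn main() {
    let cands = config_candidates("/work/proj", "/home/user");
    // Pretend only the XDG-level file exists.
    let found = first_existing(&cands, |p| p.ends_with("config.toml"));
    assert_eq!(found, Some(PathBuf::from("/home/user/.config/yoyo/config.toml")));
}
```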

For more details on model configuration, see Models. For thinking levels, see Thinking.

Quick Start

Once installed, start yoyo:

export ANTHROPIC_API_KEY=sk-ant-...
yoyo

Or pass the API key directly:

yoyo --api-key sk-ant-...

First time? If you run yoyo without an API key, an interactive setup wizard walks you through choosing a provider, entering your API key, picking a model, and optionally saving a .yoyo.toml config file. After setup, you go straight into the REPL — no restart needed. You can also run the wizard anytime with yoyo setup. If you prefer to skip it, set your API key environment variable first or press Ctrl+C to cancel.

You'll see a banner like this:

  yoyo v0.1.4 — a coding agent growing up in public
  Type /help for commands, /quit to exit

  model: claude-opus-4-6
  git:   main
  cwd:   /home/user/project

Your first prompt

Type a natural language request:

main > explain what this project does

yoyo will read files, run commands, and respond. You'll see tool executions as they happen:

  ▶ read README.md ✓
  ▶ ls src/ ✓
  ▶ read src/main.rs ✓

This project is a...

Common tasks

Read and explain code:

> read src/main.rs and explain the main function

Make changes:

> add error handling to the parse_config function in src/config.rs

Run commands:

> run the tests and fix any failures

Search a codebase:

> find all TODO comments in this project

Exiting

Type /quit, /exit, or press Ctrl+D.

Interactive Mode (REPL)

Interactive mode is the default when you run yoyo in a terminal. It gives you a read-eval-print loop where you can have a multi-turn conversation with the agent.

Starting

yoyo
# or
cargo run

The prompt

The prompt shows your current git branch (if you're in a git repo):

main 🐙 › _

If you're not in a git repo, you get a plain prompt:

🐙 › _

Line editing & history

yoyo uses rustyline for a full readline experience:

  • Arrow keys: Navigate within the current line (← →) and through command history (↑ ↓)
  • Inline hints: As you type a slash command, a dimmed suggestion appears after the cursor showing the completion and a short description — e.g. typing /he shows lp — Show help for commands. Press Tab or → to accept.
  • Tab completion: Type / and press Tab to see available slash commands with descriptions — each command is shown alongside a short summary of what it does. Partial matches work too — /he<Tab> suggests /help and /health. After typing a command + space, argument-aware completions kick in:
    • /model <Tab> — suggests known model names (Claude, GPT, Gemini, etc.)
    • /provider <Tab> — suggests known provider names (anthropic, openai, google, etc.)
    • /think <Tab> — suggests thinking levels (off, minimal, low, medium, high)
    • /git <Tab> — suggests git subcommands (status, log, add, diff, branch, stash)
    • /pr <Tab> — suggests PR subcommands (list, view, diff, comment, create, checkout)
    • /save <Tab> and /load <Tab> — suggest .json session files in the current directory
    • File paths also complete — type src/ma<Tab> to get src/main.rs, or Cargo<Tab> to get Cargo.toml. Directories complete with a trailing / for easy continued navigation.
  • History recall: Previous inputs are saved across sessions
  • Keyboard shortcuts: Ctrl-A (start of line), Ctrl-E (end of line), Ctrl-K (kill to end), Ctrl-W (delete word back)
  • History file: Stored at $XDG_DATA_HOME/yoyo/history (defaults to ~/.local/share/yoyo/history)

How it works

  1. You type a message
  2. yoyo sends it to the LLM along with conversation history
  3. The LLM may call tools (read files, run commands, etc.)
  4. Tool results are streamed back — you see each tool as it executes
  5. The final text response is printed
  6. Token usage and cost are shown after each turn

Tool output

When yoyo uses tools, you'll see status indicators:

  ▶ $ cargo test ✓ (2.1s)
  ▶ read src/main.rs ✓ (42ms)
  ▶ edit src/lib.rs ✓ (15ms)
  ▶ $ cargo test ✗ (1.8s)
  • ✓ means the tool succeeded
  • ✗ means the tool returned an error
  • The duration in parentheses shows how long the tool took

Token usage

After each response, you'll see a compact token summary:

  ↳ 3.2s · 1523→842 tokens · $0.0234

Use --verbose (or -v) for the full breakdown including session totals and cache info.

This shows:

  • Wall-clock time for the response
  • Input→output tokens for this turn
  • Estimated cost for this turn
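
The cost figure is simple arithmetic over per-million-token prices. A minimal sketch — the $3/$15 prices below are placeholders for illustration, not real provider pricing:

```rust
// Cost = input·price_in + output·price_out, with prices per million tokens.
// The concrete prices used here are placeholder assumptions.
fn turn_cost(input_tokens: u64, output_tokens: u64, price_in: f64, price_out: f64) -> f64 {
    (input_tokens as f64 * price_in + output_tokens as f64 * price_out) / 1_000_000.0
}

fn main() {
    // 1523 input and 842 output tokens, as in the sample summary above.
    let cost = turn_cost(1523, 842, 3.0, 15.0);
    assert!((cost - 0.017199).abs() < 1e-9);
}
```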

Interrupting

Press Ctrl+C to cancel the current response. The agent will stop and you can type a new prompt. Press Ctrl+C again to exit.

Inline @file mentions

You can reference files directly in your prompts using @path syntax. The file content is automatically read and injected into the conversation — no need for a separate /add command.

> explain @src/main.rs
  ✓ added src/main.rs (250 lines)
  (1 file inlined from @mentions)

> refactor @src/cli.rs:50-100
  ✓ added src/cli.rs (lines 50-100) (51 lines)
  (1 file inlined from @mentions)

> compare @Cargo.toml and @README.md
  ✓ added Cargo.toml (35 lines)
  ✓ added README.md (120 lines)
  (2 files inlined from @mentions)

How it works:

  • @path — injects the entire file
  • @path:start-end — injects a specific line range
  • If the path doesn't exist, the @mention is left as-is (it might be a username)
  • Email-like patterns (user@example.com) are not treated as file mentions
  • Images work too: @screenshot.png inlines the image into the conversation
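
The @path and @path:start-end forms can be parsed roughly as follows. This is an illustrative sketch, not yoyo's actual parser:

```rust
// Parse an @mention token into (path, optional line range).
// Returns None for non-mentions, e.g. email-like tokens. Illustrative only.
fn parse_mention(token: &str) -> Option<(String, Option<(usize, usize)>)> {
    let path = token.strip_prefix('@')?; // email addresses have no leading '@'
    if path.is_empty() || path.contains('@') {
        return None; // anything with an inner '@' is treated as email-like
    }
    match path.split_once(':') {
        Some((file, range)) => {
            let (start, end) = range.split_once('-')?;
            let start: usize = start.parse().ok()?;
            let end: usize = end.parse().ok()?;
            Some((file.to_string(), Some((start, end))))
        }
        None => Some((path.to_string(), None)),
    }
}

fn main() {
    assert_eq!(parse_mention("@src/main.rs"), Some(("src/main.rs".to_string(), None)));
    assert_eq!(parse_mention("user@example.com"), None);
}
```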

Single-Prompt Mode

Use --prompt or -p to run a single prompt without entering the REPL. yoyo will process the prompt, print the response, and exit.

Usage

yoyo --prompt "explain this codebase"
yoyo -p "find all TODO comments"

When to use it

Single-prompt mode is useful for:

  • Scripting — run yoyo as part of a larger workflow
  • Quick questions — get an answer without starting a session
  • CI/CD pipelines — automate code review or analysis

Example

$ yoyo -p "count the lines of Rust code in this project"
  ▶ $ find . -name '*.rs' | xargs wc -l ✓ (0.1s)

There are 1,475 lines of Rust code across 1 file (src/main.rs).

Combining with other flags

You can combine -p with other flags:

yoyo -p "review this diff" --model claude-sonnet-4-20250514
yoyo -p "explain the architecture" --thinking high
yoyo -p "analyze the code" --system "You are a security auditor."

Piped Mode

When stdin is not a terminal (i.e., input is piped), yoyo reads all of stdin as a single prompt, processes it, and exits. This works like single-prompt mode but takes input from a pipe instead of a flag.

Usage

echo "explain this code" | yoyo
cat prompt.txt | yoyo
git diff | yoyo

When to use it

Piped mode is useful for:

  • Passing file contents as part of the prompt
  • Chaining with other commands in a pipeline
  • Feeding structured input from scripts

Examples

Review a git diff:

git diff HEAD~1 | yoyo --system "Review this diff for bugs."

Analyze a file:

cat src/main.rs | yoyo --system "Find all potential panics in this Rust code."

Process command output:

cargo test 2>&1 | yoyo --system "Explain these test failures and suggest fixes."

Detection

yoyo detects piped mode automatically by checking if stdin is a terminal. If it is not, piped mode activates. If stdin is a terminal, interactive REPL mode starts instead.

If piped input is empty, yoyo exits with an error: No input on stdin.

Slash commands aren't dispatched in piped mode

Slash commands (/doctor, /status, /help, etc.) belong to the interactive REPL — they depend on REPL state that piped mode doesn't have. If you pipe a slash command into yoyo, it isn't executed; the text would only be sent to the model as a literal string, wasting a turn of tokens.

Instead, yoyo detects this case, prints a one-line warning to stderr, and exits with status code 2. Use one of these alternatives:

yoyo doctor                       # run the subcommand directly
yoyo --prompt "/doctor"           # send the literal text to the agent
yoyo                              # interactive REPL
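
The mode dispatch just described can be sketched like this. The error strings are illustrative (the real binary exits with status 1 or 2), and `choose_mode` is a hypothetical name:

```rust
use std::io::IsTerminal;

// Sketch of the documented dispatch: REPL when stdin is a terminal,
// otherwise treat piped input as a single prompt, rejecting empty input
// and slash commands. Illustrative only.
fn choose_mode(stdin_is_terminal: bool, piped_input: &str) -> Result<&'static str, &'static str> {
    if stdin_is_terminal {
        return Ok("repl");
    }
    let trimmed = piped_input.trim();
    if trimmed.is_empty() {
        Err("No input on stdin.")
    } else if trimmed.starts_with('/') {
        Err("slash commands are not dispatched in piped mode")
    } else {
        Ok("piped")
    }
}

fn main() {
    // In the real binary the flag comes from io::stdin().is_terminal().
    let tty = std::io::stdin().is_terminal();
    println!("stdin is a terminal: {tty}");

    assert_eq!(choose_mode(false, "explain this diff"), Ok("piped"));
    assert!(choose_mode(false, "/doctor").is_err());
}
```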

REPL Commands

All commands start with /. Type /help inside yoyo to see the full list.

Note: A few commands are also available as shell subcommands — run them directly without entering the REPL:

  Subcommand     Description
  yoyo help      Show help message (same as --help)
  yoyo version   Show version (same as --version)
  yoyo setup     Run the interactive setup wizard
  yoyo init      Generate a YOYO.md project context file
  yoyo doctor    Diagnose yoyo setup (config file, API key, provider, tool availability)
  yoyo health    Run project health checks (build, test, clippy, fmt — auto-detects project type)
  yoyo lint      Run project linter (e.g. yoyo lint --strict, yoyo lint unsafe)
  yoyo test      Run project test suite
  yoyo tree      Show project directory tree
  yoyo map       Show project symbol map
  yoyo run       Run a shell command (e.g. yoyo run cargo clippy)
  yoyo diff      Show git diff (e.g. yoyo diff --staged)
  yoyo commit    Commit staged changes (e.g. yoyo commit "fix typo")
  yoyo review    Show review prompt for staged changes or a file
  yoyo blame     Show git blame (e.g. yoyo blame src/main.rs:1-20)
  yoyo grep      Search files for a pattern (e.g. yoyo grep TODO src/)
  yoyo find      Find files by name (e.g. yoyo find main)
  yoyo index     Build and display project index
  yoyo update    Check for and install the latest yoyo release
  yoyo docs      Look up docs.rs documentation (e.g. yoyo docs serde)
  yoyo watch     Toggle watch mode (e.g. yoyo watch cargo test)
  yoyo status    Show version, git branch, and working directory
  yoyo undo      Undo changes (e.g. yoyo undo --last-commit)

doctor honors --provider and --model if you want to point it at a non-default setup (e.g. yoyo doctor --provider openai). Inside the REPL, the same checks are available as /doctor and /health.

  Command          Description
  /quit, /exit     Exit yoyo
  /help            Show available commands
  /help <command>  Show detailed help for a specific command

Conversation

  Command          Description
  /clear           Clear conversation history and start fresh
  /compact         Compress conversation to save context space (see Context Management)
  /retry           Re-send your last input — useful when a response gets cut off or you want to try again
  /history         Show a summary of all messages in the conversation
  /search <query>  Search conversation history for messages containing the query (case-insensitive)
  /mark <name>     Bookmark the current conversation state
  /jump <name>     Restore conversation to a bookmark (discards messages after it)
  /marks           List all saved bookmarks

Conversation bookmarks

The /mark and /jump commands let you bookmark points in your conversation and return to them later. This is useful when exploring different approaches — bookmark a good state, try something, and jump back if it doesn't work out.

> /mark before-refactor
  ✓ bookmark 'before-refactor' saved (12 messages)

> ... try something risky ...

> /jump before-refactor
  ✓ jumped to bookmark 'before-refactor' (12 messages)

> /marks
  Saved bookmarks:
    • before-refactor

Bookmarks are stored in memory for the current session. Overwriting a bookmark with the same name updates it. Jumping to a bookmark restores the conversation to exactly that point — any messages added after the bookmark are discarded.
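
Conceptually, a bookmark is just a saved message count, and jumping truncates the transcript back to it. A minimal sketch of that idea — illustrative, not yoyo's actual data structures:

```rust
use std::collections::HashMap;

// Session-only bookmarks: /mark records the current message count,
// /jump truncates the transcript back to it. Illustrative sketch.
#[derive(Default)]
struct Bookmarks {
    marks: HashMap<String, usize>,
}

impl Bookmarks {
    fn mark(&mut self, name: &str, message_count: usize) {
        // Re-marking the same name simply overwrites it.
        self.marks.insert(name.to_string(), message_count);
    }

    fn jump(&self, name: &str, messages: &mut Vec<String>) -> bool {
        match self.marks.get(name) {
            Some(&n) if n <= messages.len() => {
                messages.truncate(n); // discard everything after the bookmark
                true
            }
            _ => false,
        }
    }
}

fn main() {
    let mut marks = Bookmarks::default();
    let mut msgs: Vec<String> = (0..12).map(|i| format!("msg {i}")).collect();

    marks.mark("before-refactor", msgs.len());
    msgs.push("risky change".into());
    msgs.push("it broke".into());

    assert!(marks.jump("before-refactor", &mut msgs));
    assert_eq!(msgs.len(), 12);
}
```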

Model, Provider & Thinking

  Command           Description
  /model <name>     Switch to a different model (preserves conversation)
  /provider <name>  Switch provider and reset model to the provider's default
  /think [level]    Show or change thinking level: off, minimal, low, medium, high
  /teach [on|off]   Toggle teach mode — yoyo explains its reasoning as it works

Examples:

/model claude-sonnet-4-20250514
/provider openai
/provider google
/think high
/think off

The /model command preserves conversation when switching models. The /provider command switches to a different API provider (e.g., anthropic, openai, google, openrouter, ollama, xai, groq, deepseek, mistral, cerebras, custom) and automatically sets the model to the provider's default. Use /provider without arguments to see the current provider and available options. The /think command adjusts the thinking level.

The /teach command toggles teach mode on or off. When teach mode is active, yoyo explains why it's making each change before showing code, uses clear and readable patterns, adds comments on non-obvious lines, and summarizes what you should learn after completing a task. Great for learning while the agent codes. This is a session-only toggle — it resets when you exit.

Session

  Command       Description
  /save [path]  Save conversation to a file (default: yoyo-session.json)
  /load [path]  Load conversation from a file (default: yoyo-session.json)

See Session Persistence for details.

Information

  Command         Description
  /status         Show current model, git branch, working directory, and session token totals
  /tokens         Show detailed token usage: context window fill level, session totals, and estimated cost
  /cost           Show estimated session cost
  /changelog [N]  Show recent git commit history (default: 15, max: 100)
  /config         Show all current settings
  /config show    Show loaded config file path and merged key-value pairs (secrets masked)
  /config edit    Open config file in $EDITOR
  /hooks          Show active hooks (pre/post tool execution)
  /permissions    Show active security and permission configuration
  /version        Show yoyo version

The /tokens command shows a visual progress bar of your active context:

  Active context:
    messages:    12
    current:     45.2k / 200.0k tokens
    █████████░░░░░░░░░░░ 23%
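
Rendering a fill bar like that is a small proportional calculation. A sketch — the cell count and glyphs mirror the sample, but the rounding behavior is an assumption:

```rust
// Context-fill bar: `cells` total cells, filled proportionally, with a
// rounded percentage. Illustrative sketch, not yoyo's actual renderer.
fn fill_bar(used: u64, capacity: u64, cells: usize) -> String {
    let ratio = used as f64 / capacity as f64;
    let filled = ((ratio * cells as f64).round() as usize).min(cells);
    format!(
        "{}{} {}%",
        "█".repeat(filled),
        "░".repeat(cells - filled),
        (ratio * 100.0).round() as u64
    )
}

fn main() {
    println!("{}", fill_bar(45_200, 200_000, 20));
}
```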

Documentation

  Command               Description
  /docs <crate>         Look up docs.rs documentation for a Rust crate
  /docs <crate> <item>  Look up a specific module/item within a crate

The /docs command fetches the docs.rs page for a given crate and shows a quick summary — confirming the crate exists, displaying its description, and listing the crate's API items (modules, structs, traits, enums, functions, macros). No tokens used, no AI involved.

Each category is capped at 10 items with a "+N more" suffix for large crates.

/docs serde
  ✓ serde
  📦 https://docs.rs/serde/latest/serde/
  📝 A generic serialization/deserialization framework

  Modules: de, ser
  Traits: Deserialize, Deserializer, Serialize, Serializer
  Macros: forward_to_deserialize_any

/docs tokio task
  ✓ tokio::task
  📦 https://docs.rs/tokio/latest/tokio/task/
  📝 Asynchronous green-threads...

Shell

  Command       Description
  /run <cmd>    Run a shell command directly — no AI, no tokens used
  !<cmd>        Shortcut for /run
  /bg [subcmd]  Manage background shell processes
  /web <url>    Fetch a web page and display clean readable text content

The /run command (or ! shortcut) lets you execute shell commands without going through the AI model. Useful for quick checks (e.g., !git log --oneline -5) without burning API tokens.

/run ls -la src/
/run cargo test
/run git status

/bg — Background process management

The /bg command lets you launch shell commands in the background, monitor their output, and kill them when done. Useful for long-running tasks like builds, test suites, or dev servers.

  Subcommand             Description
  /bg run <cmd>          Launch a command in the background
  /bg list               Show all background jobs (default when no subcommand)
  /bg output <id>        Show last 50 lines of a job's output
  /bg output <id> --all  Show all captured output
  /bg kill <id>          Kill a running job

/bg run cargo build --release
  ⚡ Background job [1] started: cargo build --release

/bg list
  Background Jobs
    [1]  ● running  12s  cargo build --release

/bg output 1
  ... (last 50 lines of build output)

/bg kill 1
  Killed job [1]

Output is capped at 256KB per job to prevent memory issues. Jobs display colored status: green for success, red for failure, yellow for running.
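
A tail-keeping cap like that can be sketched as follows. The 256 KB limit matches the docs; the trimming strategy itself is an assumption:

```rust
// Keep only the most recent MAX_OUTPUT bytes of a job's output.
// Illustrative sketch, not yoyo's actual job runner.
const MAX_OUTPUT: usize = 256 * 1024;

fn append_capped(buf: &mut String, chunk: &str) {
    buf.push_str(chunk);
    if buf.len() > MAX_OUTPUT {
        let mut cut = buf.len() - MAX_OUTPUT;
        while !buf.is_char_boundary(cut) {
            cut += 1; // never split a multi-byte character
        }
        buf.drain(..cut); // drop the oldest output, keep the tail
    }
}

fn main() {
    let mut out = String::new();
    append_capped(&mut out, &"x".repeat(300 * 1024));
    assert_eq!(out.len(), MAX_OUTPUT);
}
```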

/web — Fetch and read web pages

The /web command fetches a URL and extracts readable text content, stripping away HTML tags, scripts, styles, and navigation. This is useful for quickly pulling in documentation, error explanations, API references, or any web content without getting raw HTML.

/web https://doc.rust-lang.org/book/ch01-01-installation.html
/web docs.rs/serde
/web https://stackoverflow.com/questions/12345

Features:

  • Auto-prepends https:// if you omit the protocol — /web docs.rs/serde works
  • Strips noise — removes <script>, <style>, <nav>, <footer>, <header>, and <svg> blocks
  • Converts structure — headings become prominent, list items get bullets, block elements get newlines
  • Decodes entities — &amp;, &lt;, &gt;, &#NNN;, &nbsp;, etc.
  • Truncates — caps output at ~5,000 characters to keep it readable
  • No AI tokens used — pure curl + text extraction
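
Entity decoding for the forms listed above can be sketched like this. A minimal illustration covering only the common cases; a real implementation handles many more entities:

```rust
// Decode a handful of named entities plus decimal numeric forms (&#NNN;).
// Unrecognized sequences are passed through unchanged. Illustrative only.
fn decode_entities(s: &str) -> String {
    let mut out = String::new();
    let mut rest = s;
    while let Some(pos) = rest.find('&') {
        out.push_str(&rest[..pos]);
        rest = &rest[pos..];
        if let Some(semi) = rest.find(';') {
            let entity = &rest[1..semi];
            let decoded = match entity {
                "amp" => Some('&'),
                "lt" => Some('<'),
                "gt" => Some('>'),
                "quot" => Some('"'),
                "nbsp" => Some(' '),
                _ => entity
                    .strip_prefix('#')
                    .and_then(|n| n.parse::<u32>().ok())
                    .and_then(char::from_u32),
            };
            if let Some(c) = decoded {
                out.push(c);
                rest = &rest[semi + 1..];
                continue;
            }
        }
        out.push('&'); // not a recognized entity: emit literally
        rest = &rest[1..];
    }
    out.push_str(rest);
    out
}

fn main() {
    assert_eq!(decode_entities("a &lt;b&gt; &amp; &#65;"), "a <b> & A");
}
```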

Subagent & Planning

  Command        Description
  /plan <task>   Create a step-by-step plan for a task without executing anything (architect mode)
  /spawn <task>  Spawn a subagent with a fresh context to handle a task

/plan — Architect mode

The /plan command asks the AI to create a detailed, structured plan for a task without executing any tools. This is the "architect mode" equivalent — you see exactly what the agent intends to do before it does anything.

> /plan add caching to the database layer

  📋 Planning: add caching to the database layer

  ## Files to examine
  - src/db.rs — current database implementation
  - src/config.rs — configuration for cache TTL

  ## Files to modify
  - src/db.rs — add cache layer
  - src/cache.rs — new file for cache implementation
  - tests/cache_test.rs — new tests

  ## Step-by-step approach
  1. Read src/db.rs to understand current query patterns
  2. Create src/cache.rs with an LRU cache struct
  3. Wrap database queries with cache lookups
  4. Add cache invalidation on writes
  5. Add configuration for cache size and TTL

  ## Tests to write
  - Cache hit returns cached value
  - Cache miss falls through to database
  - Write invalidates relevant cache entries

  ## Potential risks
  - Cache invalidation on complex queries
  - Memory pressure with large result sets

  ## Verification
  - Run existing tests to ensure no regressions
  - Run new cache tests
  - Benchmark query latency before/after

  💡 Review the plan above. Say "go ahead" to execute it, or refine it.

After reviewing the plan, you can:

  • Say "go ahead" to have the agent execute the plan
  • Ask the agent to refine specific parts ("make the cache configurable")
  • Modify the approach ("use Redis instead of in-memory")
  • Say "no" or change direction entirely

This is especially useful for:

  • Large refactors where you want to understand the scope before committing
  • Unfamiliar codebases where you want the agent to map things out first
  • Trust and transparency — see the full plan before any files are modified
  • Teaching moments — the plan itself teaches you about the codebase structure

/spawn — Subagent

The /spawn command creates a fresh AI agent with its own independent context window, sends it your task, runs it to completion, and injects the result back into your main conversation.

This is useful for tasks that would consume a lot of context in your main session — reading large files, multi-step analysis, exploring unfamiliar code — without polluting your primary conversation history.

/spawn read all files in src/ and summarize the architecture
/spawn find all TODO comments in the codebase and list them
/spawn analyze the test coverage and suggest gaps

The subagent has access to the same tools (bash, file operations, etc.) and uses the same model. Its token usage counts toward your session total, but its context is completely separate from your main conversation. When it finishes, a summary of the task and result is injected into your main conversation so you have awareness of what was done.

Automatic sub-agent delegation: In addition to /spawn, the model can autonomously delegate subtasks to a built-in sub_agent tool. This happens transparently — the model decides when a subtask benefits from a fresh context window (e.g., researching a codebase section, running a series of tests). You'll see a 🐙 indicator when delegation occurs.

Git

  Command                      Description
  /git status                  Show working tree status (git status --short) — quick shortcut
  /git log [n]                 Show last n commits (default: 5) via git log --oneline
  /git add <path>              Stage files for commit
  /git stash                   Stash uncommitted changes
  /git stash pop               Restore stashed changes
  /git stash list              List all stash entries with colored output
  /git stash show [n]          Show diff of stash entry (default: latest)
  /git stash drop [n]          Drop a stash entry (default: latest)
  /commit [msg]                Commit staged changes — generates a conventional commit message if no msg provided
  /diff                        Show colored file summary, change stats, and full diff of uncommitted changes
  /blame <file>                Show colorized git blame output (/blame file:10-20 for line ranges)
  /undo                        Revert all uncommitted changes (git checkout -- . and git clean -fd)
  /pr [number]                 List open PRs (gh pr list), or view a specific PR (gh pr view <number>)
  /pr create [--draft]         Create a PR with an AI-generated title and description
  /pr <number> diff            Show the diff of a PR (gh pr diff <number>)
  /pr <number> comment <text>  Add a comment to a PR (gh pr comment <number>)
  /pr <number> checkout        Checkout a PR branch locally (gh pr checkout <number>)
  /health                      Run project health checks — auto-detects project type, reports pass/fail with timing
  /test                        Auto-detect and run project tests — shows output with timing
  /lint                        Auto-detect and run project linter — shows output with timing, feeds failures to agent context
  /lint pedantic               Run with pedantic clippy lints (Rust only)
  /lint strict                 Run with pedantic + nursery clippy lints (Rust only)
  /lint fix                    Run linter and auto-send failures to AI for fixing
  /lint unsafe                 Scan for unsafe code blocks and suggest safety attributes (Rust only)
  /fix                         Auto-fix build/lint errors — runs health checks, sends failures to the AI agent for fixing
  /update                      Self-update yoyo to the latest GitHub release — detects platform, downloads, replaces the binary

The /git command is a convenience wrapper for common git operations — no AI tokens burned, and no need to type /run git ... by hand. For example:

/git status          # instead of /run git status --short
/git log 10          # instead of /run git log --oneline -10
/git add src/main.rs # stage a file
/git stash           # stash changes
/git stash pop       # restore stash
/git stash list      # see all stash entries
/git stash show 1    # view diff of stash@{1}
/git stash drop 0    # drop the latest stash

The /commit command helps you commit staged changes quickly:

  • /commit (no arguments): reads your staged diff, generates a conventional commit message (e.g., feat(main): add changes), and asks for confirmation — press y to accept, n to cancel, or e to edit
  • /commit fix: typo in README: commits directly with your provided message
  • If nothing is staged, it reminds you to git add first

The /undo command shows you what will be reverted before doing it.

The /pr command is a quick wrapper around the GitHub CLI:

  • /pr — list the 10 most recent open pull requests
  • /pr create — create a PR with an AI-generated title and description from your branch's diff and commits
  • /pr create --draft — same, but as a draft PR
  • /pr 42 — view details of PR #42
  • /pr 42 diff — show the diff for PR #42
  • /pr 42 comment looks good! — add a comment to PR #42
  • /pr 42 checkout — checkout PR #42's branch locally

For merging or closing PRs, use /run gh pr ... or ask the agent directly — it has full bash access.

The /health command auto-detects your project type by looking for marker files and runs the appropriate checks:

  • Rust (Cargo.toml): cargo build, cargo test, cargo clippy, cargo fmt --check
  • Node.js (package.json): npm test, npx eslint .
  • Python (pyproject.toml, setup.py, setup.cfg): pytest, flake8, mypy
  • Go (go.mod): go build, go test, go vet
  • Makefile (Makefile): make test

If no recognized project type is found, it shows a helpful message listing the marker files it looked for.

The /test command is a focused shortcut that only runs the test suite for your project (e.g., cargo test, npm test, python -m pytest, go test ./..., make test). It auto-detects the project type the same way /health does, but runs just the tests — with full output and timing. This is handy for a quick test run without the full suite of lint/build checks that /health performs.

The /lint command is similar to /test but runs only the linter for your project. It auto-detects the project type and runs the appropriate linter:

  • Rust: cargo clippy --all-targets -- -D warnings
  • Node.js: npx eslint .
  • Python: ruff check .
  • Go: golangci-lint run

For Rust projects, you can increase clippy's strictness:

  • /lint pedantic — adds -W clippy::pedantic for stricter style checks
  • /lint strict — adds -W clippy::pedantic -W clippy::nursery for maximum analysis

Strictness levels only affect Rust projects; other languages use their default linter regardless.

When lint fails, the error output is automatically fed into the agent context so you can ask the AI about the errors in your next message. For fully automated fixing, use /lint fix — this runs the linter and, if there are failures, sends them directly to the AI agent for correction (similar to /fix but lint-only).

The /fix command goes one step further than /health — it runs the same health checks, but when any check fails, it sends the full error output to the AI agent with a prompt to fix the issues. The AI reads the relevant files, understands the errors, and applies fixes using its tools. After fixing, it re-runs the checks to verify. This is particularly useful for quickly resolving lint warnings, format issues, or build errors.

/fix
  Detected project: Rust (Cargo)
  Running health checks...
  ✓ build: ok
  ✗ clippy: FAIL
  ✓ fmt: ok

  Sending 1 failure(s) to AI for fixing...

/update — Self-update to latest release

The /update command checks GitHub for the latest release and downloads the new binary in-place.

/update
  Update available: v0.1.5 → v0.2.0
  This will download and replace the current binary.
  Continue? [y/N] y
  Downloading yoyo-x86_64-unknown-linux-gnu.tar.gz...
  ✓ Updated to v0.2.0! Please restart yoyo to use the new version.

The command:

  • Detects your platform (Linux x86_64, macOS Intel/ARM, Windows x86_64)
  • Creates a backup of the current binary before replacing
  • Restores the backup if anything goes wrong
  • Suggests manual install instructions as a fallback

If you're running a development build (from cargo build), it will suggest using cargo install yoyo-agent instead.

Code Review

  Command         Description
  /review         AI-powered review of staged changes (falls back to unstaged if nothing staged)
  /review <path>  AI-powered review of a specific file

The /review command sends your code to the AI for a thorough review covering:

  1. Bugs — logic errors, off-by-one errors, null handling, race conditions
  2. Security — injection vulnerabilities, unsafe operations, credential exposure
  3. Style — naming, idiomatic patterns, unnecessary complexity, dead code
  4. Performance — obvious inefficiencies, unnecessary allocations
  5. Suggestions — improvements, missing error handling, better approaches

/review              # review staged changes (or unstaged if nothing staged)
/review src/main.rs  # review a specific file
/review Cargo.toml   # review any file

This is one of the most common workflows for developers using coding agents — getting a second pair of eyes on your changes before committing.

Refactoring

Command                                Description
/refactor                              Show all refactoring tools with examples
/rename <old> <new>                    Cross-file symbol renaming with word-boundary matching
/extract <symbol> <source> <target>    Move a symbol (fn, struct, enum, trait, type, const, static) between files
/move <Src>::<method> [file::]<Dst>    Move a method between impl blocks (same file or cross-file)

/refactor — Refactoring tools overview

The /refactor command is an umbrella that shows all available refactoring tools at a glance. Run it with no arguments to see a summary with examples:

/refactor

You can also use it as a dispatch to any refactoring subcommand:

/refactor rename MyOldStruct MyNewStruct
/refactor extract parse_config src/lib.rs src/config.rs
/refactor move Parser::validate Validator

These are equivalent to calling /rename, /extract, or /move directly — use whichever form you prefer.

/rename — Cross-file symbol renaming

The /rename command does a smart find-and-replace across all git-tracked files, respecting word boundaries (renaming foo won't change foobar or my_foo). Shows a preview of all matches, then asks for confirmation.

/rename my_func new_func
/rename OldStruct NewStruct
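
The word-boundary rule can be sketched in a few lines: an occurrence of the old name is replaced only when the characters on either side are not identifier characters. This is an illustrative sketch, not yoyo's actual implementation:

```rust
// Sketch of word-boundary renaming: replace `old` with `new` in one line,
// but only where the match is not embedded in a longer identifier.
// Illustrative only — not yoyo's actual implementation.
fn rename_line(line: &str, old: &str, new: &str) -> String {
    let is_ident = |c: char| c.is_alphanumeric() || c == '_';
    let chars: Vec<char> = line.chars().collect();
    let olds: Vec<char> = old.chars().collect();
    let mut out = String::new();
    let mut i = 0;
    while i < chars.len() {
        let end = i + olds.len();
        let matches = end <= chars.len() && chars[i..end] == olds[..];
        let left_ok = i == 0 || !is_ident(chars[i - 1]);
        let right_ok = end >= chars.len() || !is_ident(chars[end]);
        if matches && left_ok && right_ok {
            out.push_str(new);
            i = end;
        } else {
            out.push(chars[i]);
            i += 1;
        }
    }
    out
}

fn main() {
    // `foo` is renamed, but `my_foo` and `foobar` are left alone.
    assert_eq!(rename_line("foo(my_foo, foobar)", "foo", "qux"), "qux(my_foo, foobar)");
    println!("ok");
}
```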

/extract — Move symbols between files

The /extract command moves a top-level item (function, struct, enum, impl, trait, type alias, const, or static) from one file to another. It uses brace-depth tracking to find the full block, including doc comments and attributes above the declaration.

/extract my_func src/lib.rs src/utils.rs
/extract MyStruct src/main.rs src/types.rs
/extract MyTrait src/old.rs src/new.rs
/extract MyResult src/lib.rs src/errors.rs
/extract MAX_SIZE src/config.rs src/constants.rs

The command shows a preview of the block to be moved and asks for confirmation before making changes. If the target file doesn't exist, it's created. If the symbol is public, yoyo notes that you may need to add a use import in the source file.
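
The brace-depth idea can be illustrated with a small sketch: starting from the declaration line, count `{` and `}` until the depth returns to zero. A real extractor must also handle braces inside strings and comments, attributes, and brace-less items like `const`; this sketch skips all of that:

```rust
// Sketch of brace-depth block extraction: given source lines and the index
// of a declaration line, collect lines until the braces balance.
// Illustrative only — not yoyo's actual implementation.
fn extract_block(lines: &[&str], start: usize) -> Vec<String> {
    let mut depth = 0i32;
    let mut seen_open = false;
    let mut out = Vec::new();
    for line in &lines[start..] {
        out.push(line.to_string());
        for c in line.chars() {
            match c {
                '{' => { depth += 1; seen_open = true; }
                '}' => depth -= 1,
                _ => {}
            }
        }
        if seen_open && depth == 0 {
            break; // the block closed: stop before the next item
        }
    }
    out
}

fn main() {
    let src = ["fn keep() {", "    if cond {", "    }", "}", "fn other() {}"];
    let block = extract_block(&src, 0);
    assert_eq!(block.len(), 4); // `fn other` is not captured
    println!("ok");
}
```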

/move — Relocate methods between impl blocks

The /move command moves a method from one impl block to another, within the same file or across files. It extracts the method (including doc comments and attributes), re-indents it to match the target block, and inserts it before the closing }. Shows a preview and asks for confirmation.

/move MyStruct::process TargetStruct           # same file
/move Parser::parse_expr other.rs::Lexer       # cross-file
/move Config::validate Settings                # same file

If the method uses self. references, yoyo warns you to verify that the field/method references are valid on the target type. This is a common source of bugs when relocating methods between different types.

rename_symbol — Agent-invocable rename tool

In addition to the interactive /rename REPL command, yoyo exposes a rename_symbol tool that the AI agent can call directly. This means the agent can rename symbols across files in a single tool call instead of issuing multiple edit_file calls — faster and more reliable for large refactors.

The tool accepts:

  • old_name (required) — the current symbol name
  • new_name (required) — the replacement name
  • path (optional) — limit scope to a specific file or directory

Like write_file and edit_file, rename_symbol asks for user confirmation before making changes (unless --yes is passed).

ask_user — Let the model ask you questions

The agent can ask you directed questions mid-task using the ask_user tool. Instead of guessing at your preferences or making assumptions, the model can pause and ask for clarification — a preference, a decision, or context that isn't available in the codebase.

This tool is only available in interactive mode (when stdin is a terminal). In piped mode, the tool is not registered — the model works with what it has.

The question appears with a ❓ prompt, and you type your response directly. If you press Enter with no text or hit EOF, the model receives a "(no response)" indicator and continues on its own.

Project Context

Command                   Description
/add <path>               Add file contents into the conversation — the AI sees them immediately
/explain <file>           Read code from a file and ask the agent to explain it
/context [system]         Show which project context files are loaded, or use /context system to see system prompt sections with token estimates
/find <pattern>           Fuzzy-search project files by name — respects .gitignore, ranked by relevance
/grep <pattern> [path]    Search file contents directly — no AI, no tokens, instant results
/index                    Build a lightweight index of all project source files — shows path, line count, and first-line summary
/init                     Scan the project and generate a YOYO.md context file with detected build commands, key files, and project structure
/tree [depth]             Show project directory tree (default depth: 3, respects .gitignore)

/add — Inject file contents into conversation

The /add command reads files and injects their contents directly into the conversation as a user message. The AI sees the file immediately without needing to call read_file — similar to Claude Code's @file feature.

/add src/main.rs
  ✓ added src/main.rs (850 lines)
  (1 file added to conversation)

/add src/main.rs:1-50
  ✓ added src/main.rs (lines 1-50) (50 lines)
  (1 file added to conversation)

/add src/*.rs
  ✓ added src/cli.rs (400 lines)
  ✓ added src/commands.rs (3000 lines)
  ✓ added src/main.rs (850 lines)
  (3 files added to conversation)

/add Cargo.toml README.md
  ✓ added Cargo.toml (28 lines)
  ✓ added README.md (50 lines)
  (2 files added to conversation)

Features:

  • Line ranges — /add path:start-end injects only the specified lines
  • Glob patterns — /add src/*.rs expands to all matching files
  • Multiple files — /add file1 file2 adds both in one message
  • Syntax highlighting — content is wrapped in fenced code blocks with language detection
  • No AI tokens used for reading — the file is read locally and injected directly

This is the fastest way to give the AI context about specific files without waiting for it to call tools.

/find — Fuzzy file search by name

The /find command does fuzzy substring matching across all tracked files in your project (via git ls-files, falling back to a directory walk if not in a git repo). Results are ranked by relevance — filename matches score higher than directory matches, and matches at the start of the filename rank highest.

/find main
  3 files matching 'main':
    src/main.rs
    site/book/index.html
    scripts/main_helper.sh

/find .toml
  2 files matching '.toml':
    Cargo.toml
    docs/book.toml
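
The ranking described above can be sketched as a tiered scorer. This toy version uses plain substring matching (yoyo's actual fuzzy matcher is more forgiving), but it shows the tier order: filename-start beats filename-anywhere beats path-anywhere:

```rust
// Toy relevance scorer for /find-style ranking: higher score = better match.
// Illustrative only — not yoyo's actual fuzzy matcher.
fn score(path: &str, pattern: &str) -> Option<u32> {
    let p = path.to_lowercase();
    let pat = pattern.to_lowercase();
    let name = p.rsplit('/').next().unwrap_or(p.as_str());
    if name.starts_with(&pat) {
        Some(3) // match at the start of the filename: best
    } else if name.contains(&pat) {
        Some(2) // match elsewhere in the filename
    } else if p.contains(&pat) {
        Some(1) // match only in the directory part
    } else {
        None // no match: excluded from results
    }
}

fn main() {
    assert_eq!(score("src/main.rs", "main"), Some(3));
    assert!(score("src/main.rs", "main") > score("docs/main-notes/readme.md", "main"));
    println!("ok");
}
```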

/grep — Search file contents directly

The /grep command searches file contents without using the AI — no tokens, no API call, instant results. This is one of the fastest ways to find code in your project.

/grep TODO
  src/main.rs:42: // TODO: handle edge case
  src/cli.rs:15: // TODO: add validation
  
  2 matches

/grep "fn main" src/
  src/main.rs:10: fn main() {
  
  1 match

/grep -s MyStruct src/lib.rs
  src/lib.rs:5: pub struct MyStruct {
  src/lib.rs:20: impl MyStruct {
  
  2 matches

Features:

  • Case-insensitive by default — use -s or --case for case-sensitive search
  • Git-aware — uses git grep in git repos (faster, respects .gitignore), falls back to grep -rn
  • Colored output — filenames in green, line numbers in cyan, matches highlighted in yellow
  • Truncated results — shows up to 50 matches with a "narrow your search" hint
  • Optional path — /grep pattern src/ restricts search to a specific file or directory

/tree — Project directory tree

The /tree command uses git ls-files to show tracked files in a visual tree structure, automatically respecting your .gitignore. You can specify a depth limit:

/tree        # default: 3 levels deep
/tree 1      # just top-level directories and their files
/tree 5      # deeper view

Example output:

src/
  cli.rs
  format.rs
  main.rs
  prompt.rs
Cargo.toml
README.md

/index — Codebase indexing

The /index command builds a lightweight in-memory index of your project's source files. For each text file tracked by git (or found via directory walk), it shows:

  • Path — the file path relative to the project root
  • Lines — the total line count
  • Summary — the first meaningful line (skipping blank lines), which is typically a doc comment, module declaration, or import statement

Binary files (images, fonts, archives, etc.) are automatically skipped.

/index
  Building project index...
  Path                Lines  Summary
  ──────────────────  ─────  ────────────────────────────────────────
  Cargo.toml             18  [package]
  src/cli.rs            400  //! CLI argument parsing and configuration.
  src/commands.rs      4500  //! REPL command handlers for yoyo.
  src/main.rs           850  //! yoyo — a coding agent that evolves itself.
  README.md              50  # yoyo

  5 files, 5818 total lines

This gives you a quick bird's-eye view of the entire codebase without needing to run find, list_files, or wc -l manually.

/map — Structural codebase map

The /map command generates a structural summary of your codebase, extracting function signatures, struct/class/trait/enum definitions, constants, and other symbols from source files. This is like a "table of contents" for your entire project.

/map
  Building repo map...

src/main.rs (850 lines)
  pub fn main
  pub struct AgentConfig
  impl AgentConfig

src/cli.rs (400 lines)
  pub fn parse_args
  pub struct Config
  pub const SYSTEM_PROMPT
  ...

  45 symbols across 8 files (using ast-grep)

Usage:

Command           Description
/map              Map entire project (public symbols only)
/map src/         Map only files under a specific directory
/map --all        Include private/non-exported symbols
/map --all src/   All symbols under a specific directory
/map --regex      Force regex backend (skip ast-grep)

Supported languages: Rust, Python, JavaScript, TypeScript, Go, Java.

ast-grep integration: When ast-grep (sg) is installed, /map uses it for more accurate AST-based symbol extraction. When ast-grep is not available, it falls back to built-in regex extractors. The output footer shows which backend was used. Use --regex to force the regex backend for comparison or debugging.

Automatic system prompt integration: The repo map is automatically included in the system prompt at the start of every session, giving the AI structural awareness of your codebase without you needing to manually add files. This is similar to Aider's repo-map feature. The system prompt version is limited to public symbols and capped at ~16K characters to avoid bloating context.

Project Onboarding with /init

The /init command scans your project and generates a YOYO.md context file automatically. It:

  1. Detects the project type — Rust, Node.js, Python, Go, or Makefile-based projects
  2. Finds the project name — from Cargo.toml, package.json, README.md title, or directory name
  3. Lists important files — README, config files, CI configs, lock files, etc.
  4. Lists key directories — src/, tests/, docs/, scripts/, etc.
  5. Generates build commands — cargo build, npm test, go test ./..., etc., based on project type

/init
  Scanning project...
  Detected: Rust
  ✓ Created YOYO.md (32 lines) — edit it to add project context.

If YOYO.md or CLAUDE.md already exists, /init won't overwrite it. The generated file is a starting point — edit it to add your project's specific conventions and instructions.

Project Memory

Command             Description
/remember <note>    Save a project-specific note that persists across sessions
/memories [query]   List all memories, or search by keyword
/forget <number>    Remove a memory by its number

Project memories let you teach yoyo things about your project that it should always know — build quirks, team conventions, infrastructure requirements. Memories are stored in .yoyo/memory.json in your project root and are automatically injected into the system prompt at the start of every session.
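
The on-disk format is plain JSON. A hypothetical .yoyo/memory.json might look like this (the field names here are illustrative, not a format guarantee):

```json
[
  { "note": "this project uses sqlx for database access", "created": "2026-03-15T08:32:00Z" },
  { "note": "tests require docker running", "created": "2026-03-15T08:33:00Z" }
]
```

Because the file lives in your project root, you can commit it to share memories with your team, or add .yoyo/ to .gitignore to keep them personal.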

Example workflow

> /remember this project uses sqlx for database access
  ✓ Remembered: "this project uses sqlx for database access" (1 total memories)

> /remember tests require docker running
  ✓ Remembered: "tests require docker running" (2 total memories)

> /memories
  Project memories (2):
    [0] this project uses sqlx for database access (2026-03-15 08:32)
    [1] tests require docker running (2026-03-15 08:33)

> /forget 0
  ✓ Forgot: "this project uses sqlx for database access" (1 memories remaining)

> /memories docker
  Found 1 memory matching 'docker':
    [1] tests require docker running (2026-03-15 08:33)

Use /memories <query> to filter by keyword when you have many memories. The search is case-insensitive.

Use /remember any time you find yourself repeating the same instruction to the agent. The memory will be there next time you start a session in this project directory.

Unknown commands

If you type a /command that yoyo doesn't recognize, it will tell you:

  unknown command: /foo
  type /help for available commands

Note: for a line starting with /, only the first word is checked against the command list — anything after a space (like name in /model name) is passed to the command as arguments, so such lines are never reported as unknown commands.

Multi-Line Input

yoyo supports two ways to enter multi-line input.

Backslash continuation

End a line with \ to continue on the next line:

main > Please review this code and \
  ...  check for any bugs or \
  ...  performance issues.

The backslash and newline are removed, and the lines are joined. The ... prompt indicates yoyo is waiting for more input.

Code fences

Start a line with triple backticks (```) to enter a fenced code block. Everything until the closing ``` is collected as a single input:

main > ```
  ...  Here is a function I want you to review:
  ...  
  ...  fn parse(input: &str) -> Result<Config, Error> {
  ...      let data = serde_json::from_str(input)?;
  ...      Ok(Config::from(data))
  ...  }
  ...  
  ...  Is this handling errors correctly?
  ...  ```

This is useful for pasting code or structured text that spans multiple lines.

Models & Providers

yoyo supports 13 providers out of the box — from Anthropic and OpenAI to local models via Ollama.

Default model

The default model is claude-opus-4-6 (Anthropic). You can change it at startup or mid-session.

Changing the model

At startup:

yoyo --model claude-sonnet-4-20250514
yoyo --model gpt-4o --provider openai
yoyo --model llama3.2 --provider ollama

During a session:

/model claude-sonnet-4-20250514

Note: Switching models with /model preserves your conversation history — you can change models mid-task without losing context.

Providers

Use --provider <name> to select a provider. Each provider has a default model and an environment variable for its API key.

Tip: If you run yoyo without any API key configured, an interactive setup wizard will walk you through choosing a provider and entering your key. You can also save the config to .yoyo.toml directly from the wizard.

Provider              Default Model                        API Key Env Var
anthropic (default)   claude-opus-4-6                      ANTHROPIC_API_KEY
openai                gpt-4o                               OPENAI_API_KEY
google                gemini-2.0-flash                     GOOGLE_API_KEY
openrouter            anthropic/claude-sonnet-4-20250514   OPENROUTER_API_KEY
ollama                llama3.2                             (none — local)
xai                   grok-3                               XAI_API_KEY
groq                  llama-3.3-70b-versatile              GROQ_API_KEY
deepseek              deepseek-chat                        DEEPSEEK_API_KEY
mistral               mistral-large-latest                 MISTRAL_API_KEY
cerebras              llama-3.3-70b                        CEREBRAS_API_KEY
zai                   glm-4-plus                           ZAI_API_KEY
minimax               MiniMax-M2.7                         MINIMAX_API_KEY
custom                claude-opus-4-6                      (none — bring your own)

Examples

# OpenAI
OPENAI_API_KEY=sk-... yoyo --provider openai

# Google Gemini
GOOGLE_API_KEY=... yoyo --provider google --model gemini-2.5-pro

# Local with Ollama (no API key needed)
yoyo --provider ollama --model llama3.2

# Custom endpoint (OpenAI-compatible API)
yoyo --provider custom --base-url http://localhost:8080/v1 --model my-model

You can also set these in .yoyo.toml:

provider = "openai"
model = "gpt-4o"
base_url = "https://api.openai.com/v1"

Cost estimation

Cost estimation is built in for many providers:

Model Family    Input (per MTok)    Output (per MTok)
Opus 4.5/4.6    $5.00               $25.00
Opus 4/4.1      $15.00              $75.00
Sonnet          $3.00               $15.00
Haiku 4.5       $1.00               $5.00
Haiku 3.5       $0.80               $4.00

Cost estimates are also available for OpenAI, Google, DeepSeek, Mistral, xAI, Groq, ZAI and more.
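
The estimate itself is simple arithmetic over the token counts: dollars = tokens / 1,000,000 × rate, summed for input and output. For example, using the Sonnet rates from the table above:

```rust
// Per-turn cost estimate: dollars = tokens / 1e6 * per-MTok rate,
// summed over input and output. Rates are the Sonnet figures above.
fn cost_usd(input_tokens: u64, output_tokens: u64, in_rate: f64, out_rate: f64) -> f64 {
    input_tokens as f64 / 1e6 * in_rate + output_tokens as f64 / 1e6 * out_rate
}

fn main() {
    // 10K input + 2K output tokens at Sonnet pricing ($3 in / $15 out per MTok)
    let c = cost_usd(10_000, 2_000, 3.0, 15.0);
    assert!((c - 0.06).abs() < 1e-9); // $0.03 input + $0.03 output = $0.06
    println!("ok");
}
```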

Context window

yoyo assumes a 200,000-token context window (the standard for Claude models). When usage exceeds 80% of this, auto-compaction kicks in. See Context Management.
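
In concrete terms, the trigger fires once usage crosses 160,000 tokens (80% of 200,000). A sketch of the check:

```rust
// Sketch of the auto-compaction trigger: compact once usage exceeds 80%
// of the assumed 200K-token window. Illustrative only.
const CONTEXT_WINDOW: u64 = 200_000;

fn should_compact(used_tokens: u64) -> bool {
    used_tokens as f64 / CONTEXT_WINDOW as f64 > 0.8
}

fn main() {
    assert!(!should_compact(150_000)); // 75%: still fine
    assert!(should_compact(165_000));  // 82.5%: compact
    println!("ok");
}
```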

System Prompts

yoyo has a built-in system prompt that instructs the model to act as a coding assistant. You can override it entirely via CLI flags or config file.

Default behavior

The default system prompt tells the model to:

  • Work as a coding assistant in the user's terminal
  • Be direct and concise
  • Use tools proactively (read files, run commands, verify work)
  • Do things rather than just explain how

Custom system prompt

Inline (CLI flag):

yoyo --system "You are a Rust expert. Focus on performance and safety."

From a file (CLI flag):

yoyo --system-file my-prompt.txt

In config file (.yoyo.toml):

# Inline text
system_prompt = "You are a Go expert. Follow Go idioms strictly."

# Or read from a file
system_file = "prompts/system.txt"

If both system_prompt and system_file are set in the config, system_file takes precedence (same as CLI behavior).

Precedence

When multiple sources provide a system prompt, the highest-priority one wins:

  1. --system-file (CLI flag) — highest priority
  2. --system (CLI flag)
  3. system_file (config file key)
  4. system_prompt (config file key)
  5. Built-in default — lowest priority

This means CLI flags always override config file values, and file-based prompts override inline text at each level.
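
The precedence rule amounts to a first-set-source-wins chain. A minimal sketch (the struct and field names are illustrative, not yoyo's actual config types, and file sources would of course be read from disk rather than used verbatim):

```rust
// Sketch of system-prompt precedence: the first source that is set wins,
// in the order listed above. Field names are illustrative only.
struct PromptSources {
    cli_system_file: Option<String>,  // --system-file
    cli_system: Option<String>,       // --system
    cfg_system_file: Option<String>,  // system_file in config
    cfg_system_prompt: Option<String> // system_prompt in config
}

fn resolve(s: &PromptSources, builtin_default: &str) -> String {
    s.cli_system_file.clone()
        .or_else(|| s.cli_system.clone())
        .or_else(|| s.cfg_system_file.clone())
        .or_else(|| s.cfg_system_prompt.clone())
        .unwrap_or_else(|| builtin_default.to_string())
}

fn main() {
    let s = PromptSources {
        cli_system_file: None,
        cli_system: Some("cli".into()),
        cfg_system_file: Some("cfg-file".into()),
        cfg_system_prompt: None,
    };
    // The CLI flag beats the config key even though both are set.
    assert_eq!(resolve(&s, "builtin"), "cli");
    println!("ok");
}
```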

Use cases

Custom system prompts are useful for:

  • Specializing the agent — focus on security review, documentation, or a specific language
  • Project context — tell the agent about your project's conventions
  • Team defaults — commit .yoyo.toml with system_prompt or system_file so every developer gets the same agent persona
  • Persona tuning — make the agent more or less verbose, formal, etc.

Viewing the assembled prompt

To see the full system prompt (including project context, repo map, skills, and any overrides), use:

yoyo --print-system-prompt

This prints the complete prompt to stdout and exits — useful for debugging or understanding exactly what context the model receives. It works with other flags:

# See what the prompt looks like with a custom system prompt
yoyo --system "You are a Rust expert" --print-system-prompt

# See the prompt without project context
yoyo --no-project-context --print-system-prompt

Inspecting during a session

Once inside the REPL, use /context system to see the system prompt broken into sections with approximate token counts for each:

/context system

This shows each markdown section (headers like # ... and ## ...), their line counts, estimated token usage, and a brief preview — without leaving the session.

Automatic project context

In addition to the system prompt, yoyo automatically injects project context when available:

  • Project instructions — from YOYO.md (primary), CLAUDE.md (compatibility alias), or .yoyo/instructions.md
  • Project file listing — from git ls-files (up to 200 files)
  • Recently changed files — from git log (up to 20 files)
  • Git status — current branch, count of uncommitted and staged changes
  • Project memories — from memory/ files if present

Use /context to see which project context files are loaded.

Example prompt file

You are a senior Rust developer reviewing code for a production system.
Focus on:
- Error handling correctness
- Memory safety
- Performance implications
- API design

Be concise. Point out issues with line numbers.

Save as review-prompt.txt and use:

# Via CLI flag
yoyo --system-file review-prompt.txt -p "review src/main.rs"

Or set it in your project's .yoyo.toml:

system_file = "review-prompt.txt"

Extended Thinking

Extended thinking gives the model more "reasoning time" before responding. This can improve quality for complex tasks like debugging, architecture decisions, or multi-step refactoring.

Usage

yoyo --thinking high
yoyo --thinking medium
yoyo --thinking low
yoyo --thinking minimal
yoyo --thinking off

Levels

Level     Aliases   Description
off       none      No extended thinking (default)
minimal   min       Very brief reasoning
low                 Short reasoning
medium    med       Moderate reasoning
high      max       Deep reasoning — best for complex tasks

Levels are case-insensitive: HIGH, High, and high all work.

If you provide an unrecognized level, yoyo defaults to medium with a warning.

When to use it

  • Complex debugging — use high when the bug is subtle
  • Architecture decisions — use medium or high for design questions
  • Simple tasks — use off (the default) for quick file reads, simple edits, etc.

Output

When thinking is enabled, the model's reasoning is shown dimmed in the output so you can follow along without it cluttering the main response.

Trade-offs

Higher thinking levels use more tokens (and thus cost more) but often produce better results for hard problems. For routine tasks, the overhead isn't worth it.

Skills

Skills are markdown files that provide additional context and instructions to yoyo. They're loaded at startup and added to the agent's context.

Usage

yoyo --skills ./skills

You can pass multiple skill directories:

yoyo --skills ./skills --skills ./my-custom-skills

What is a skill?

A skill file is a markdown file with YAML frontmatter. It contains instructions, rules, or context that the agent should follow. For example:

---
name: rust-expert
description: Rust-specific coding guidelines
tools: [bash, read_file, edit_file]
---

# Rust Guidelines

- Always use `clippy` before committing
- Prefer `?` over `.unwrap()` in production code
- Write tests for every public function

Built-in skills

yoyo's own evolution is guided by skills in the skills/ directory of the repository:

  • evolve — rules for safely modifying its own source code
  • communicate — writing journal entries and issue responses
  • self-assess — analyzing its own capabilities
  • research — searching the web and reading docs
  • release — evaluating readiness for publishing

MCP servers

yoyo can connect to Model Context Protocol (MCP) servers, giving the agent access to external tools provided by any MCP-compatible server. Use the --mcp flag with a shell command that starts the server via stdio:

yoyo --mcp "npx -y @modelcontextprotocol/server-fetch"

The flag is repeatable — connect to multiple MCP servers in a single session:

yoyo \
  --mcp "npx -y @modelcontextprotocol/server-fetch" \
  --mcp "npx -y @modelcontextprotocol/server-github" \
  --mcp "python my_custom_server.py"

MCP in config files

You can also configure MCP servers in .yoyo.toml, ~/.yoyo.toml, or ~/.config/yoyo/config.toml, so they connect automatically without needing CLI flags:

mcp = ["npx -y @modelcontextprotocol/server-fetch", "npx open-websearch@latest"]

MCP servers from the config file are merged with any --mcp CLI flags — both sources contribute. CLI flags are additive, not overriding.

Each --mcp command is launched as a child process. yoyo communicates with it over stdio using the MCP protocol, discovers the tools it offers, and makes them available to the agent alongside the built-in tools.

Tool-name collisions

yoyo's built-in tools (bash, read_file, write_file, edit_file, list_files, search, rename_symbol, ask_user, todo, sub_agent) take precedence over MCP tools. If an MCP server exposes a tool with one of those names, yoyo skips the entire server at connect time with a warning on stderr. Otherwise the colliding tool would cause the provider API to reject the first turn with "Tool names must be unique" and kill the session.

Note: @modelcontextprotocol/server-filesystem exposes read_file and write_file and will therefore be skipped. Prefer servers with distinct tool names such as @modelcontextprotocol/server-fetch, @modelcontextprotocol/server-memory, or @modelcontextprotocol/server-sequential-thinking — or a filesystem server that prefixes its tools (e.g. fs_read_file).

OpenAPI specs

You can give yoyo access to any HTTP API by pointing it at an OpenAPI specification file. yoyo parses the spec and registers each endpoint as a callable tool:

yoyo --openapi ./petstore.yaml

Like --mcp, this flag is repeatable:

yoyo --openapi ./api-v1.yaml --openapi ./internal-api.json

Both YAML and JSON spec formats are supported.

Additional configuration flags

Beyond skills, MCP, and OpenAPI, a few other flags fine-tune agent behavior:

--temperature <float>

Set the sampling temperature (0.0–1.0). Lower values make output more deterministic; higher values make it more creative. Defaults to the model's own default.

yoyo --temperature 0.2   # More focused/deterministic
yoyo --temperature 0.9   # More creative/varied

--max-turns <int>

Limit the number of agentic turns (tool-use loops) per prompt. Defaults to 50. Useful for keeping costs predictable or preventing runaway tool loops:

yoyo --max-turns 10

Both flags can also be set in .yoyo.toml:

temperature = 0.5
max_turns = 20

--no-bell

Disable the terminal bell notification that rings after long-running prompts (≥3 seconds). By default, yoyo sends a bell character (\x07) when a prompt completes, which causes most terminals to flash the tab or play a sound — useful when you switch away while waiting. Disable it with the flag or environment variable:

yoyo --no-bell
YOYO_NO_BELL=1 yoyo

--no-update-check

Skip the startup update check. On startup (interactive REPL mode only), yoyo checks GitHub for a newer release and shows a notification if one exists. The check uses a 3-second timeout and fails silently on network errors. Disable it with the flag or environment variable:

yoyo --no-update-check
YOYO_NO_UPDATE_CHECK=1 yoyo

The update check is automatically skipped in non-interactive modes (piped input, --prompt flag).

YOYO_SESSION_BUDGET_SECS

Soft wall-clock budget for an entire yoyo session, in seconds. Unset by default — interactive sessions are unbounded. When set, yoyo exposes a session_budget_remaining() helper that long-running loops (like the self-evolution pipeline) can poll to voluntarily wind down before an external timeout cancels them.

YOYO_SESSION_BUDGET_SECS=2700 yoyo   # 45-minute soft budget

The timer starts on the first call to the helper, not at process startup, so CI cold-start time doesn't burn the budget. If the env var is set but unparseable, yoyo falls back to the 45-minute default rather than silently disabling the guard. This was added to mitigate hourly cron overlap in the evolution workflow (#262).
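
A lazily-started budget of this shape can be sketched as follows. This is illustrative only — the struct, method names, and exact fallback behavior are assumptions, not yoyo's actual code:

```rust
use std::time::{Duration, Instant};

// Sketch of a soft session budget: the timer starts on the first call to
// `remaining`, and an unparseable value falls back to a default rather
// than silently disabling the guard. Illustrative only.
struct SessionBudget {
    total: Duration,
    started: Option<Instant>, // None until first poll
}

impl SessionBudget {
    fn from_env(var: Option<&str>, default_secs: u64) -> Option<Self> {
        let raw = var?; // unset: no budget at all, session is unbounded
        let secs = raw.parse::<u64>().unwrap_or(default_secs);
        Some(SessionBudget { total: Duration::from_secs(secs), started: None })
    }

    fn remaining(&mut self) -> Duration {
        // Start the clock on first use, so cold-start time isn't counted.
        let start = *self.started.get_or_insert_with(Instant::now);
        self.total.saturating_sub(start.elapsed())
    }
}

fn main() {
    let mut b = SessionBudget::from_env(Some("2700"), 2700).unwrap();
    assert!(b.remaining() <= Duration::from_secs(2700));
    assert!(SessionBudget::from_env(None, 2700).is_none());
    // Unparseable value falls back to the default instead of disabling:
    let garbled = SessionBudget::from_env(Some("nope"), 2700).unwrap();
    assert_eq!(garbled.total, Duration::from_secs(2700));
    println!("ok");
}
```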

Error handling

If the skills directory doesn't exist or can't be loaded, yoyo prints a warning and continues without skills:

warning: Failed to load skills: ...

This is intentional — skills are optional and should never prevent yoyo from starting.

Permissions & Safety

yoyo asks for confirmation before running tools that modify your system. This page covers how to control that behavior — from interactive prompts to fine-grained allow/deny rules.

Interactive Permission Prompts

By default, yoyo prompts you before executing any potentially dangerous tool:

  • bash — every shell command asks for [y/N] confirmation
  • write_file — creating or overwriting files asks for approval
  • edit_file — modifying existing files asks for approval
  • rename_symbol — cross-file symbol renaming asks for approval

Read-only tools (read_file, list_files, search) and the ask_user tool run without prompting.

When a tool needs approval, you'll see something like:

⚡ bash: git status
  Allow? [y/N]

Type y to approve, or n (or just press Enter) to deny.

Auto-Approve Everything: --yes / -y

If you trust the agent fully (e.g., in a sandboxed environment or CI pipeline), skip all prompts:

yoyo -y -p "refactor the auth module"

This auto-approves every tool call — bash commands, file writes, everything.

⚠️ Use with caution. This gives yoyo unrestricted access to your shell and filesystem.

Command Filtering: --allow and --deny

For finer control over which bash commands run automatically, use glob patterns:

yoyo --allow "git *" --allow "cargo *" --deny "rm -rf *"

How it works

  1. Deny is checked first. If a command matches any --deny pattern, it's rejected immediately — the agent sees an error message and must try something else.
  2. Allow is checked second. If a command matches any --allow pattern, it runs without prompting.
  3. No match = prompt. Commands that don't match either list get the normal [y/N] prompt.

Patterns use simple glob matching where * matches any sequence of characters (including empty):

Pattern             Matches                              Doesn't match
git *               git status, git commit -m "hello"    echo git, gitignore
*.rs                main.rs, src/main.rs                 main.py
cargo * --release   cargo build --release                cargo build --debug
rm -rf *            rm -rf /, rm -rf /tmp                rm file.txt
*                   everything                           (nothing)

Both --allow and --deny are repeatable — pass them multiple times to build up your pattern lists.
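
The `*` semantics and the deny-first ordering can be sketched together. This is an illustrative sketch of the idea, not yoyo's actual matcher:

```rust
// Sketch of the permission check: `*` matches any (possibly empty) sequence
// of characters; deny patterns are checked before allow patterns.
// Illustrative only — not yoyo's actual implementation.
fn glob_match(pattern: &str, text: &str) -> bool {
    match pattern.split_once('*') {
        // No wildcard left: the remainder must match exactly.
        None => pattern == text,
        Some((prefix, rest)) => {
            if !text.starts_with(prefix) { return false; }
            let tail = &text[prefix.len()..];
            // Let `*` absorb every possible amount of text, then recurse.
            (0..=tail.len()).any(|i| tail.is_char_boundary(i) && glob_match(rest, &tail[i..]))
        }
    }
}

#[derive(Debug, PartialEq)]
enum Decision { Deny, Allow, Prompt }

fn check(cmd: &str, allow: &[&str], deny: &[&str]) -> Decision {
    if deny.iter().any(|p| glob_match(p, cmd)) { Decision::Deny }       // deny first
    else if allow.iter().any(|p| glob_match(p, cmd)) { Decision::Allow } // then allow
    else { Decision::Prompt }                                            // else: [y/N] prompt
}

fn main() {
    assert!(glob_match("git *", "git status"));
    assert!(!glob_match("git *", "echo git"));
    // Matches both allow (*) and deny (rm -rf *): deny wins.
    assert_eq!(check("rm -rf /tmp", &["*"], &["rm -rf *"]), Decision::Deny);
    println!("ok");
}
```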

Deny overrides allow

If both an allow and deny pattern match the same command, deny wins:

# This allows all commands EXCEPT rm -rf
yoyo --allow "*" --deny "rm -rf *"

The command rm -rf /tmp matches * (allow) and rm -rf * (deny) — deny takes priority, so it's blocked.

Directory Restrictions: --allow-dir and --deny-dir

Restrict which directories yoyo's file tools can access:

yoyo --allow-dir ./src --allow-dir ./tests --deny-dir ~/.ssh

This affects read_file, write_file, edit_file, list_files, and search.

Rules

  • If --allow-dir is set, only paths under allowed directories are accessible. Everything else is blocked.
  • If --deny-dir is set, paths under denied directories are blocked.
  • Deny overrides allow — if a path is under both an allowed and a denied directory, it's blocked.
  • Paths are resolved to absolute paths before checking, so ../ traversal escapes are caught.
  • Symlinks are resolved via canonicalize when the path exists.
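
These rules combine into a simple check on resolved paths. A sketch, assuming the paths have already been made absolute and canonical (the real implementation does that resolution first):

```rust
use std::path::Path;

// Sketch of the directory permission logic: deny always wins, and when an
// allow list exists, the path must fall under one of its entries.
// Paths are assumed already absolute/canonical. Illustrative only.
fn path_allowed(path: &Path, allow: &[&Path], deny: &[&Path]) -> bool {
    if deny.iter().any(|d| path.starts_with(d)) {
        return false; // deny overrides allow
    }
    if allow.is_empty() {
        return true; // no allow list: everything not denied is accessible
    }
    allow.iter().any(|a| path.starts_with(a))
}

fn main() {
    let allow = [Path::new("/proj/src"), Path::new("/proj/tests")];
    let deny = [Path::new("/proj/src/secret")];
    assert!(path_allowed(Path::new("/proj/src/main.rs"), &allow, &deny));
    assert!(!path_allowed(Path::new("/proj/docs/a.md"), &allow, &deny));  // not allowed
    assert!(!path_allowed(Path::new("/proj/src/secret/key"), &allow, &deny)); // denied
    println!("ok");
}
```

Note that Path::starts_with compares whole path components, so /proj/srcfoo does not count as being under /proj/src.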

Example: lock yoyo to your project

yoyo --allow-dir . --deny-dir ./.git --deny-dir ~/.ssh

This lets yoyo read and write anywhere in the current project, but blocks access to .git internals and your SSH keys.

Config File

Instead of passing flags every time, put your permission rules in .yoyo.toml (project-level), ~/.yoyo.toml (home directory), or ~/.config/yoyo/config.toml (XDG):

[permissions]
allow = ["git *", "cargo *", "echo *"]
deny = ["rm -rf *", "sudo *"]

[directories]
allow = ["./src", "./tests"]
deny = ["~/.ssh", "/etc"]

Precedence

CLI flags override config file values:

  • If you pass any --allow or --deny flag, the entire [permissions] section from the config file is ignored.
  • If you pass any --allow-dir or --deny-dir flag, the entire [directories] section from the config file is ignored.
  • --yes / -y overrides everything — all tools are auto-approved regardless of permission patterns.

Config file search order (first found wins):

  1. .yoyo.toml in the current directory
  2. ~/.yoyo.toml in your home directory
  3. ~/.config/yoyo/config.toml

Practical Examples

Rust development — approve common tools

yoyo --allow "git *" --allow "cargo *" --allow "cat *" --allow "ls *"

Or in .yoyo.toml:

[permissions]
allow = ["git *", "cargo *", "cat *", "ls *", "echo *"]
deny = ["rm -rf *", "sudo *"]

Sandboxed CI — trust everything

yoyo -y -p "run the test suite and fix any failures"

Paranoid mode — restrict to source files only

yoyo --allow-dir ./src --allow-dir ./tests --deny "rm *" --deny "sudo *"

Read-only exploration

yoyo --deny "*" --allow "cat *" --allow "ls *" --allow "grep *" --allow-dir .

This denies all bash commands except read-only ones, and restricts file access to the current directory.

Built-in Command Safety Analysis

Beyond pattern matching, yoyo has a built-in safety analyzer that detects categories of dangerous commands and provides specific warnings. This runs automatically — you don't need to configure it.

Detected patterns include:

Category                 Examples
Filesystem destruction   rm -rf /, rm -rf ~
Force git operations     git push --force, git reset --hard
Permission changes       chmod -R 777, chown -R on system dirs
File overwrites          > /etc/passwd, > ~/.bashrc
System commands          shutdown, reboot, halt
Database destruction     DROP TABLE, DROP DATABASE, TRUNCATE TABLE
Pipe from internet       curl ... | bash, wget ... | sh
Process killing          kill -9 1, killall
Disk operations          dd if=, fdisk, parted, mkfs

When a dangerous pattern is detected, yoyo shows a warning explaining why the command is flagged before asking for confirmation. A handful of truly catastrophic patterns (like rm -rf / or fork bombs) are hard-blocked and can never execute, even with --yes.

Safe commands like ls, cargo test, git status, and grep pass through without triggering any warnings.
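
The behavior described above is a two-tier check: a small hard-block list that can never run, and warning categories that trigger a confirmation prompt. A minimal sketch of that shape — the patterns and function names here are illustrative, not yoyo's actual rule set:

```rust
// Hypothetical hard-block list: truly catastrophic commands that never
// execute, even with --yes.
fn hard_blocked(cmd: &str) -> bool {
    let cmd = cmd.trim();
    cmd == "rm -rf /" || cmd.contains(":(){ :|:& };:") // classic fork bomb
}

// Hypothetical warning categories: a match produces an explanation shown
// before the [y/N] confirmation; no match means no warning.
fn warning_for(cmd: &str) -> Option<&'static str> {
    const RULES: [(&str, &str); 4] = [
        ("git push --force", "force git operation"),
        ("chmod -R 777", "recursive permission change"),
        ("DROP TABLE", "database destruction"),
        ("| bash", "piping remote content into a shell"),
    ];
    RULES.iter().find(|r| cmd.contains(r.0)).map(|r| r.1)
}
```
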

Summary

Mechanism                 Scope                Effect
Default prompts           All modifying tools  Ask [y/N] before each call
--yes / -y                Everything           Auto-approve all tools
--allow <pattern>         Bash commands        Auto-approve matching commands
--deny <pattern>          Bash commands        Auto-reject matching commands
--allow-dir <dir>         File tools           Only allow paths under these dirs
--deny-dir <dir>          File tools           Block paths under these dirs
[permissions] in config   Bash commands        Same as --allow/--deny
[directories] in config   File tools           Same as --allow-dir/--deny-dir

Tip: Use /permissions during a session to see the full security posture — auto-approve status, command patterns, and directory restrictions all in one view.

Session Persistence

yoyo can save and load conversations, letting you resume where you left off.

Auto-save on exit

yoyo automatically saves your conversation to .yoyo/last-session.json every time you exit the REPL — whether via /quit, /exit, Ctrl-D, or even unexpected termination. No flags needed.

If a previous session is detected on startup, yoyo prints a hint:

  💡 Previous session found. Use --continue or /load .yoyo/last-session.json to resume.

Resuming with --continue

The --continue (or -c) flag restores the last auto-saved session:

yoyo --continue
yoyo -c

When --continue is used:

  1. On startup, yoyo loads from .yoyo/last-session.json (preferred) or yoyo-session.json (legacy fallback)
  2. On exit, the conversation is auto-saved as usual

$ yoyo -c
  resumed session: 8 messages from .yoyo/last-session.json

main > what were we working on?

Manual save/load

Save the current conversation:

/save

This writes to yoyo-session.json in the current directory.

Save to a custom path:

/save my-session.json

Load a conversation:

/load
/load my-session.json
/load .yoyo/last-session.json

Session format

Sessions are stored as JSON files containing the conversation message history. The format is determined by the yoagent library.

Error handling

  • If no previous session exists when using --continue, yoyo prints a message and starts fresh
  • If a session file is corrupt or can't be parsed, yoyo warns you and starts fresh
  • Empty conversations (no messages exchanged) are not auto-saved
  • Save errors are reported but don't crash yoyo

Context Management

Claude models have a finite context window (200,000 tokens). As your conversation grows, it fills up. yoyo helps you manage this.

Checking context usage

Use /tokens to see how full your context window is:

/tokens

Output:

  Active context:
    messages:    24
    current:     85.2k / 200.0k tokens
    ████████░░░░░░░░░░░░ 43%

  Session totals (all API calls):
    input:       120.5k tokens
    output:      45.2k tokens
    cache read:  30.0k tokens
    cache write: 15.0k tokens
    est. cost:   $0.892

When context usage exceeds 75% of the window, you'll see a warning:

    ⚠ Context is getting full. Consider /clear or /compact.
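
The bar and percentage in the sample output can be reconstructed with a few lines. This is a sketch of the arithmetic, not yoyo's actual context_bar implementation:

```rust
// Render a 20-cell usage bar like the /tokens output above:
// filled cells truncate, the percentage rounds.
fn context_bar(used: u64, limit: u64, width: usize) -> String {
    let ratio = used as f64 / limit as f64;
    let filled = ((ratio * width as f64) as usize).min(width);
    let pct = (ratio * 100.0).round() as u32;
    format!("{}{} {}%", "█".repeat(filled), "░".repeat(width - filled), pct)
}

// The 75% warning threshold described above.
fn needs_warning(used: u64, limit: u64) -> bool {
    used as f64 / limit as f64 > 0.75
}
```

For the sample numbers above, 85.2k of 200k tokens fills 8 of 20 cells and rounds to 43%.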

Manual compaction

Use /compact to compress the conversation:

/compact

This summarizes older messages while preserving recent context. You'll see:

  compacted: 24 → 8 messages, ~85.2k → ~32.1k tokens

Auto-compaction

When context usage exceeds 80% of the window's capacity, yoyo automatically compacts the conversation. You'll see:

  ⚡ auto-compacted: 30 → 10 messages, ~165.0k → ~62.0k tokens

This happens transparently after each prompt response. You don't need to do anything — yoyo handles it.

Clearing the conversation

If you want to start completely fresh:

/clear

This removes all messages and resets the conversation. Unlike /compact, nothing is preserved.

Tips

  • For long sessions, use /tokens periodically to monitor usage
  • If you notice the agent losing track of earlier context, try /compact
  • Starting a new task? Use /clear to avoid confusing the agent with unrelated history

Checkpoint-restart strategy

For automated pipelines (like CI scripts), compaction can be lossy. The --context-strategy checkpoint flag provides an alternative: when context usage exceeds 70%, yoyo stops the agent loop and exits with code 2.

yoyo --context-strategy checkpoint -p "do some long task"
# Exit code 2 means "context was getting full — restart me"

The calling script can then restart yoyo with fresh context. This is useful for multi-phase pipelines where a structured restart produces better results than lossy compaction.

The default strategy is compaction, which uses auto-compaction as described above.
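
The supervising side of the checkpoint strategy is a simple loop: rerun while the exit code is 2, stop on anything else. A sketch of that loop — the real yoyo invocation is replaced by a stub closure here so the logic is testable, and the max_restarts guard is an added safety valve, not a documented yoyo feature:

```rust
// Restart the agent while it reports "context was getting full" (exit
// code 2); return the first other exit code, or give up after
// max_restarts consecutive checkpoints.
fn supervise(mut run_yoyo: impl FnMut() -> i32, max_restarts: u32) -> i32 {
    let mut restarts = 0;
    loop {
        let code = run_yoyo();
        if code != 2 || restarts >= max_restarts {
            return code;
        }
        restarts += 1; // context was full — run again with fresh context
    }
}
```
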

Git Integration

yoyo is git-aware. It shows your current branch and provides commands for common git operations.

Branch display

When you're in a git repository, the REPL prompt shows the current branch:

main > _
feature/new-parser > _

On startup, the branch is also shown in the status information:

  git:   main

Git commands

/diff

Show a summary of uncommitted changes (equivalent to git diff --stat):

/diff

Output:

 src/main.rs | 15 +++++++++------
 README.md   |  3 +++
 2 files changed, 12 insertions(+), 6 deletions(-)

If there are no uncommitted changes:

  (no uncommitted changes)

/git diff

Show the actual diff content (line-by-line changes), not just a summary:

/git diff

Shows unstaged changes. To see staged changes instead:

/git diff --cached

/git branch

List all branches, with the current branch highlighted in green:

/git branch

Create and switch to a new branch:

/git branch feature/my-new-feature

/blame

Show who last modified each line of a file, with colorized output:

/blame src/main.rs

Limit to a specific line range:

/blame src/main.rs:10-20

Output is colorized: commit hashes (dim), author names (cyan), dates (dim), line numbers (yellow).

/undo

Revert all uncommitted changes. This is equivalent to git checkout -- .:

/undo

Before reverting, /undo shows you what will be undone:

 src/main.rs | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)
  ✓ reverted all uncommitted changes

If there's nothing to undo:

  (nothing to undo — no uncommitted changes)

Using git through the agent

yoyo's bash tool can run any git command. You can ask the agent directly:

> commit these changes with message "fix: handle empty input"
> show me the last 5 commits
> create a new branch called feature/parser

The agent has full access to git through its shell tool.

Cost Tracking

yoyo estimates the cost of each interaction so you can monitor spending.

Per-turn costs

After each response, you'll see a compact token summary:

  ↳ 3.2s · 1523→842 tokens · $0.0234

With --verbose (or -v), you get the full breakdown:

  tokens: 1523 in / 842 out  [cache: 1000 read, 500 write]  (session: 4200 in / 2100 out)  cost: $0.0234  total: $0.0567  ⏱ 3.2s
  • cost — estimated cost for this turn
  • total — estimated cumulative cost for the session

Quick cost check

Use /cost for a quick overview with a breakdown by cost category:

  Session cost: $0.0567
    4.2k in / 2.1k out
    cache: 1.0k read / 500 write

    Breakdown:
      input:       $0.0126
      output:      $0.0315
      cache write: $0.0031
      cache read:  $0.0005

Detailed breakdown

Use /tokens to see a full breakdown including cache usage:

  Session totals:
    input:       120.5k tokens
    output:      45.2k tokens
    cache read:  30.0k tokens
    cache write: 15.0k tokens
    est. cost:   $0.892

Supported models

Costs are estimated based on published pricing for all major providers:

Anthropic

Model          Input        Cache Write   Cache Read   Output
Opus 4.5/4.6   $5/MTok      $6.25/MTok    $0.50/MTok   $25/MTok
Opus 4/4.1     $15/MTok     $18.75/MTok   $1.50/MTok   $75/MTok
Sonnet         $3/MTok      $3.75/MTok    $0.30/MTok   $15/MTok
Haiku 4.5      $1/MTok      $1.25/MTok    $0.10/MTok   $5/MTok
Haiku 3.5      $0.80/MTok   $1/MTok       $0.08/MTok   $4/MTok

OpenAI

Model          Input         Output
GPT-4.1        $2/MTok       $8/MTok
GPT-4.1 Mini   $0.40/MTok    $1.60/MTok
GPT-4.1 Nano   $0.10/MTok    $0.40/MTok
GPT-4o         $2.50/MTok    $10/MTok
GPT-4o Mini    $0.15/MTok    $0.60/MTok
o3             $2/MTok       $8/MTok
o3-mini        $1.10/MTok    $4.40/MTok
o4-mini        $1.10/MTok    $4.40/MTok

Google

Model              Input        Output
Gemini 2.5 Pro     $1.25/MTok   $10/MTok
Gemini 2.5 Flash   $0.15/MTok   $0.60/MTok
Gemini 2.0 Flash   $0.10/MTok   $0.40/MTok

DeepSeek

Model                  Input        Output
DeepSeek Chat/V3       $0.27/MTok   $1.10/MTok
DeepSeek Reasoner/R1   $0.55/MTok   $2.19/MTok

Mistral

Model           Input        Output
Mistral Large   $2/MTok      $6/MTok
Mistral Small   $0.10/MTok   $0.30/MTok
Codestral       $0.30/MTok   $0.90/MTok

xAI (Grok)

Model         Input        Output
Grok 3        $3/MTok      $15/MTok
Grok 3 Mini   $0.30/MTok   $0.50/MTok
Grok 2        $2/MTok      $10/MTok

Groq (hosted models)

Model           Input        Output
Llama 3.3 70B   $0.59/MTok   $0.79/MTok
Llama 3.1 8B    $0.05/MTok   $0.08/MTok
Mixtral 8x7B    $0.24/MTok   $0.24/MTok
Gemma2 9B       $0.20/MTok   $0.20/MTok

MTok = million tokens.
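
A per-turn estimate is just tokens divided by a million, multiplied by each rate. As a worked sketch using the Anthropic Sonnet rates from the table above ($3 in, $15 out, $3.75 cache write, $0.30 cache read per MTok) — illustrative only, since the real tables cover every provider:

```rust
// Estimated cost in USD for one turn at Sonnet pricing.
fn sonnet_turn_cost_usd(input: u64, output: u64, cache_write: u64, cache_read: u64) -> f64 {
    const MTOK: f64 = 1_000_000.0;
    input as f64 / MTOK * 3.0
        + output as f64 / MTOK * 15.0
        + cache_write as f64 / MTOK * 3.75
        + cache_read as f64 / MTOK * 0.30
}
```

For example, a 1523-in / 842-out turn with no cache traffic works out to about $0.017.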

OpenRouter

Models accessed through OpenRouter (e.g., anthropic/claude-sonnet-4-20250514) are automatically recognized — the provider prefix is stripped before matching.

Limitations

  • Cost estimates are approximate — actual billing may differ slightly
  • For unrecognized models, no cost estimate is shown
  • Cache read/write costs only apply to Anthropic models; other providers show zero cache costs
  • Pricing may change — check your provider's pricing page for the latest rates

Keeping costs down

  • Use smaller models (Haiku, Sonnet, GPT-4.1 Mini, Gemini Flash) for simple tasks
  • Use /compact to reduce context size (fewer input tokens per turn)
  • Use single-prompt mode (-p) for quick questions to avoid accumulating context
  • Turn off extended thinking for routine tasks

Architecture

This page explains the reasoning behind yoyo's internal design — why the codebase is shaped the way it is, what trade-offs were made, and what invariants contributors should understand before changing things. For a machine-generated dependency graph, see DeepWiki.

Why 13 modules instead of 3?

yoyo started as a single 200-line file. By Day 10 it was a single 3,400-line main.rs. That file was split over Days 10–15 into the current structure, not because someone sat down and designed thirteen modules, but because the code kept telling us where the seams were.

The split follows a simple heuristic: if two chunks of code change for different reasons, they belong in different files. Adding a new /git subcommand shouldn't force you to scroll past the markdown renderer. Fixing a cost-calculation bug shouldn't put you in the same file as the CLI argument parser.

The current modules, from smallest to largest:

Module                 Lines    Role
memory.rs              ~375     Project-specific .yoyo/memory.json persistence
docs.rs                ~550     Fetching and parsing docs.rs HTML
help.rs                ~840     Per-command help text and /help handler
git.rs                 ~1,080   Low-level git operations (branch, commit, diff)
commands_git.rs        ~1,130   /commit, /diff, /undo, /pr, /review handlers
repl.rs                ~1,270   Readline loop, tab completion, multi-line input
commands_session.rs    ~1,340   /save, /load, /export, /spawn, /mark, /jump
main.rs                ~1,560   Entry point, agent construction, tool wiring
prompt.rs              ~1,870   Agent execution, streaming event loop, retry logic
cli.rs                 ~2,520   Argument parsing, config files, provider selection
commands.rs            ~2,910   Core command dispatch, re-exports sub-modules
commands_project.rs    ~3,660   /add, /fix, /test, /lint, /tree, /find, /web, /plan
format.rs              ~4,700   Colors, markdown rendering, cost calc, spinner, diffs

Thirteen modules is a lot for ~24k lines. The alternative — three or four large files — would be easier to navigate in a directory listing but harder to work in. When a module is under 1,500 lines, you can hold its entire API in your head. When it's 4,700 lines (like format.rs), you start wanting to split it further — and that's a fair instinct, discussed below.

The layered design and why it matters

The modules form five rough layers, and the key invariant is: dependencies only point downward.

  ┌─────────────────────────────────────────────────┐
  │  Entry          main.rs                         │
  ├─────────────────────────────────────────────────┤
  │  REPL           repl.rs                         │
  ├─────────────────────────────────────────────────┤
  │  Commands       commands.rs                     │
  │                 commands_git.rs                 │
  │                 commands_project.rs             │
  │                 commands_session.rs             │
  │                 help.rs                         │
  ├─────────────────────────────────────────────────┤
  │  Engine         prompt.rs       format.rs       │
  ├─────────────────────────────────────────────────┤
  │  Utilities      git.rs   memory.rs   docs.rs    │
  └─────────────────────────────────────────────────┘

Entry layer. main.rs parses CLI args (via cli.rs), builds the agent, wires up tools with permission checks, and hands control to either repl.rs (interactive) or prompt.rs (single-prompt / piped mode). It owns the AgentConfig struct and the build_agent() / configure_agent() functions. It also defines StreamingBashTool, a custom replacement for yoagent's default BashTool that reads subprocess stdout/stderr line-by-line via tokio::io::AsyncBufReadExt and emits periodic ToolExecutionUpdate events through the on_update callback. This means when a user runs cargo build or npm install, partial output appears in real-time instead of after the command finishes. The reasoning: agent construction is complex (provider selection, tool wiring, MCP/OpenAPI setup, permission configuration) and shouldn't be tangled with either the REPL loop or command handlers.

REPL layer. repl.rs owns the readline loop, tab completion, multi-line input detection, and the big match block that dispatches / commands. It depends on nearly everything below it because it's the traffic cop — but nothing depends on it. This is intentional: piped mode and single-prompt mode bypass the REPL entirely and go straight to prompt.rs.

Command layer. commands.rs is the hub — it re-exports handlers from three sub-modules (commands_git.rs, commands_project.rs, commands_session.rs) and help.rs. The sub-module split follows domain, not size: git-workflow commands in one file, project-workflow commands in another, session-management commands in a third. This means adding a new /git stash pop subcommand only touches commands_git.rs, even though commands_project.rs is three times larger. The split is by reason-to-change, not by line count.

Engine layer. prompt.rs and format.rs are the two largest modules by complexity. prompt.rs runs the agent, processes the streaming event channel, handles retries on transient errors, and manages context overflow (auto-compaction). format.rs handles everything the user sees: ANSI colors, the incremental MarkdownRenderer, cost calculations for seven providers, the terminal spinner, diff formatting, and dozens of small display utilities. These two modules sit at the same layer because they collaborate tightly — prompt.rs feeds events to format.rs's renderer — but neither depends on commands or the REPL.

Utility layer. git.rs, memory.rs, and docs.rs are leaf modules with no upward dependencies. They wrap external systems (git CLI, filesystem JSON, docs.rs HTTP) behind clean Rust APIs. Any module above can call into them, but they never call up. This makes them easy to test in isolation — and they are: git.rs has 41 tests, memory.rs has 14, docs.rs has 23.

The layering isn't enforced by the compiler — Rust's module system doesn't prevent circular use crate:: imports at the module level. It's enforced by convention and by the fact that violations immediately feel wrong: if git.rs needed to call a command handler, that would be a sign the abstraction is leaking.

Why format.rs is the largest file

At ~4,700 lines with 256 tests, format.rs is twice the size of any other module. This isn't accidental — it's the consequence of a design choice: all terminal presentation logic lives in one place.

The module contains:

  • Color system — the Color wrapper that respects NO_COLOR, all ANSI color constants
  • MarkdownRenderer — incremental streaming renderer that turns text deltas into ANSI-colored output with syntax highlighting, handling code blocks, headers, bold/italic, lists, and inline code as tokens arrive
  • Cost calculations — pricing tables for seven providers, input/output/cache cost breakdowns
  • Spinner — background activity indicator for API roundtrips
  • Display utilities — pluralize, truncate, context_bar, format_duration, format_token_count, format_edit_diff, format_tool_summary, and more

The alternative would be splitting into color.rs, renderer.rs, cost.rs, etc. That's probably the right move eventually. But today, having all presentation in one file has a benefit: when you change how something looks, you only need to look in one place. The MarkdownRenderer uses the color system, cost formatting uses the color system, the spinner uses the color system — they're coupled by the shared presentation layer, and co-location makes that coupling visible rather than hiding it across five small files.

The 256 tests are the reason this works at ~4,700 lines. Every public function has test coverage. The MarkdownRenderer alone has tests for every markdown construct it handles. If those tests didn't exist, the file would be unmaintainable at this size.

Why cli.rs is so large

cli.rs (~2,520 lines) handles three jobs that sound simple but aren't:

  1. Argument parsing — yoyo doesn't use clap or structopt. Arguments are parsed by hand from std::env::args. This was a deliberate choice: the CLI has unusual needs (multi-value --mcp flags, --provider with fallback chains, config file merging) that are easier to handle with custom parsing than with a framework's escape hatches. The trade-off is more code in cli.rs, but zero macro magic and full control over error messages.

  2. Config file merging — .yoyo.toml and YOYO.md settings merge with CLI flags and environment variables, with a clear precedence chain. This merging logic accounts for hundreds of lines.

  3. Provider configuration — selecting the right API key, endpoint, and default model for each of eight providers, including fallback behavior when keys aren't set.

The 92 tests in cli.rs verify the parsing of every flag and every merge scenario. Adding a new CLI flag means adding it in one place and adding a test.

The command dispatch pattern

Every /command follows the same pattern:

  1. User types /foo bar baz in the REPL
  2. repl.rs matches on "/foo" and calls commands::handle_foo(args, agent, ...)
  3. The handler does its work, possibly calling into utility modules
  4. If it needs the LLM, it calls prompt::run_prompt() with a constructed input

This pattern is enforced by convention, not by a trait. Early versions tried a Command trait with execute(), but it added ceremony without value — every command has different arguments, different return types, and different needs (some need the agent, some don't, some are async, some aren't). A simple function per command turned out to be the right abstraction level.

The commands.rs hub re-exports all handlers so the REPL only needs use crate::commands::*. The sub-modules (commands_git, commands_project, commands_session) group by domain. When you run /commit, the REPL calls handle_commit(), which is defined in commands_git.rs and re-exported through commands.rs.
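
The dispatch pattern reduces to a match on the command word that calls a plain handler function. A simplified sketch — the handler names follow the convention described above, but the signatures here are invented for illustration:

```rust
// Hypothetical handlers: one standalone function per command.
fn handle_commit(args: &str) -> String { format!("commit: {args}") }
fn handle_diff(_args: &str) -> String { "diff".to_string() }

// The REPL's dispatch: split off the command word, match, call.
fn dispatch(line: &str) -> Option<String> {
    let (cmd, args) = line.split_once(' ').unwrap_or((line, ""));
    match cmd {
        "/commit" => Some(handle_commit(args)),
        "/diff" => Some(handle_diff(args)),
        _ => None, // unknown command — the real REPL prints an error
    }
}
```
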

Why prompt.rs handles retries internally

prompt.rs encapsulates the entire agent interaction lifecycle: sending the prompt, receiving streaming events, rendering output, and handling errors. Retry logic lives here — not in the REPL or in main.rs — because retries need access to the event stream state.

Three kinds of retries happen:

  • Tool failures — if a tool execution fails, the error is sent back to the LLM as context and it retries (up to 2 times). This happens inside the agent's own loop.
  • Transient API errors (429, 5xx) — retried with exponential backoff. The REPL doesn't need to know this happened.
  • Context overflow — when the conversation exceeds the context window, prompt.rs triggers auto-compaction (asking the LLM to summarize the conversation so far) and retries with the compressed context.

Keeping this in prompt.rs means the REPL's contract is simple: call run_prompt(), get back a PromptOutcome with the response text, token usage, and any unrecoverable errors. The REPL never has to think about retries, backoff, or context management.
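
The transient-error case is the classic retry-with-exponential-backoff loop. A self-contained sketch of the shape (the API call and sleep are stub closures; the starting delay and doubling are illustrative, not yoyo's actual tuning):

```rust
// Retry a fallible call up to max_attempts times, doubling the delay
// between attempts. The final error is returned if all attempts fail.
fn with_retries<T, E>(
    mut call: impl FnMut() -> Result<T, E>,
    max_attempts: u32,
    mut sleep_ms: impl FnMut(u64),
) -> Result<T, E> {
    let mut delay = 500; // ms — illustrative starting backoff
    let mut attempt = 1;
    loop {
        match call() {
            Ok(v) => return Ok(v),
            Err(e) if attempt >= max_attempts => return Err(e),
            Err(_) => {
                sleep_ms(delay);
                delay *= 2; // exponential backoff
                attempt += 1;
            }
        }
    }
}
```
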

The streaming renderer design

yoyo streams LLM output token-by-token. The MarkdownRenderer in format.rs is an incremental state machine that receives text deltas (often partial words or half a markdown construct) and emits ANSI-colored output. This is architecturally significant because:

  • It can't buffer entire lines. If it did, the output would appear in chunks instead of flowing. An early version had this bug — it was technically correct but felt broken. (Day 17 fix.)
  • It must track state across deltas. When one delta contains ``` and the next delta contains rs, the renderer must know it's inside a code block header. The state machine tracks: are we in a code block? What language? Are we in bold? Italic? A header? A list item?
  • It must handle malformed markdown gracefully. LLMs sometimes emit unclosed code blocks, nested formatting that doesn't resolve, or markdown-like syntax that isn't actually markdown. The renderer must produce reasonable output regardless.

The alternative — buffering the entire response and rendering it at the end — would be simpler but would make the tool feel unresponsive. Streaming is a UX requirement that imposes real architectural cost.
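
The core idea — state that survives across deltas — can be shown in miniature. This sketch tracks only one bit of the real renderer's state (inside a code fence or not), so a fence split across two deltas is still recognized; the actual MarkdownRenderer tracks far more and emits ANSI colors:

```rust
// Minimal incremental state machine: count consecutive backticks across
// delta boundaries; three in a row toggles code-block mode.
#[derive(Default)]
struct FenceTracker {
    run: u32,      // consecutive backticks seen so far, possibly spanning deltas
    in_code: bool, // are we currently inside a code block?
}

impl FenceTracker {
    fn feed(&mut self, delta: &str) {
        for ch in delta.chars() {
            if ch == '`' {
                self.run += 1;
                if self.run == 3 {
                    self.in_code = !self.in_code; // fence complete — toggle
                    self.run = 0;
                }
            } else {
                self.run = 0;
            }
        }
    }
}
```

A buffering renderer gets this for free; a streaming one has to carry the `run` counter between calls, which is exactly the architectural cost the text describes.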

Invariants contributors should know

No upward dependencies from utilities. git.rs, memory.rs, and docs.rs must never use crate::commands or use crate::repl. If you find yourself wanting to, the abstraction boundary is wrong.

format.rs is the only module that writes ANSI escape codes. Other modules call format::Color, format::DIM, etc. — they don't hardcode escape sequences. This is enforced by convention and makes NO_COLOR support work globally.

Every command handler is a standalone function. No command state persists between invocations (except through the Agent's conversation history and SessionChanges). This makes commands testable in isolation.

Tests live next to the code they test. Each module has a #[cfg(test)] mod tests block at the bottom. The project has ~1,000 tests total. Integration tests live in tests/integration.rs and test the CLI binary as a black box.

The agent is the only LLM dependency. yoyo delegates all LLM interaction to the yoagent crate. prompt.rs receives AgentEvents through a channel — it never constructs HTTP requests or parses API responses directly. This means swapping the LLM backend (or the entire agent framework) would only require changes to main.rs (construction) and prompt.rs (event handling).

Trade-offs and known debt

format.rs should probably be split. The MarkdownRenderer, cost tables, and color utilities are three distinct concerns sharing a file. The blocker isn't technical — it's that all three are coupled through the color system, and splitting would require deciding where Color lives.

Hand-rolled CLI parsing is a maintenance burden. Every new flag requires manual parsing code, help text updates, and config file support. A framework like clap would reduce this at the cost of a dependency and less control over error messages. The current approach works because flags don't change often.

commands.rs as a hub creates a wide dependency surface. Because it re-exports everything, changing any command sub-module can trigger recompilation of anything that imports commands::*. In a larger project this would matter for build times. At ~24k lines, it doesn't yet.

No trait abstraction for commands. This is fine at the current scale but means there's no compile-time guarantee that all commands follow the same pattern. A new contributor might put command logic directly in repl.rs instead of in a handler function. Code review catches this, not the type system.

Grow Your Own Agent

Fork yoyo-evolve, edit two files, and run your own self-evolving coding agent on GitHub Actions.

What You Get

A coding agent that:

  • Runs on GitHub Actions every ~8 hours
  • Reads its own source code, picks improvements, implements them
  • Writes a journal of its evolution
  • Responds to community issues in its own voice
  • Gets smarter over time through a persistent memory system

Quick Start

1. Fork the repo

Fork yologdev/yoyo-evolve on GitHub.

2. Edit your agent's identity

IDENTITY.md — your agent's constitution: name, mission, goals, and rules.

PERSONALITY.md — your agent's voice: how it writes, speaks, and expresses itself.

These are the only files you need to edit. Everything else auto-detects.

3. Choose your provider

yoyo supports 13+ providers out of the box. Pick the one that fits your budget and preferences:

Provider     Env Var              Default Model                        Notes
anthropic    ANTHROPIC_API_KEY    claude-opus-4-6                      Default. Best overall quality.
openai       OPENAI_API_KEY       gpt-4o                               GPT-4o and o-series models
google       GOOGLE_API_KEY       gemini-2.0-flash                     Gemini models
openrouter   OPENROUTER_API_KEY   anthropic/claude-sonnet-4-20250514   Multi-provider gateway
deepseek     DEEPSEEK_API_KEY     deepseek-chat                        Very cost-effective
groq         GROQ_API_KEY         llama-3.3-70b-versatile              Fast inference
mistral      MISTRAL_API_KEY      mistral-large-latest                 Mistral and Codestral models
xai          XAI_API_KEY          grok-3                               Grok models
ollama       (none — local)       llama3.2                             Free, runs on your hardware

For the full list of providers and models, see Models & Providers.

Tip: Anthropic is the default and what yoyo itself uses to evolve. If you're unsure, start there. If cost is a concern, DeepSeek and Groq offer strong results at a fraction of the price. Ollama is free but requires local hardware.

4. Create a GitHub App

Your agent needs a GitHub App to commit code and interact with issues.

  1. Go to Settings > Developer settings > GitHub Apps > New GitHub App
  2. Give it your agent's name
  3. Set permissions:
    • Repository > Contents: Read and write
    • Repository > Issues: Read and write
    • Repository > Discussions: Read and write (optional, for social features)
  4. Install it on your forked repo
  5. Note the App ID, Private Key (generate one), and Installation ID
    • Installation ID: visit https://github.com/settings/installations and click your app — the ID is in the URL

5. Set repo secrets

In your fork, go to Settings > Secrets and variables > Actions and add:

Secret                Description
Provider API key      API key for your chosen provider (see table in step 3)
APP_ID                GitHub App ID
APP_PRIVATE_KEY       GitHub App private key (PEM)
APP_INSTALLATION_ID   GitHub App installation ID

Set the API key secret matching your chosen provider. For example, if using Anthropic, add ANTHROPIC_API_KEY. If using OpenAI, add OPENAI_API_KEY. If using DeepSeek, add DEEPSEEK_API_KEY, and so on.

6. Enable the Evolution workflow

Go to Actions in your fork and enable the Evolution workflow. Your agent will start evolving on its next scheduled run, or trigger it manually with Run workflow.

What Each File Does

File                  Purpose
IDENTITY.md           Agent's constitution — name, mission, goals, rules
PERSONALITY.md        Agent's voice — writing style, personality traits
ECONOMICS.md          What money/sponsorship means to the agent
journals/JOURNAL.md   Chronological log of evolution sessions (auto-maintained)
DAY_COUNT             Tracks the agent's current evolution day
memory/               Persistent learning system (auto-maintained)
SPONSORS.md           Sponsor recognition (auto-maintained)

Costs

Costs vary by provider and model:

  • Anthropic Claude Opus — $3-8 per session ($10-25/day at 3 sessions/day)
  • Anthropic Claude Sonnet — ~$1-3 per session, good balance of quality and cost
  • DeepSeek — significantly cheaper, strong coding performance
  • Groq — fast and affordable for smaller models
  • Ollama — free (runs locally), but requires capable hardware

The default schedule runs ~3 sessions per day (8-hour gap between runs). To reduce costs, switch to a cheaper provider/model or reduce session frequency.

Customization

Change the provider and model

Set PROVIDER and MODEL environment variables in .github/workflows/evolve.yml:

env:
  PROVIDER: openai
  MODEL: gpt-4o

Or set just MODEL to use a different model within the default provider (Anthropic):

env:
  MODEL: claude-sonnet-4-6

You can also edit the default directly in scripts/evolve.sh.

Change session frequency

Edit the cron schedule in .github/workflows/evolve.yml. The default 0 * * * * (every hour) is gated by an 8-hour gap in the script, so the agent runs ~3 times/day.

Add custom skills

Create markdown files with YAML frontmatter in the skills/ directory. The agent loads them automatically via --skills ./skills.

The sponsor system auto-detects your GitHub Sponsors. No configuration needed — just set up GitHub Sponsors on your account.

The /update Command

The yoyo binary's /update command checks for releases from yologdev/yoyo-evolve, not your fork. This is expected behavior. As a fork maintainer, rebuild from source after pulling changes:

cargo build --release

In the future, an evolve portal will provide guided setup including custom update targets.

Optional: Dashboard Notifications

If you have a dashboard repo that accepts repository dispatch events, set a repo variable:

gh variable set DASHBOARD_REPO --body "your-user/your-dashboard" --repo your-user/your-fork

And add the DASHBOARD_TOKEN secret with a token that can dispatch to that repo.

Mutation Testing

yoyo uses cargo-mutants to assess test quality. Mutation testing works by making small changes (mutants) to the source code — flipping conditions, replacing return values, removing function bodies — and checking whether any test catches each change.

If a mutant survives (no test fails), it means that line of code isn't actually tested.

Baseline

As of Day 9, yoyo has 1004 total mutants across its source files. This number grows as features are added. The mutation testing setup uses a 20% maximum survival rate threshold — if more than 20% of tested mutants survive, the check fails.

Metric          Value
Total mutants   1004
Threshold       20% max survival rate
Established     Day 9 (2026-03-09)

Install cargo-mutants

cargo install cargo-mutants

Quick start with the threshold script

The easiest way to run mutation testing is with the threshold script:

# Run with default 20% threshold
./scripts/run_mutants.sh

# Run with a stricter threshold
./scripts/run_mutants.sh --threshold 10

# Just count mutants without running them
./scripts/run_mutants.sh --list

# Test mutants in a specific file only
./scripts/run_mutants.sh --file src/format.rs

The script:

  1. Runs cargo mutants on the project
  2. Counts caught vs survived mutants
  3. Calculates the survival rate
  4. Exits with code 1 if the rate exceeds the threshold
  5. Prints surviving mutants on failure so you know what to fix
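
The arithmetic behind steps 3-4 is small. A sketch of the check in Rust (the script itself is shell; this just mirrors its exit-code convention of 0 = pass, 1 = fail):

```rust
// Survival rate = survived / (caught + survived), as a percentage.
// Returns the exit code the threshold script would use.
fn threshold_exit_code(caught: u32, survived: u32, threshold_pct: f64) -> i32 {
    let tested = caught + survived;
    if tested == 0 {
        return 0; // nothing tested — nothing to fail on
    }
    let rate = survived as f64 * 100.0 / tested as f64;
    if rate > threshold_pct { 1 } else { 0 }
}
```
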

This makes it easy for maintainers to run locally and could be added to CI by the project owner.

Run mutation testing directly

From the project root:

# Run all mutants (this takes a while — several minutes)
cargo mutants

# Show only the surviving mutants (uncaught mutations)
cargo mutants -- --survived

# Run mutants for a specific file
cargo mutants -f src/format.rs

# Run mutants for a specific function
cargo mutants -F "format_cost"

Read the results

After a run, cargo-mutants creates a mutants.out/ directory with detailed results:

# Summary
cat mutants.out/caught.txt     # mutants killed by tests ✓
cat mutants.out/survived.txt   # mutants NOT caught — test gaps!
cat mutants.out/timeout.txt    # mutants that caused infinite loops
cat mutants.out/unviable.txt   # mutants that didn't compile

Focus on survived.txt — each line is a mutation that no test catches. These are the weak spots.

Configuration

The mutants.toml file in the project root excludes known-acceptable mutants:

  • Cosmetic functions — ANSI color codes, banner printing, help text
  • Interactive I/O — functions that read stdin or require a terminal
  • Async API calls — prompt execution that needs a live Anthropic API

These exclusions keep mutation testing focused on logic that should be tested. If you add a new feature with testable logic, make sure it's not excluded.

Writing targeted tests

When you find a surviving mutant:

  1. Read what the mutation does (e.g., "replace < with <= in format_cost")
  2. Write a test that specifically catches that boundary condition
  3. Re-run cargo mutants -F "function_name" to verify the mutant is now caught

Example workflow:

# Find surviving mutants
cargo mutants 2>&1 | grep "SURVIVED"

# Write a test to kill the mutant, then verify
cargo mutants -F "format_cost"
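
As a concrete (hypothetical) illustration, suppose the surviving mutant is "replace < with <= in format_cost". The function below is a stand-in with an assumed signature, not yoyo's actual code; the test pins the exact boundary so that mutant can no longer survive:

```rust
// Hypothetical stand-in for illustration; substitute your real function.
fn format_cost(cents: u32) -> String {
    if cents < 100 {
        format!("{cents}¢")
    } else {
        format!("${}.{:02}", cents / 100, cents % 100)
    }
}

#[test]
fn cost_boundary_at_one_dollar() {
    // A mutant replacing `<` with `<=` changes only the cents == 100 case,
    // so asserting that exact input kills it.
    assert_eq!(format_cost(99), "99¢");
    assert_eq!(format_cost(100), "$1.00");
}
```

The key idea: the test exercises the input that sits exactly on the boundary the mutation moves. A test that only checks 50 and 500 would let the mutant survive.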

Threshold script for CI

The scripts/run_mutants.sh script is designed to be CI-friendly:

# In a CI pipeline or pre-merge check:
./scripts/run_mutants.sh --threshold 20

# Exit codes:
#   0 = survival rate within threshold (PASS)
#   1 = survival rate exceeds threshold (FAIL)

The project owner can add this to CI workflows when ready. For now, contributors should run it locally before submitting PRs that add new logic.

When to run

Mutation testing is slow — it builds and tests your code once per mutant. Run it:

  • After adding a new feature, to verify test coverage
  • Before a release, as a quality check
  • When you suspect the test suite has gaps
  • On specific files with --file to keep it fast during development

Notes for CI integration

The scripts/run_mutants.sh script and mutants.toml config are ready for a human maintainer to wire into CI. A few things to know:

  • Git-dependent tests: Some tests (e.g. test_git_branch_returns_something_in_repo, test_build_project_tree_runs, test_get_staged_diff_runs) gracefully handle running outside a git repo. cargo-mutants copies source to a temp directory without .git/, so these tests skip git-specific assertions when not in a repo.
  • Exclusions are reasonable: The mutants.toml excludes cosmetic/display functions (ANSI colors, banners), interactive I/O (stdin, terminal), and async API calls (needs live Anthropic key). These can't be meaningfully unit-tested.
  • The script cannot be added to .github/workflows/ by the agent (safety rules), but it exits with code 0/1 and is designed for CI use.

Common Issues

"No API key found"

error: No API key found.
Set ANTHROPIC_API_KEY or API_KEY environment variable.

Fix: Set your Anthropic API key:

export ANTHROPIC_API_KEY=sk-ant-api03-...

yoyo checks ANTHROPIC_API_KEY first, then API_KEY. At least one must be set and non-empty.

"No input on stdin"

No input on stdin.

This happens when you pipe empty input to yoyo:

echo "" | yoyo

Fix: Make sure your piped input contains actual content.

Model errors

  error: [API error message]

This appears when the Anthropic API returns an error. Common causes:

  • Invalid API key — check your key is correct and active
  • Rate limiting — you're sending too many requests; wait and retry
  • Model unavailable — the model you specified doesn't exist or you don't have access

Automatic retry: yoyo automatically retries transient errors (rate limits, server errors, network issues) with exponential backoff — up to 3 retries with 1s, 2s, 4s delays. You'll see a dim message like ⚡ retrying (attempt 2/4, waiting 2s)... when this happens. Auth errors (401, 403) and invalid requests (400) are shown immediately without retrying.
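
The retry schedule reduces to a simple doubling delay; a minimal sketch (assumed shape, not yoyo's actual implementation):

```rust
use std::time::Duration;

// retry 1 -> 1s, retry 2 -> 2s, retry 3 -> 4s (then give up)
fn backoff_delay(retry: u32) -> Duration {
    Duration::from_secs(1u64 << (retry - 1))
}

fn main() {
    for retry in 1..=3 {
        println!("retry {retry}: waiting {}s", backoff_delay(retry).as_secs());
    }
}
```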

Tool error auto-recovery: When a tool execution fails during a natural-language prompt, yoyo automatically retries the prompt with error context appended (up to 2 times). This lets the agent self-correct — for example, retrying a failed file read with a corrected path. You'll see ⚡ auto-retrying after tool error... when this kicks in.

Use /retry to manually re-send the last prompt after a non-transient error is resolved.

Context window full

    ⚠ Context is getting full. Consider /clear or /compact.

Your conversation is approaching the 200,000-token context limit.

Fix: Use /compact to compress the conversation, or /clear to start fresh.

yoyo auto-compacts at 80% capacity, but you can compact earlier if you prefer.
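
The 80% trigger is just a comparison against the 200,000-token window; an illustrative check (assumed logic, not yoyo's actual code):

```rust
const CONTEXT_LIMIT: u32 = 200_000;

// Auto-compact fires once usage reaches 80% of the window (160,000 tokens).
// Integer arithmetic avoids floating-point comparison.
fn should_auto_compact(used_tokens: u32) -> bool {
    used_tokens * 10 >= CONTEXT_LIMIT * 8
}

fn main() {
    println!("{}", should_auto_compact(159_999)); // false
    println!("{}", should_auto_compact(160_000)); // true
}
```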

Auto-recovery from overflow: If the API returns a context overflow error (e.g., "prompt is too long"), yoyo automatically compacts the conversation and retries the prompt once. You'll see:

  ⚡ context overflow detected — auto-compacting and retrying...

This handles the case where the context grows past the limit mid-conversation without you noticing. If the retry also fails, yoyo suggests using /compact manually.

"warning: Failed to load skills"

warning: Failed to load skills: [error]

The --skills directory couldn't be read. yoyo continues without skills.

Fix: Check that the path exists and contains valid skill files.

"unknown command: /foo"

  unknown command: /foo
  type /help for available commands

You typed a command yoyo doesn't recognize. If it's a typo, yoyo will suggest the closest match:

  unknown command: /hlep
  did you mean /help?
  type /help for available commands

Fix: Check the suggestion, or type /help to see all available commands.

"not in a git repository"

  error: not in a git repository

You used /diff or /undo outside a git repo.

Fix: Navigate to a directory that's inside a git repository before starting yoyo.

Ctrl+C behavior

  • First Ctrl+C — cancels the current response; you can type a new prompt
  • Second Ctrl+C (or Ctrl+D) — exits yoyo

If a tool execution is hanging, Ctrl+C will abort it.

Session file errors

  error saving: [error]
  error reading yoyo-session.json: [error]
  error parsing: [error]

Session save/load failed. Common causes:

  • Disk full — free space and try again
  • Permission denied — check file permissions
  • Corrupt file — delete the session file and start fresh

Safety & Anti-Crash Guarantees

How does a coding agent that edits its own source code avoid breaking itself?

Good question. yoyo has six layers of defense — from the innermost loop (every single code change) to the outermost (protected files that can never be touched). Here's how each one works.

Layer 1: Build-and-test gate on every commit

No code change is ever committed unless it passes:

cargo build && cargo test

This happens inside the evolution session itself. The agent runs the build and test suite after every edit. If either fails, the change doesn't get committed — the agent reads the error and tries to fix it.
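
The gate's shape can be sketched as follows (an assumed reimplementation for illustration; the real agent drives this through its shell tool):

```rust
use std::process::Command;

// Run a command and report whether it exited successfully.
fn run_ok(program: &str, args: &[&str]) -> bool {
    Command::new(program)
        .args(args)
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}

fn main() {
    // Both checks must pass before any commit is allowed.
    let gate_passed = run_ok("cargo", &["build"]) && run_ok("cargo", &["test"]);
    if gate_passed {
        println!("gate passed: safe to commit");
    } else {
        println!("gate failed: read the error and fix before committing");
    }
}
```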

Layer 2: CI on every push

Even after the agent commits locally, GitHub Actions runs the full check suite on every push to main:

cargo build
cargo test
cargo clippy --all-targets -- -D warnings
cargo fmt -- --check

Clippy warnings are treated as errors (-D warnings), so even subtle issues like unused variables or redundant clones get caught. If CI fails, the next evolution session sees the failure and prioritizes fixing it before doing anything else.

Layer 3: Automatic revert on build failure

The evolution script (evolve.sh) has a post-session verification step. After all tasks run, it re-checks the build. If it fails:

  1. It gives the agent up to 3 attempts to fix the errors automatically
  2. If all fix attempts fail, it reverts to the pre-session state:
    git checkout "$SESSION_START_SHA" -- src/
    

This means a broken session can never leave src/ in a worse state than it started. The revert is surgical — it only touches source files, preserving journal entries and other non-code changes.

Layer 4: Tests before features

yoyo's evolve skill requires writing a test before adding a feature. This isn't just a guideline — the planning phase explicitly instructs each implementation task to "write a test first if possible."

Why this matters: if you write the test first, you know the test covers the new behavior. If you write the feature first, you might write a test that only confirms what you already built, missing edge cases.

Layer 5: No deleting existing tests

The evolve skill has a hard rule: never delete existing tests. Tests are the agent's immune system. Removing them would let regressions slip through silently. As of this writing, yoyo has 91+ tests, and that number only goes up.

Layer 6: Protected files

Some files are simply off-limits. The agent cannot modify:

File                      Why it's protected
IDENTITY.md               yoyo's constitution: defines who it is and its core rules
PERSONALITY.md            yoyo's voice and values
scripts/evolve.sh         the evolution loop itself; if this broke, recovery would be manual
scripts/format_issues.py  input sanitization for GitHub issues
scripts/build_site.py     website builder
.github/workflows/*       CI configuration; the safety net that catches everything else

These files can only be changed by human maintainers. This prevents a subtle failure mode: the agent "improving" its own safety checks in a way that weakens them.

What happens in practice

A typical evolution session:

  1. evolve.sh verifies the build passes before starting
  2. The planning agent reads source code, journal, and issues
  3. Implementation agents execute tasks, each running build+test after changes
  4. Post-session verification re-checks everything
  5. If anything broke, automatic fix attempts kick in
  6. If fixes fail, revert to pre-session state
  7. CI runs on push as a final backstop
  8. Next session checks CI status — failures get top priority

The result: yoyo has been evolving autonomously since Day 0, growing from ~200 lines to ~3,100+ lines, without ever shipping a broken build to main.

Can it still break?

Theoretically, yes. Safety is defense-in-depth, not a proof of correctness. Some scenarios the current system doesn't catch:

  • Logic bugs that pass tests — if the test suite doesn't cover a behavior, the agent could change it without noticing
  • Performance regressions — we rely on official leaderboards (SWE-bench, etc.) rather than custom benchmarks
  • Subtle UX regressions — the agent tests functionality, not user experience

These are areas for future improvement. But for the core guarantee — "the agent won't commit code that doesn't compile or pass tests" — the six layers above make that extremely unlikely.