---
title: "Extended Context & ECC"
description: "How large context windows change agent behavior, token economics, and what ECC means for agentic workflows"
section: "Ecosystem"
readTime: "10 min"
---

# Extended Context & ECC
Context window size determines what an agent can "see" at once. Understanding how context works — and how to use it efficiently — is one of the highest-leverage skills for AI-assisted development.
## What Is the Context Window?
The context window is the total amount of text (measured in tokens) that the model can process in a single request. Everything in the window — your conversation history, files, instructions, tool results — competes for this space.
Rough token estimates:
| Content | Approximate Tokens |
|---|---|
| 1 line of code | 5–15 tokens |
| 1 file (100 lines) | 500–1,500 tokens |
| Full README | 1,000–5,000 tokens |
| Small codebase (20 files) | 10,000–30,000 tokens |
| Large codebase (200 files) | 100,000–300,000 tokens |
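The estimates above can be approximated with a simple heuristic: English prose and code average roughly 4 characters per token. The sketch below uses that rule of thumb; it is an approximation, not the model's actual tokenizer.

```typescript
// Rough token estimate: ~4 characters per token for English text and code.
// This is a back-of-envelope heuristic, not the model's real tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

const line = "export const MAX_RETRIES = 3; // retry budget";
console.log(estimateTokens(line)); // ≈ 12, within the 5-15 tokens/line range above
```

For precise counts, use the provider's own token-counting API; the heuristic is only for sizing context budgets quickly.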
## Context Windows in 2025
| Model | Context Window |
|---|---|
| Claude Sonnet 4.5 / Opus 4 | 200,000 tokens (~150K words) |
| GPT-4o | 128,000 tokens |
| Gemini 2.5 Pro | 1,000,000 tokens (~750K words) |
| Gemini 2.5 Flash | 1,000,000 tokens |
| DeepSeek V3 | 64,000 tokens |
## Extended Context Caching (ECC)
Extended Context Caching (ECC) is Anthropic's mechanism for reusing prompt prefixes across requests. When a large portion of your prompt stays the same between calls (e.g., a large codebase), ECC stores it on Anthropic's servers and charges a much lower cache read price instead of the full input price.
### ECC Cost Breakdown
| Operation | Price vs. Normal Input |
|---|---|
| Cache write (first call) | 25% more expensive |
| Cache read (subsequent calls) | 90% cheaper |
Example savings: Loading a 50,000-token codebase context into Claude for 10 agent iterations:
- Without ECC: 500,000 tokens × input price
- With ECC: 50,000 tokens cache write + 450,000 tokens cache read → ~80% cheaper
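The arithmetic above can be checked with a short sketch, with prices expressed relative to the normal input-token rate (per the table):

```typescript
// Prices relative to the normal input-token price (1.0), per the ECC table.
const CACHE_WRITE = 1.25; // first call: 25% surcharge
const CACHE_READ = 0.10;  // subsequent calls: 90% discount

function eccSavings(contextTokens: number, iterations: number): number {
  const without = contextTokens * iterations;
  const withEcc =
    contextTokens * CACHE_WRITE +                  // one cache write
    contextTokens * (iterations - 1) * CACHE_READ; // cache reads for the rest
  return 1 - withEcc / without; // fraction saved vs. no caching
}

console.log(eccSavings(50_000, 10)); // ≈ 0.785, i.e. ~80% cheaper
```

Note the break-even: a single call is 25% *more* expensive with caching, so ECC only pays off when the prefix is reused at least once.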
### Enabling ECC in Claude Code
ECC is enabled by default in Claude Code for CLAUDE.md and imported files. For custom usage:
```typescript
// Direct API usage with ECC
const response = await anthropic.messages.create({
  model: "claude-sonnet-4-5",
  max_tokens: 4096,
  system: [
    {
      type: "text",
      text: largeCodebaseContext,
      cache_control: { type: "ephemeral" } // Enable ECC for this block
    }
  ],
  messages: [{ role: "user", content: userPrompt }]
});
```

### CLAUDE.md and ECC
Your CLAUDE.md file is always included in the context prefix. With ECC enabled, it's cached after the first call and reused cheaply for the entire session.
Keep your CLAUDE.md comprehensive — the ECC savings make it nearly free after the first read.
## How Context Affects Agent Behavior

### The "Lost in the Middle" Problem
Models process context unevenly. Information at the start and end of the context window gets the most attention. Information buried in the middle of a long context is more likely to be ignored.
```
[High attention]        System prompt, CLAUDE.md
[Low attention]         Middle of conversation
[Low/medium attention]  Code files added mid-session
[High attention]        Most recent message
```
Mitigation: Put critical constraints in the system prompt (CLAUDE.md), not as mid-conversation notes.
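This ordering can be sketched against the messages API; the variable contents here are placeholders, and the request is shown as a plain object rather than an actual SDK call:

```typescript
// Placeholder contents, for illustration only.
const claudeMd = "Always use TypeScript strict mode.";
const fileDump = "=== src/auth.ts ===\n// ...file contents...";
const task = "Fix the token validation in src/auth.ts.";

// Order the context so critical content sits at the high-attention edges.
const request = {
  model: "claude-sonnet-4-5",
  max_tokens: 4096,
  // Start of context (high attention): constraints live in the system prompt
  system: [{ type: "text", text: claudeMd }],
  messages: [
    // Middle (lowest attention): bulk file content
    { role: "user", content: fileDump },
    { role: "assistant", content: "Loaded the files." },
    // End of context (high attention): the actual task, stated last
    { role: "user", content: task },
  ],
};

console.log(request.messages[request.messages.length - 1].content);
```

The key property is that the task is the final message and the constraints are in the system block; the bulk file dump is the only thing left in the low-attention middle.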
### Context Compaction in Claude Code
When the context fills up, Claude Code automatically compacts it: it summarizes earlier conversation turns into a dense summary, freeing space for new content. You can observe this with:
```bash
# Show context usage
claude --verbose "your task here"
```

When compaction happens, some details from earlier in the session may be lost. For long sessions:
```bash
# Checkpoint: ask Claude to summarize its progress before compaction
claude "Summarize what we've accomplished and what files have been changed,
then continue with the next step."
```

## Optimizing Context Usage
### What to Include
| Content | Include? | Why |
|---|---|---|
| CLAUDE.md / project context | Always | Free after ECC; critical for accuracy |
| Directly relevant source files | Yes | Enable accurate edits |
| Test files for the module | Yes | Ground truth for behavior |
| Unrelated modules | No | Wastes context, increases cost |
| Entire node_modules | Never | Millions of tokens, no value |
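The include/exclude decisions above can be applied as a simple filter before assembling context. The exclusion patterns below are illustrative assumptions, not Claude Code's actual ignore rules:

```typescript
// Illustrative exclusion patterns — adjust for your project's layout.
const EXCLUDE = [/node_modules\//, /\.lock$/, /dist\//];

// Keep only paths worth spending context tokens on.
function contextWorthy(paths: string[]): string[] {
  return paths.filter((p) => !EXCLUDE.some((re) => re.test(p)));
}

const files = [
  "src/routes/auth.ts",        // directly relevant source — keep
  "src/routes/auth.test.ts",   // ground truth for behavior — keep
  "node_modules/lodash/index.js", // millions of tokens, no value — drop
  "dist/bundle.js",            // build artifact — drop
];
console.log(contextWorthy(files)); // keeps only the two src/ files
```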
### Use @file Sparingly in Cursor
Cursor's @file reference includes the full file content in context. For large files:
```
# Inefficient — includes entire 3,000-line router file
@src/routes/index.ts Fix the auth middleware
```

```
# Better — scope to relevant function
@src/routes/auth.ts lines 45-89 Fix the token validation
```
### Context Pruning in Claude Code
```bash
# Start a fresh session for unrelated tasks (don't carry old context)
# Each `claude` invocation starts fresh unless you use --resume

# Resume a specific session
claude --resume <session-id>

# List recent sessions
claude sessions list
```

## Large Codebase Strategies
### Repo Map (aider approach)
Rather than loading all files, load a "map" — just the file paths, class names, and function signatures:
```bash
# Generate a repo map: file paths plus their exported signatures
find src/ -name "*.ts" -print0 | \
  xargs -0 grep -HnE '^export (function|class|const|interface)' \
  > /tmp/repo-map.txt

claude "Using this repository map for context, find which modules handle authentication:
$(cat /tmp/repo-map.txt)"
```

### Chunking for Gemini 2.5
With Gemini's 1M token window, you can genuinely fit entire large repos:
```bash
# Concatenate entire src/ tree with per-file headers
find src/ \( -name "*.ts" -o -name "*.tsx" \) | sort | \
  xargs -I {} sh -c 'echo "=== $1 ==="; cat "$1"' _ {} > /tmp/full-context.ts

# Feed to Gemini API
cat /tmp/full-context.ts | gemini-cli "Analyze all inter-module dependencies
and identify the top 5 architectural coupling issues"
```