When AI's "Brain" Isn't Enough: Compaction and Microcompact

Table of Contents
- Two Types of Compaction: Auto-compaction vs Microcompact
- Auto-compaction: When It’s Triggered
- Microcompact: Precise Trimming
- State Preservation After Compaction: What Remembers What
- State Recovery: Reload When Needed
- Context Collapse: The Last Defense
- Practical: Coping with “Amnesia”
- Compaction Quality Control
- Implications for Building AI Agents
- Summary
Have you experienced this: after chatting with Claude Code for a long time, you ask it to recall something discussed earlier, and it says “I’m not sure” or “could you remind me?”
This isn’t the AI getting dumber. Its “brain” has run out of space: the context window is full, and older content has been compacted.
Today we’re talking about how Claude Code makes tradeoffs when “memory runs out” and what remains after compaction.
The diagram: Context compaction is like packing for a move and labeling every box
Two Types of Compaction: Auto-compaction vs Microcompact
Claude Code has two compaction mechanisms for different scenarios:
Auto-compaction (Compaction): Broad, automatically triggered compaction. When the context approaches its limit, the system identifies compressible content and compacts it.
Microcompact: Precise, on-demand compaction. It trims individual oversized items, such as large files or large tool results.
Think of it this way:
- Auto-compaction is like packing all house furniture into boxes—one big move, wide scope
- Microcompact is like organizing drawer contents—one small move, specific target
Auto-compaction: When It’s Triggered
Auto-compaction has clear triggering conditions:
Token Threshold: When context usage exceeds a set percentage of the window (80%, for example), a compaction evaluation is triggered.
Message Count: When historical messages exceed a certain number, consider compacting early messages.
Tool Result Accumulation: When accumulated tool results are large, trigger summarization.
User Explicit Request: Users can trigger compaction via command.
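To make these trigger conditions concrete, here is a minimal Python sketch. It is not Claude Code's actual implementation: the `ContextStats` fields, the function name, and every threshold value (including the 0.8 ratio borrowed from the "like 80%" example above) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ContextStats:
    """Snapshot of the conversation size (all fields are hypothetical)."""
    used_tokens: int
    window_tokens: int
    message_count: int
    tool_result_chars: int


def compaction_triggers(
    stats: ContextStats,
    user_requested: bool = False,
    token_ratio: float = 0.8,       # assumed threshold, mirrors the 80% example
    max_messages: int = 200,        # assumed limit
    max_tool_chars: int = 500_000,  # assumed limit
) -> List[str]:
    """Return the name of every trigger that currently fires."""
    fired = []
    if stats.used_tokens / stats.window_tokens >= token_ratio:
        fired.append("token_threshold")
    if stats.message_count > max_messages:
        fired.append("message_count")
    if stats.tool_result_chars > max_tool_chars:
        fired.append("tool_result_accumulation")
    if user_requested:
        fired.append("user_request")
    return fired
```

Returning the list of fired triggers, rather than a single boolean, makes it easy to log why a compaction pass started.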
Compaction flow:
Detect Trigger → Identify Compressible Content → Generate Summary → Replace Original → Preserve References
The diagram: Complete auto-compaction flow
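The flow above can be sketched in a few lines of Python. This is an illustrative model only: the `Message` type, the `ref` field, and the rule "old tool results are compressible" are assumptions, and the `summarize` callable stands in for what would be a model call in a real agent.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Message:
    role: str                  # "user", "assistant", or "tool"
    text: str
    ref: Optional[str] = None  # hypothetical pointer back to the source, e.g. a file path


def compact_history(
    messages: List[Message],
    keep_recent: int,
    summarize: Callable[[str], str],
    triggered: bool,
) -> List[Message]:
    """Run one compaction pass over the history and return the new history."""
    # Step 1: detect trigger (decided by the caller in this sketch).
    if not triggered:
        return list(messages)

    split = max(len(messages) - keep_recent, 0)
    result = []
    for index, msg in enumerate(messages):
        # Step 2: identify compressible content (assumed rule: old tool results).
        if index < split and msg.role == "tool":
            # Steps 3 to 5: generate a summary, replace the original, keep the reference.
            result.append(Message(msg.role, summarize(msg.text), msg.ref))
        else:
            result.append(msg)
    return result
```

For example, calling it with `summarize=lambda text: f"[compacted {len(text)} chars]"` replaces each old tool result with a short stub while leaving its `ref` intact.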
What gets auto-compacted?
Old Tool Results: Tool results from early conversation turns, if not referenced later.
Code Blocks in History: Code that has already been processed, with only the key parts kept.
Redundant Dialogue: Repeated confirmations, casual chat.
Outdated State: Information already overwritten by subsequent operations.
Microcompact: Precise Trimming
Microcompact targets individual content items, performing precise trimming when they’re too large.
Common scenarios:
Large File Reads: FileReadTool read a 10 MB log file, but the model only needs a few relevant lines.
Large Search Results: Grep returned 2000 matches, but the model only cares about the first 100.
Long Output Truncation: A BashTool command produced massive output that exceeds maxResultSizeChars.
Microcompact strategies:
Preserve Head and Tail: The beginning and end of a file are usually the most important parts.
Sampling Preservation: Keep samples from the middle sections so the trimmed result stays representative.
Relevance Filtering: Keep only content relevant to the current task.
Generate Summaries: Replace the content with a shorter summary.
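Two of these strategies, head-and-tail preservation and relevance filtering, are simple enough to sketch in Python. The function names and parameters below are hypothetical, and `max_chars` merely plays the role of a limit like maxResultSizeChars; this is not how Claude Code itself trims content.

```python
from typing import List


def keep_head_and_tail(text: str, max_chars: int, head_ratio: float = 0.5) -> str:
    """Trim oversized text to its start and end, joined by an omission marker.

    The marker is added on top of max_chars, so the result is slightly longer.
    """
    if len(text) <= max_chars:
        return text
    head_len = int(max_chars * head_ratio)
    tail_len = max_chars - head_len
    omitted = len(text) - head_len - tail_len
    marker = f"\n... [{omitted} characters omitted] ...\n"
    tail = text[-tail_len:] if tail_len > 0 else ""
    return text[:head_len] + marker + tail


def filter_relevant_lines(text: str, keywords: List[str], limit: int = 100) -> str:
    """Keep at most `limit` lines that mention any keyword (case-insensitive)."""
    wanted = [keyword.lower() for keyword in keywords]
    hits = [
        line for line in text.splitlines()
        if any(keyword in line.lower() for keyword in wanted)
    ]
    return "\n".join(hits[:limit])
```

The omission marker matters: it tells the model that content is missing and how much, so it can decide whether to reread the source.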
State Preservation After Compaction: What Remembers What
Key question: after content is compacted, what does the model still “remember”?
Claude Code uses “lossy compression”—preserving key information, discarding details.
Preserved Information:
- Existence: “I read this file,” “I used this tool”
- Metadata: File paths, operation times, content types
- Key Conclusions: Processing results, problems discovered, decisions made
- Relevance: Which content relates to current task
- Reference Links: Where to recover detailed content if needed
Discarded Information:
- Complete Content: Specific file text, complete tool outputs
- Intermediate Steps: Detailed reasoning, approaches that were tried
- Outdated Information: Content already overwritten
- Redundant Data: Repeated information
It’s like your memory: you might remember “yesterday I read an article about Rust, about ownership,” but not necessarily every sentence—if needed, you can go back and read again.
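One way to picture what survives is as a small record that replaces the full content. The sketch below is an assumption about shape, not Claude Code's internal format: the `CompactedEntry` class, its field names, and the rendered stub text are all invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CompactedEntry:
    """What stays in context after the full content is dropped (hypothetical)."""
    tool_name: str              # existence: which tool was used
    source_path: str            # metadata, and the reference for recovery
    content_type: str           # metadata
    read_at: str                # metadata: operation time as an ISO string
    key_conclusions: List[str]  # what was learned
    relevant_to_task: bool      # relevance to the current task

    def render(self) -> str:
        """Build the short stub that replaces the original content."""
        lines = [
            f"[compacted] {self.tool_name} read {self.source_path} "
            f"({self.content_type}) at {self.read_at}"
        ]
        lines += [f"- {conclusion}" for conclusion in self.key_conclusions]
        relevance = "yes" if self.relevant_to_task else "no"
        lines.append(f"Relevant to current task: {relevance}")
        lines.append(f"Full content: reread {self.source_path}")
        return "\n".join(lines)
```

Note that the file text itself appears nowhere in the record: only the fact that it was read, what was concluded, and where to find it again.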
State Recovery: Reload When Needed
Compaction isn’t deletion—it’s “archiving.” When the model needs compacted content, it can be reloaded.
Ways to reload:
File Reread: If compacted content is file content, use FileReadTool to reread.
Tool Re-execution: If compacted content is tool results, re-execute the tool (but consider costs).
Summary Reference: Sometimes summaries are sufficient, no need for full content.
User Reminder: If the AI seems to have forgotten something, users can proactively remind it.
This design lets the AI work through massive amounts of information in long conversations without permanent “amnesia”: compacted content stays recoverable.
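As a rough sketch of the "summary first, reread only when needed" idea, the Python function below falls back to the original file only when the caller asks for full detail. The function name and parameters are hypothetical, and it covers only the file-reread path, not tool re-execution.

```python
from pathlib import Path
from typing import Optional


def recover_content(summary: str, source_path: Optional[str], need_full: bool) -> str:
    """Return the summary when it suffices, otherwise reread the source file."""
    if not need_full or source_path is None:
        # The summary is enough, or there is no reference to follow.
        return summary
    path = Path(source_path)
    if not path.is_file():
        # The source is gone, so the summary is all that remains.
        return summary
    return path.read_text(encoding="utf-8")
```

Rereading a file is cheap, while re-executing a tool may be slow or have side effects, which is why the cost caveat above matters for that path.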
Context Collapse: The Last Defense
When the context still doesn’t fit even after compaction, Claude Code falls back on its final move: Context Collapse.
It’s like moving when you really can’t fit everything: you have to throw away some boxes and keep only the most important ones.
Collapse strategy:
Preserve Core:
- System prompts (AI’s “identity”)
- Recent N turns of conversation
- Key decision points
- Explicit user requirements
May Discard:
- Earliest turns of conversation
- Completed subtasks
- Detailed process records
Generate Summaries:
- Generate high-level summaries of discarded content
- Preserve key conclusions
- Discard process details
After collapse, the model may “forget” early details but still retains the impression “we discussed X earlier.”
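The collapse strategy can be modeled roughly as follows. This is a sketch under stated assumptions: the `Turn` type, the `pinned` flag (standing in for key decision points and explicit user requirements), and the `summarize` callable are all hypothetical, and a real system would generate the summary with a model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Turn:
    role: str
    text: str
    pinned: bool = False  # hypothetical flag: key decision or explicit user requirement


def collapse_context(
    system_prompt: str,
    turns: List[Turn],
    keep_recent: int,
    summarize: Callable[[List[Turn]], str],
) -> List[Turn]:
    """Keep the core, summarize the rest, and return the collapsed context."""
    split = max(len(turns) - keep_recent, 0)
    old, recent = turns[:split], turns[split:]
    kept_old = [turn for turn in old if turn.pinned]
    dropped = [turn for turn in old if not turn.pinned]

    collapsed = [Turn("system", system_prompt, pinned=True)]
    if dropped:
        # A high-level summary replaces the discarded turns.
        summary = "Earlier conversation (summary): " + summarize(dropped)
        collapsed.append(Turn("system", summary))
    collapsed.extend(kept_old)  # pinned turns survive verbatim
    collapsed.extend(recent)    # the most recent N turns survive verbatim
    return collapsed
```

The summary turn is what leaves the model with the impression that "we discussed X earlier" even though the details are gone.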
Practical: Coping with “Amnesia”
Understanding compaction mechanisms helps you cope with AI’s “forgetting”:
Proactive Reminders: If AI seems to have forgotten earlier content, proactively remind it: “We talked about using React earlier, remember?”
Put Key Info in CLAUDE.md: Project-level key information belongs in CLAUDE.md, where it is given priority during compaction.
Process in Segments: If a task is very complex, divide it into multiple subtasks to avoid context explosion.
Use References: If the AI mentions “I read a file earlier” but can’t recall the content, ask it to reread the file.
Understand It’s Not a Bug: “Forgetting” in long conversations is a normal mechanism, not an AI problem.
Compaction Quality Control
Claude Code tries to preserve “important” content during compaction, but what counts as “important”?
User Explicitly Emphasized: “This is important, remember”
Decision Points: Choices made by users, confirmed approaches
Errors and Lessons: Failed experiences, learned lessons
Key Data: Configuration values, parameters, important code snippets
Current Task Relevance: Content directly related to current goals
The model itself takes part in this “importance assessment”: it decides what to keep and what to discard.
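Since the assessment involves the model, no fixed formula captures it. Still, a rule-based stand-in helps show how the criteria above could be combined. In this Python sketch the `Item` flags and the weights are invented for illustration and do not reflect how Claude Code actually scores content.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    text: str
    user_emphasized: bool = False
    is_decision: bool = False
    is_error_lesson: bool = False
    has_key_data: bool = False
    task_relevant: bool = False


# Assumed weights: higher means more worth keeping.
WEIGHTS = {
    "user_emphasized": 5,
    "is_decision": 4,
    "is_error_lesson": 3,
    "has_key_data": 3,
    "task_relevant": 2,
}


def importance(item: Item) -> int:
    """Sum the weights of every criterion the item satisfies."""
    return sum(weight for name, weight in WEIGHTS.items() if getattr(item, name))


def select_to_keep(items: List[Item], budget: int) -> List[Item]:
    """Keep the `budget` highest-scoring items, most important first."""
    return sorted(items, key=importance, reverse=True)[:budget]
```

In a real agent, the flags themselves would likely come from the model's own judgment rather than from hand-written rules.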
Implications for Building AI Agents
If you want to build your own AI Agent, context compaction is key:
Design Compaction Strategy: Define what can be compacted, what must be preserved.
Preserve Metadata: Even when compacting content, preserve “existence” and “references.”
Support State Recovery: Provide a way to reload compacted content.
Let Users Participate: For important content, ask users “should this be preserved?”
Monitor Compaction Effects: Provide before/after comparisons so users understand what happened.
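For the monitoring point, even a very small before/after report goes a long way. The sketch below measures size in characters for simplicity; the function name and the report keys are assumptions, and a real agent would more likely count tokens.

```python
from typing import Dict, Union


def compaction_report(before: str, after: str) -> Dict[str, Union[int, float]]:
    """Summarize how much a compaction pass saved, measured in characters."""
    saved = len(before) - len(after)
    ratio = saved / len(before) if before else 0.0
    return {
        "chars_before": len(before),
        "chars_after": len(after),
        "chars_saved": saved,
        "saved_ratio": round(ratio, 3),
    }
```

Showing this report to users after each compaction pass helps them understand why the agent may need a reminder about earlier details.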
Summary
When the AI’s “brain” runs out of room, Claude Code makes its tradeoffs through auto-compaction, microcompact, and context collapse.
Key mechanisms:
- Auto-compaction: Automatic summarization of large content blocks
- Microcompact: Precise trimming of individual content
- State Preservation: Preserve key information, discard details
- Context Collapse: Last defense, keeping only the most important
Understanding this helps you:
- Understand AI’s “forgetting” behavior
- Manage long conversations more effectively
- Implement similar mechanisms in your own AI Agent
In the next article, we’ll talk about the permission system—installing a “safety brake” for AI.
