Agent Context Window Management
Long trajectories blow up the context window, so summarization, pruning, and sliding windows are mandatory.
2 hours · 1 resource
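If an agent loop runs for 20 steps and every tool result stays in the context, the context can grow from 50K to 200K tokens. That is expensive, because the full context is typically resent on every step, and it also causes the lost-in-the-middle effect, where the model overlooks information buried in the middle of a long prompt.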
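To make the cost concrete, here is a rough back-of-the-envelope calculation. It assumes (this is not stated above) that the context grows linearly from 50K to 200K tokens and that every step sends the whole context as input; the function names are illustrative.

```python
# Assumptions: linear growth, and the full context is resent on every step.
START_TOKENS = 50_000
END_TOKENS = 200_000
STEPS = 20


def context_before_step(step: int) -> int:
    """Context size in tokens sent as input at a 1-indexed step."""
    growth_per_step = (END_TOKENS - START_TOKENS) // STEPS  # 7,500 tokens
    return START_TOKENS + growth_per_step * (step - 1)


def total_input_tokens(steps: int = STEPS) -> int:
    """Sum of the input tokens sent across all steps of the run."""
    return sum(context_before_step(s) for s in range(1, steps + 1))


print(total_input_tokens())  # 2425000
```

Under these assumptions the run sends about 2.4M input tokens, compared with 1.0M if the context had stayed at 50K for all 20 steps.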
Strategies:
- Sliding window: keep only last N messages; summarize older ones.
- Tool result summarization: auto-summarize tool outputs over 500 tokens.
- Scratchpad pattern: the model writes long reasoning to a <scratchpad>; only the summary stays in context.
- Periodic compression: every 5 turns, summarize the conversation and drop the old messages.
- Retrieval over context: save past observations to a vector DB, retrieve on demand.
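A minimal sketch of how the first two strategies might fit together: a sliding window that folds older messages into a running summary, plus shortening of large tool outputs. Everything here is hypothetical (`ContextManager`, `truncate_summary`, the word-count token estimate). In a real agent the summarizer would be a model call and the token count would come from the model's tokenizer.

```python
from dataclasses import dataclass, field
from typing import Callable


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def truncate_summary(text: str, max_tokens: int = 50) -> str:
    """Placeholder summarizer that keeps the first max_tokens words."""
    words = text.split()
    if len(words) <= max_tokens:
        return text
    return " ".join(words[:max_tokens]) + " [truncated]"


@dataclass
class Message:
    role: str  # "system", "user", "assistant", or "tool"
    content: str


@dataclass
class ContextManager:
    window: int = 6        # N most recent messages kept verbatim
    tool_limit: int = 500  # tool outputs above this many tokens are summarized
    summarize: Callable[[str], str] = truncate_summary
    summary: str = ""      # running summary of messages that left the window
    messages: list[Message] = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        # Tool result summarization: shrink oversized outputs before storing.
        if role == "tool" and count_tokens(content) > self.tool_limit:
            content = self.summarize(content)
        self.messages.append(Message(role, content))
        self._compress()

    def _compress(self) -> None:
        # Sliding window: fold everything older than the last N into the summary.
        overflow = len(self.messages) - self.window
        if overflow <= 0:
            return
        old, self.messages = self.messages[:overflow], self.messages[overflow:]
        dropped = "\n".join(f"{m.role}: {m.content}" for m in old)
        self.summary = self.summarize((self.summary + "\n" + dropped).strip())

    def render(self) -> list[Message]:
        """Messages to send to the model: summary first, then the recent window."""
        head = []
        if self.summary:
            head = [Message("system", "Summary of earlier steps: " + self.summary)]
        return head + self.messages
```

Calling `add` after every turn and sending `render()` to the model bounds the prompt at roughly `window` messages plus one summary. Periodic compression is the same idea run on a fixed schedule (for example every 5 turns) rather than on every overflow.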
Anthropic Memory Tool (2025): gives Claude tools for writing and reading notes stored outside the context window, in effect "remember X" and "recall X", so information persists across conversations.
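The "remember" and "recall" idea can be illustrated with a small file-backed store. This is a hypothetical sketch, not Anthropic's tool interface: `NoteMemory` and its methods are made up for illustration, and plain word overlap stands in for the vector similarity a real retrieval setup would use.

```python
import json
from pathlib import Path


class NoteMemory:
    """Hypothetical persistent memory: notes live in a JSON file, not in context."""

    def __init__(self, path: str) -> None:
        self.path = Path(path)
        # Reload notes saved by an earlier session, if the file exists.
        if self.path.exists():
            self.notes: list[str] = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.notes = []

    def remember(self, note: str) -> None:
        """Append a note and write the whole store back to disk."""
        self.notes.append(note)
        self.path.write_text(json.dumps(self.notes), encoding="utf-8")

    def recall(self, query: str, k: int = 3) -> list[str]:
        """Return up to k notes sharing the most words with the query."""
        terms = set(query.lower().split())
        scored = [(len(terms & set(n.lower().split())), n) for n in self.notes]
        scored = [s for s in scored if s[0] > 0]  # drop notes with no overlap
        scored.sort(key=lambda s: s[0], reverse=True)
        return [note for _, note in scored[:k]]
```

Exposed to the model as two tools, `remember` moves an observation out of the context and `recall` brings back only the notes relevant to the current step.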