topic · foundation

Context Window

The maximum number of tokens, input plus output, that a model can process in a single request. Must-know.

2 hours · 2 resources · 1 prerequisite

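Each model has a context limit, measured in tokens: GPT-4o ≈ 128K, Claude 4.7 (1M variant) ≈ 1M, Gemini 2.x ≈ 2M. These figures change between model versions, so check the provider's documentation. A request that exceeds the limit is either rejected or has part of its input truncated.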
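A minimal sketch of a pre-flight check against the limit. It assumes the prompt and the reserved output share one budget, which is how most APIs count it, and it estimates tokens with a rough 4-characters-per-token heuristic for English text rather than a real tokenizer. The function names and the default ratio are illustrative, not part of any library.

```python
import math

# Limit quoted in the text above; treat it as illustrative, not authoritative.
GPT_4O_CONTEXT_LIMIT = 128_000


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate. The 4 chars/token ratio is an assumption
    that only roughly holds for English prose."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(prompt_tokens: int, max_output_tokens: int, context_limit: int) -> bool:
    """True if the prompt plus the reserved output budget fits in the window."""
    return prompt_tokens + max_output_tokens <= context_limit


prompt = "Summarize the following report. " * 1000
n = estimate_tokens(prompt)
print(n, fits_in_context(n, max_output_tokens=4_000, context_limit=GPT_4O_CONTEXT_LIMIT))
```

For anything close to the limit, replace the heuristic with the provider's own tokenizer or token-counting endpoint, since the real count can differ noticeably from the estimate.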

Important nuance: long contexts suffer from the "lost in the middle" effect. Information at the beginning and end of the context is recalled best, while information in the middle is recalled less reliably. So:

Place critical instructions at the start AND end
Add structure (XML / markdown sections) in long contexts
Split the work across multiple turns, or retrieve only the relevant passages (RAG), when the input is too large or recall degrades
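The first two tips can be sketched as a small prompt builder. This is one possible layout under stated assumptions, not a prescribed format: the tag names (`instructions`, `document`, `question`, `reminder`) and the function name are hypothetical, chosen only for illustration.

```python
def build_long_prompt(instructions: str, documents: list[str], question: str) -> str:
    """Assemble a long prompt with the instructions at the start AND the end,
    and each document wrapped in its own XML-style section."""
    parts = [f"<instructions>\n{instructions}\n</instructions>"]
    for i, doc in enumerate(documents, start=1):
        # Indexed sections give the model (and the reader) clear boundaries.
        parts.append(f'<document index="{i}">\n{doc}\n</document>')
    parts.append(f"<question>\n{question}\n</question>")
    # Repeat the critical instructions last, where recall is strongest.
    parts.append(f"<reminder>\n{instructions}\n</reminder>")
    return "\n\n".join(parts)


prompt = build_long_prompt(
    instructions="Answer using only the documents. Cite the document index.",
    documents=["First long report...", "Second long report..."],
    question="Which report mentions the budget?",
)
print(prompt)
```

Repeating the instructions costs a few extra tokens, which is usually a small price next to the size of the documents themselves.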

Prerequisites

Tokenization

A token is the atomic unit of text the model sees. The token count determines both the cost of a request and how much of the context window it consumes.
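As a rough illustration of how token count drives both quantities, the sketch below computes the cost and the context share of a request. The prices are placeholders, not real rates, and the function names are hypothetical. Real pricing varies by provider and model.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_million: float,
                  output_price_per_million: float) -> float:
    """Cost of one request, given per-million-token prices (placeholder values)."""
    return (input_tokens / 1_000_000) * input_price_per_million \
        + (output_tokens / 1_000_000) * output_price_per_million


def context_share(input_tokens: int, output_tokens: int, context_limit: int) -> float:
    """Fraction of the context window consumed by this request."""
    return (input_tokens + output_tokens) / context_limit


# Hypothetical prices: 2.00 per million input tokens, 8.00 per million output tokens.
cost = estimate_cost(50_000, 1_000, input_price_per_million=2.0, output_price_per_million=8.0)
share = context_share(50_000, 1_000, context_limit=128_000)
print(f"cost={cost:.4f} share={share:.1%}")
```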


Resources (2)

Paper (1)

Lost in the Middle: How Language Models Use Long Contexts

Nelson Liu et al. · en

free

Docs (1)

Anthropic: Long context tips
