Back to full roadmap
topicadvanced
Reasoning Models (Thinking)
o1, Claude 'thinking', Gemini 2.5 — model thinks in an invisible 'scratchpad'.
3 hours2 resources1 prereqs
New 'reasoning' models internalized CoT. Even without explicit CoT, the model runs a parallel scratchpad, then produces the final answer.
Practical differences:
- High latency (10-60 sec)
- Higher cost (thinking tokens are also billed)
- 10-30× accuracy gains on math/code/planning
- Overkill for simple tasks — Haiku/4o-mini suffices
Anthropic 'extended thinking': provide a thinking_tokens budget (e.g. 16K) — model thinks that much. include_thoughts: true for debugging.