Skip to content
Back to full roadmap
topicadvanced

Reasoning Models (Thinking)

o1, Claude 'thinking', Gemini 2.5 — model thinks in an invisible 'scratchpad'.

3 hours2 resources1 prereqs

New 'reasoning' models internalized CoT. Even without explicit CoT, the model runs a parallel scratchpad, then produces the final answer.

Practical differences:

  • High latency (10-60 sec)
  • Higher cost (thinking tokens are also billed)
  • 10-30× accuracy gains on math/code/planning
  • Overkill for simple tasks — Haiku/4o-mini suffices

Anthropic 'extended thinking': provide a thinking_tokens budget (e.g. 16K) — model thinks that much. include_thoughts: true for debugging.

Prerequisites

Resources(2)