# Speculative Decoding

> Source: https://sukruyusufkaya.com/en/glossary/speculative-decoding
> Updated: 2026-05-13T19:58:55.878Z
> Type: glossary
> Category: uretken-yapay-zeka-ve-llm

**TLDR:** A decoding approach that speeds up generation by having a larger model verify tokens proposed by a smaller, faster draft model.

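<p>Speculative decoding is an inference technique for reducing LLM generation latency. A smaller draft model proposes several tokens ahead, and the larger model then verifies them together in a single forward pass, accepting the longest prefix it agrees with and discarding the rest. Because checking several proposed tokens in parallel costs roughly as much as generating one token, every accepted token saves a sequential step of the larger model. The speed gain depends on how often the draft's proposals are accepted. With an exact acceptance rule the output matches what the larger model would have produced on its own, while approximate variants trade some fidelity for additional speed.</p>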
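
<p>The propose-and-verify loop can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production implementation: <code>target_model</code> and <code>draft_model</code> are hypothetical toy functions standing in for real networks, decoding is greedy (sampling-based variants use a probabilistic accept/reject rule instead of an equality check), and the per-position verification loop stands in for what would be one batched forward pass of the larger model.</p>

```python
from typing import Callable, List, Tuple

# A "model" here is any function mapping a token context to the next token (greedy).
Model = Callable[[List[int]], int]

VOCAB = 50  # hypothetical toy vocabulary size


def target_model(ctx: List[int]) -> int:
    """Toy stand-in for the large, slow model (assumption, not a real LLM)."""
    return (3 * ctx[-1] + len(ctx)) % VOCAB


def draft_model(ctx: List[int]) -> int:
    """Toy stand-in for the small, fast model: agrees with the target
    except when the context length is a multiple of 4."""
    guess = (3 * ctx[-1] + len(ctx)) % VOCAB
    return guess if len(ctx) % 4 else (guess + 1) % VOCAB


def greedy_decode(prompt: List[int], n_new: int, target: Model) -> List[int]:
    """Baseline: one target call per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(target(out))
    return out


def speculative_decode(
    prompt: List[int], n_new: int, k: int, draft: Model, target: Model
) -> Tuple[List[int], int]:
    """Greedy speculative decoding; returns (tokens, number of verification passes)."""
    out = list(prompt)
    passes = 0
    while len(out) - len(prompt) < n_new:
        # 1. The draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))

        # 2. The target model checks every proposed position.
        #    In a real system this is a single batched forward pass.
        passes += 1
        accepted: List[int] = []
        for i in range(k):
            expected = target(out + proposal[:i])
            if proposal[i] == expected:
                accepted.append(proposal[i])
            else:
                # First mismatch: keep the target's token, drop the rest.
                accepted.append(expected)
                break
        else:
            # All k proposals accepted: the same pass also yields one extra token.
            accepted.append(target(out + proposal))

        out.extend(accepted)
    return out[: len(prompt) + n_new], passes


if __name__ == "__main__":
    prompt = [1, 2, 3]
    tokens, passes = speculative_decode(prompt, 20, 4, draft_model, target_model)
    print("same output as greedy:", tokens == greedy_decode(prompt, 20, target_model))
    print("verification passes:", passes, "vs. 20 sequential target steps")
```

<p>In this greedy variant every emitted token is one the target model would have chosen itself, so the result is identical to decoding with the target alone. The draft model only affects how many verification passes are needed. Each pass emits at least one token, so a poor draft mainly wastes the draft's own compute.</p>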