# Speculative Decoding

> Source: https://sukruyusufkaya.com/en/glossary/speculative-decoding
> Updated: 2026-05-23T23:37:44.472Z
> Type: glossary
> Category: uretken-yapay-zeka-ve-llm

**TLDR:** A decoding approach that speeds up generation by validating proposals from a smaller fast model with a larger model.

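Speculative decoding is an inference technique for reducing LLM generation latency. A smaller draft model proposes several tokens ahead, and the larger model then checks them together as a batch, keeping the proposals it accepts and discarding everything from the first rejection onward. Because one check by the larger model can confirm several tokens at once, a draft model that agrees with the larger model often enough can deliver meaningful speed gains while largely preserving quality. With an exact acceptance rule, the output matches what the larger model would have produced on its own.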
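The propose-then-verify loop can be illustrated with a minimal sketch. It assumes greedy decoding and uses two toy stand-in functions, `target_model` and `draft_model`, which are hypothetical and not part of any library. The acceptance rule shown (keep the longest matching prefix, then take the larger model's token) is the greedy variant; sampling-based variants use a probabilistic acceptance test instead.

```python
from typing import Callable, List, Tuple

# A "model" here is just a function from a token prefix to the next token (greedy).
Model = Callable[[List[int]], int]

def target_model(prefix: List[int]) -> int:
    # Hypothetical stand-in for the large model.
    return (sum(prefix) * 3 + len(prefix)) % 7

def draft_model(prefix: List[int]) -> int:
    # Hypothetical stand-in for the small model: agrees with the target
    # except when the prefix length is a multiple of 4.
    guess = target_model(prefix)
    return (guess + 1) % 7 if len(prefix) % 4 == 0 else guess

def greedy_decode(model: Model, prompt: List[int], n_new: int) -> List[int]:
    # Baseline: one model call per generated token.
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(model(tokens))
    return tokens

def speculative_decode(target: Model, draft: Model, prompt: List[int],
                       n_new: int, k: int = 4) -> Tuple[List[int], int]:
    tokens = list(prompt)
    goal = len(prompt) + n_new
    verify_rounds = 0  # each round would be one batched target pass in a real system
    while len(tokens) < goal:
        # 1) The draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(tokens + proposal))
        # 2) The target checks every proposed position. This toy calls it
        #    position by position; a real system scores all of them at once.
        verify_rounds += 1
        accepted: List[int] = []
        for i in range(k):
            expected = target(tokens + proposal[:i])
            if proposal[i] == expected:
                accepted.append(proposal[i])
            else:
                # First mismatch: keep the target's token, drop the rest.
                accepted.append(expected)
                break
        else:
            # All k proposals accepted: the target contributes one extra token.
            accepted.append(target(tokens + proposal))
        tokens.extend(accepted)
    return tokens[:goal], verify_rounds

if __name__ == "__main__":
    prompt = [1, 2, 3]
    spec, rounds = speculative_decode(target_model, draft_model, prompt, n_new=20)
    print(spec == greedy_decode(target_model, prompt, 20), rounds)
```

In this sketch every appended token is the one the target would have chosen for the current prefix, so the result equals plain greedy decoding with the target, while the number of verification rounds is lower than the number of generated tokens. How much lower depends on how often the draft agrees with the target and on the choice of `k`.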