Decoder-Only Transformer
A modern large-language-model architecture that generates autoregressively by predicting the next token.
The decoder-only Transformer underpins most modern large language models. Trained to predict the next token given the preceding context, it pairs a simple objective with pretraining that scales to vast unlabeled corpora. It has become the dominant architecture for text generation, code generation, and general-purpose language modeling.
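The autoregressive loop described above can be sketched in a few lines. A minimal illustration, with a hypothetical hard-coded bigram table standing in for a real Transformer's next-token prediction:

```python
# Toy autoregressive generation loop. The bigram table below is a
# hypothetical stand-in for a trained model's next-token distribution.
BIGRAM_NEXT = {
    "<s>": "the",
    "the": "model",
    "model": "predicts",
    "predicts": "tokens",
    "tokens": "</s>",
}

def generate(start="<s>", max_tokens=10):
    tokens = [start]
    for _ in range(max_tokens):
        # A real model would condition on the full prefix; this toy
        # "model" looks only at the most recent token.
        nxt = BIGRAM_NEXT.get(tokens[-1])
        if nxt is None or nxt == "</s>":
            break
        # The prediction is appended and becomes part of the context
        # for the next step: this is the autoregressive loop.
        tokens.append(nxt)
    return tokens[1:]  # drop the start-of-sequence marker

print(generate())
```

In a real decoder-only model, the lookup is replaced by a forward pass over the entire prefix, with causal masking ensuring each position attends only to earlier tokens.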