Technical Glossary: Deep Learning
Multi-Head Attention
A mechanism that runs attention in parallel across multiple representation subspaces, allowing the model to learn different types of relationships.
Multi-head attention enables the model to learn multiple relationship patterns simultaneously instead of relying on a single attention map. Some heads may focus on local context, others on long-range dependencies, and still others on different semantic structures. This multiplicity significantly increases the representational power of Transformer models. It has become a standard architectural component in modern language and multimodal systems.
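As a concrete illustration, here is a minimal NumPy sketch of multi-head self-attention: the input is projected into queries, keys, and values, split into heads that each attend within a smaller subspace, and the head outputs are concatenated and mixed by an output projection. All names, shapes, and the random weights are illustrative assumptions, not part of this entry, and the sketch omits masking, batching, and dropout found in production implementations.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Self-attention with num_heads parallel heads (illustrative sketch).

    x: (seq_len, d_model) input sequence
    w_q, w_k, w_v, w_o: (d_model, d_model) projection matrices
    """
    seq_len, d_model = x.shape
    d_head = d_model // num_heads  # each head attends in a smaller subspace

    # Project the input, then split the feature dimension into heads:
    # (seq_len, d_model) -> (num_heads, seq_len, d_head)
    def project_and_split(w):
        return (x @ w).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q = project_and_split(w_q)
    k = project_and_split(w_k)
    v = project_and_split(w_v)

    # Scaled dot-product attention, computed independently per head,
    # so each head can learn a different relationship pattern.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    weights = softmax(scores, axis=-1)
    heads = weights @ v                                   # (heads, seq, d_head)

    # Concatenate the heads and mix them with the output projection.
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ w_o

# Tiny usage example with random weights (illustrative only).
rng = np.random.default_rng(0)
d_model, seq_len, num_heads = 8, 4, 2
x = rng.normal(size=(seq_len, d_model))
ws = [rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(4)]
out = multi_head_attention(x, *ws, num_heads=num_heads)
print(out.shape)  # (4, 8)
```

Because the heads operate on disjoint slices of the projected features, each one can specialize, for example in local versus long-range dependencies, before the output projection recombines their results.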
