Sparse Attention
An attention mechanism that reduces computational cost by letting each token attend only to a selected subset of positions rather than the full sequence.
Sparse attention was developed to reduce the quadratic complexity of standard self-attention: for a sequence of n tokens, letting every token attend to every other token costs O(n²) time and memory. Sparse variants restrict each token to a predefined pattern instead, such as a local sliding window, fixed strides, or a small set of global tokens. This matters in long-document modeling, genomic data, and large-context language models, and it introduces a trade-off between computational efficiency and representational richness.
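As a concrete illustration, here is a minimal NumPy sketch of one common sparse pattern, sliding-window attention, where each token attends only to its local neighbourhood. The function name and shapes are hypothetical, and for clarity the sketch builds the full score matrix and then masks it; a real implementation would compute only the in-window scores to actually realize the O(n · window) cost.

```python
import numpy as np

def sliding_window_attention(Q, K, V, window: int):
    """Toy sliding-window sparse attention (one common sparse pattern).

    Each query position i attends only to key positions j with
    |i - j| <= window, instead of the full sequence, so the effective
    cost is O(n * window) rather than O(n^2).
    """
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)  # (n, n); dense here only for clarity

    # Mask out every position outside the local window.
    idx = np.arange(n)
    mask = np.abs(idx[:, None] - idx[None, :]) > window
    scores[mask] = -np.inf

    # Row-wise softmax over the surviving (local) positions.
    scores -= scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V

# Usage: 8 tokens, 4-dim embeddings, each token sees +/- 2 neighbours.
rng = np.random.default_rng(0)
n, d = 8, 4
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = sliding_window_attention(Q, K, V, window=2)
print(out.shape)  # (8, 4)
```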
