# Sparse Attention

> Source: https://sukruyusufkaya.com/en/glossary/sparse-attention
> Updated: 2026-05-13T20:01:41.930Z
> Type: glossary
> Category: deep-learning

**TLDR:** An attention approach that reduces cost by letting each element attend only to selected positions rather than the full sequence.

<p>Sparse attention was developed to reduce the O(n²) time and memory cost of standard self-attention, where every token in a sequence of length n attends to every other token. In long-context tasks this full pairwise computation becomes prohibitive, so each token instead attends over a restricted pattern, such as a local window, strided positions, or a small set of global tokens. This matters for long-document modeling, genomic data, and large-context language models, and it introduces a trade-off between computational efficiency and representational richness.</p>
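To make the idea concrete, here is a minimal sketch of one common sparse pattern, sliding-window (local) attention, where each position attends only to neighbors within a fixed window. The function name `local_attention` and the `window` parameter are illustrative choices, not from the source; for clarity the sketch still materializes the full score matrix and only masks it, whereas real implementations avoid computing the masked entries at all.

```python
import numpy as np

def local_attention(q, k, v, window=2):
    """Sliding-window sparse attention sketch.

    Each position i attends only to positions j with |i - j| <= window,
    instead of the whole sequence. q, k, v have shape (seq_len, d).
    Illustrative only: the full (n, n) score matrix is built and masked,
    so this shows the pattern but not the memory savings.
    """
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)                          # (n, n) scaled dot-product scores
    idx = np.arange(n)
    band = np.abs(idx[:, None] - idx[None, :]) <= window   # True inside the local window
    scores = np.where(band, scores, -np.inf)               # mask out-of-window pairs
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)         # softmax over allowed positions
    return weights @ v

# Example: with window=1, position 0 mixes only values 0 and 1.
rng = np.random.default_rng(0)
q = rng.normal(size=(6, 4))
k = rng.normal(size=(6, 4))
v = rng.normal(size=(6, 4))
out = local_attention(q, k, v, window=1)
print(out.shape)  # (6, 4)
```

With `window` fixed, each row of the softmax has at most `2 * window + 1` nonzero entries, so a sparsity-aware implementation does O(n·window) work instead of O(n²); the trade-off is that distant tokens can only influence each other indirectly, across multiple layers.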