Linear Attention

An approach that makes attention computation more scalable by reducing its cost from quadratic to (approximately) linear in the sequence length.

Linear attention methods were developed to overcome the quadratic cost of standard attention on very long sequences. Through ideas such as kernel feature maps or reassociated computation, they approximate the effect of full attention without ever materializing the full attention matrix. This has attracted strong interest in long-context domains such as text, video, and genomic sequences, making linear attention one of the central concepts in scalable Transformer research.
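
As a concrete illustration of the kernel idea mentioned above, here is a minimal NumPy sketch: replacing softmax(QKᵀ) with φ(Q)φ(K)ᵀ for a positive feature map φ lets the product be reassociated as φ(Q)(φ(K)ᵀV), so the n × n attention matrix is never formed and the cost drops from O(n²d) to O(nd²). The elu(x)+1 feature map follows Katharopoulos et al. (2020); the function names and shapes here are illustrative, not from any particular library.

```python
import numpy as np

def feature_map(x):
    # Positive feature map phi(x) = elu(x) + 1, one common choice
    # (Katharopoulos et al., 2020); other kernels are possible.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V, eps=1e-6):
    """Linear-time attention: softmax(Q @ K.T) @ V is approximated by
    phi(Q) @ (phi(K).T @ V), normalized row-wise per query."""
    Qp, Kp = feature_map(Q), feature_map(K)   # (n, d) each
    KV = Kp.T @ V                             # (d, d): key/value summary
    Z = Qp @ Kp.sum(axis=0)                   # (n,): per-query normalizer
    return (Qp @ KV) / (Z[:, None] + eps)     # (n, d)

# Usage: cost grows linearly with sequence length n.
n, d = 1024, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = linear_attention(Q, K, V)   # shape (n, d)
```

Because the (d, d) summary φ(K)ᵀV can be accumulated incrementally, the same reassociation also allows recurrent, constant-memory decoding, which is part of why these methods appeal for long-context workloads.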