# FlashAttention: IO-Aware Attention — Dao 2022 Algorithm and Modern Implementations

> Source: https://sukruyusufkaya.com/en/learn/llm-muhendisligi/flashattention-dao-2022-io-aware-attention
> Updated: 2026-05-13T13:00:27.650Z
> Category: LLM Engineering
> Module: Module 8: Attention Mathematics — The Heart of the Transformer

**TLDR:** A mathematical and systems anatomy of FlashAttention: why standard attention is memory-bound, the GPU memory hierarchy (HBM vs. SRAM), tile-based computation, online softmax, and recomputation in the backward pass. Covers the evolution from FlashAttention-1 (Dao 2022) through FlashAttention-2 to FlashAttention-3, the PyTorch flash_attn library, performance benchmarks, and how FlashAttention enables long contexts.
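
As a concrete anchor for the topics listed above, here is a minimal sketch of a fused attention call. It uses PyTorch's built-in `scaled_dot_product_attention` (which can dispatch to a FlashAttention kernel on supported CUDA GPUs with fp16/bf16 inputs) rather than the standalone `flash_attn` package; the shapes and dtypes are illustrative assumptions, not values from this module.

```python
import torch
import torch.nn.functional as F

# Illustrative dimensions (assumed, not from the article).
batch, heads, seq_len, head_dim = 2, 8, 1024, 64
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

q = torch.randn(batch, heads, seq_len, head_dim, device=device, dtype=dtype)
k = torch.randn(batch, heads, seq_len, head_dim, device=device, dtype=dtype)
v = torch.randn(batch, heads, seq_len, head_dim, device=device, dtype=dtype)

# Fused attention: on a supported GPU this avoids materializing the full
# (seq_len x seq_len) score matrix in HBM; on CPU it falls back to the math path.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([2, 8, 1024, 64])
```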

