# Custom GPU Kernels with Triton: Softmax, Matmul, FlashAttention Mini from Scratch

> Source: https://sukruyusufkaya.com/en/learn/llm-muhendisligi/triton-custom-gpu-kernels-flashattention
> Updated: 2026-05-13T13:00:25.618Z
> Category: LLM Engineering
> Module: Module 5: PyTorch Engineering — Engineer-Grade
**TLDR:** Triton's appeal is GPU programming with Python-like syntax. This section covers the programming model (program_id, block sizes, autotune), a softmax kernel written from scratch, tiled matmul, a mini block-wise FlashAttention implementation, and performance tuning. It lays the practical foundation for Module 37 (CUDA/Triton deep dive).
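The block-wise FlashAttention idea mentioned above rests on the "online" softmax recurrence: a softmax can be computed one block at a time while carrying only a running maximum and a rescaled running sum, so no full row ever needs to sit in fast memory at once. As a plain-Python illustration of that recurrence (a CPU sketch only, not the article's Triton kernel; `online_softmax` and `block_size` are names chosen here for the example):

```python
import math

def online_softmax(xs, block_size=4):
    """Numerically stable softmax computed block by block.

    Keeps only a running max (m) and a running sum (s) of
    exp(x - m); when a block raises the max, the old sum is
    rescaled by exp(m_old - m_new). This is the same recurrence
    FlashAttention applies per tile of the attention scores.
    """
    m = float("-inf")  # running max seen so far
    s = 0.0            # running sum of exp(x - m)
    for start in range(0, len(xs), block_size):
        block = xs[start:start + block_size]
        m_new = max(m, max(block))
        # rescale the previous sum to the new max, then fold in the block
        s = s * math.exp(m - m_new) + sum(math.exp(x - m_new) for x in block)
        m = m_new
    # second pass: normalize every element with the final statistics
    return [math.exp(x - m) / s for x in xs]
```

Because `m` and `s` are updated incrementally, each block can live in shared memory or registers while the rest of the row stays in HBM, which is exactly the memory saving the block-wise implementation exploits.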

