# Gradient Noise Scale

> Source: https://sukruyusufkaya.com/en/glossary/gradient-noise-scale
> Updated: 2026-05-13T20:57:49.240Z
> Type: glossary
> Category: derin-ogrenme

**TLDR:** A training-dynamics measure that characterizes how noisy gradient estimates are in stochastic optimization.

<p>The gradient noise scale characterizes how noisy a mini-batch gradient estimate is relative to the underlying gradient signal, and therefore how stable the update is during mini-batch learning. Very small batches produce noisier gradient estimates, which sometimes generalize better, while large batches yield estimates closer to the full-batch gradient and behave more deterministically. The measure offers a principled way to study the relationship between batch size and optimization efficiency, which is why it has become increasingly important in large-scale training systems.</p>
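
To make the idea concrete, here is a minimal sketch of one common definition, which this entry does not itself spell out: the "simple" noise scale, the trace of the per-example gradient covariance divided by the squared norm of the mean gradient. The helper name `simple_noise_scale`, the toy linear-regression setup, and all constants are assumptions chosen for illustration, not part of the original glossary entry.

```python
import numpy as np


def simple_noise_scale(per_example_grads):
    """Estimate tr(Sigma) / |G|^2 from an (n_examples, n_params) array.

    Assumption: the sample mean stands in for the true gradient G, which
    slightly overestimates |G|^2 when the number of examples is small.
    """
    grads = np.asarray(per_example_grads, dtype=float)
    g_mean = grads.mean(axis=0)
    # The trace of the covariance is the sum of per-coordinate variances.
    trace_cov = grads.var(axis=0, ddof=1).sum()
    return trace_cov / np.dot(g_mean, g_mean)


# Toy problem (hypothetical): linear regression with squared loss.
rng = np.random.default_rng(0)
n, d = 4096, 8
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.5 * rng.normal(size=n)

w = np.zeros(d)                  # current parameters
residual = X @ w - y             # per-example prediction error
grads = residual[:, None] * X    # per-example gradient of 0.5 * residual**2

b_simple = simple_noise_scale(grads)
print(f"estimated simple noise scale: {b_simple:.2f}")
```

Read loosely, a batch size well below this value gives updates dominated by noise, while a batch size well above it gives nearly deterministic updates with diminishing returns per example. Treat the number as a rough guide under the stated assumptions, since it changes as the parameters move during training.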