Layer Normalization
A technique that normalizes activations within each individual sample, yielding more stable training, especially in sequence models.
Layer normalization computes the mean and variance over the features of each individual sample, rather than across the batch. Because its statistics do not depend on other samples, it is better suited than batch normalization to RNNs, Transformers, and small-batch training. It improves training stability, helps keep gradients well behaved in deep architectures, and has become one of the core building blocks of modern Transformer design.
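As a minimal sketch of the idea, the snippet below normalizes each sample over its feature dimension and applies a learned scale and shift. The function name `layer_norm`, the epsilon value, and the NumPy-based implementation are illustrative assumptions, not part of the original text.

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize each sample over its feature (last) axis, then scale and shift."""
    mean = x.mean(axis=-1, keepdims=True)     # per-sample mean, not per-batch
    var = x.var(axis=-1, keepdims=True)       # per-sample variance
    x_hat = (x - mean) / np.sqrt(var + eps)   # normalize within the sample
    return gamma * x_hat + beta               # learned scale and shift

# Illustrative example: a batch of 2 samples with 4 features each
x = np.array([[1.0, 2.0, 3.0, 4.0],
              [10.0, 20.0, 30.0, 40.0]])
gamma = np.ones(4)   # scale parameter (learned during training)
beta = np.zeros(4)   # shift parameter (learned during training)
print(layer_norm(x, gamma, beta))
```

Note that both rows normalize to the same values here: each sample is standardized using only its own statistics, which is why the result is independent of batch size and batch composition.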
