# Post-Norm Transformer

> Source: https://sukruyusufkaya.com/en/glossary/post-norm-transformer
> Updated: 2026-05-13T21:00:58.243Z
> Type: glossary
> Category: derin-ogrenme

**TLDR:** The classical Transformer variant that applies layer normalization after the residual addition around each attention or FFN sublayer.

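<p>The post-norm Transformer follows the arrangement used in the original Transformer design: each sublayer (self-attention or feed-forward network) is wrapped in a residual connection, and layer normalization is applied to the sum. It works well in some tasks, but training can become unstable in very deep models, which is why post-norm models often depend on learning-rate warmup and careful initialization. Pre-norm designs normalize the sublayer input instead and leave the residual path unnormalized, which is usually easier to optimize at depth; their rise made the practical consequences of this placement choice more visible. Still, comparing the two remains important for understanding architectural behavior.</p>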
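
The difference in normalization placement can be illustrated with a minimal pure-Python sketch. It is not a full Transformer layer: `toy_sublayer` is a hypothetical stand-in for attention or an FFN, and `layer_norm` omits the learned scale and shift that real implementations normally include. The function names are assumptions made for this example only.

```python
import math
from typing import Callable, List

Vector = List[float]


def layer_norm(x: Vector, eps: float = 1e-5) -> Vector:
    """Normalize a vector to zero mean and (nearly) unit variance.

    Simplification: no learned scale/shift parameters.
    """
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def post_norm_step(x: Vector, sublayer: Callable[[Vector], Vector]) -> Vector:
    """Post-norm: y = LayerNorm(x + Sublayer(x))."""
    residual_sum = [a + b for a, b in zip(x, sublayer(x))]
    return layer_norm(residual_sum)


def pre_norm_step(x: Vector, sublayer: Callable[[Vector], Vector]) -> Vector:
    """Pre-norm: y = x + Sublayer(LayerNorm(x))."""
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]


def toy_sublayer(x: Vector) -> Vector:
    """Hypothetical stand-in for attention or an FFN: halves each element."""
    return [0.5 * v for v in x]


x = [1.0, 2.0, 3.0, 4.0]
post_out = post_norm_step(x, toy_sublayer)  # output is normalized
pre_out = pre_norm_step(x, toy_sublayer)    # input x passes through unnormalized
```

In the post-norm step, every output passes through normalization, so the residual stream itself is rescaled at each sublayer. In the pre-norm step, `x` is added back untouched, which leaves an identity path from input to output; this is the structural difference usually associated with easier optimization in deep stacks.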