# Transformer Feed-Forward Network

> Source: https://sukruyusufkaya.com/en/glossary/transformer-feed-forward-network
> Updated: 2026-05-13T20:04:27.738Z
> Type: glossary
> Category: derin-ogrenme
**TLDR:** A Transformer sub-block that applies the same nonlinear transformation to each token independently, refining the representation produced by attention.

<p>The feed-forward network (FFN) inside a Transformer block applies a token-wise nonlinear transformation that attention alone does not supply. It typically consists of two linear layers with an activation function between them: the first expands each token’s vector to a wider hidden dimension, and the second projects it back to the model dimension. Although it processes each token independently, using the same weights at every position, it accounts for a major portion of the model’s overall capacity. In large language models, a substantial share of parameters resides in this sub-block.</p>
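
The description above can be illustrated with a minimal NumPy sketch. Everything named here is an assumption made for illustration, not something the glossary entry specifies: the toy sizes (`d_model = 8`, `d_ff = 32`), the GELU activation in its tanh approximation, the bias terms, and the helper names `gelu` and `feed_forward`. The attention parameter count assumes four square projections (query, key, value, output) with biases.

```python
import numpy as np


def gelu(x):
    # Tanh approximation of GELU; the choice of activation is an assumption.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def feed_forward(x, w1, b1, w2, b2):
    """Apply the two-layer FFN to x of shape (seq_len, d_model).

    Each row (token) is transformed with the same weights and never
    interacts with any other row.
    """
    hidden = gelu(x @ w1 + b1)   # expand: (seq_len, d_ff)
    return hidden @ w2 + b2      # project back: (seq_len, d_model)


# Hypothetical toy sizes; d_ff = 4 * d_model is a common but not universal choice.
d_model, d_ff, seq_len = 8, 32, 5

rng = np.random.default_rng(0)
w1 = rng.normal(scale=0.1, size=(d_model, d_ff))
b1 = np.zeros(d_ff)
w2 = rng.normal(scale=0.1, size=(d_ff, d_model))
b2 = np.zeros(d_model)

x = rng.normal(size=(seq_len, d_model))
out = feed_forward(x, w1, b1, w2, b2)

# Token independence: running one token on its own gives the same result
# as that token's row in the full-sequence output.
single = feed_forward(x[2:3], w1, b1, w2, b2)
print(np.allclose(out[2], single[0]))

# Rough per-block parameter comparison under the assumptions stated above.
ffn_params = w1.size + b1.size + w2.size + b2.size
attn_params = 4 * (d_model * d_model + d_model)
print(ffn_params, attn_params, ffn_params / (ffn_params + attn_params))
```

With `d_ff = 4 * d_model`, the two FFN weight matrices hold about twice as many parameters as the four attention projections, which is one way to see why this sub-block carries so much of a model's parameter budget. The exact ratio depends on the architecture.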