# GELU Activation

> Source: https://sukruyusufkaya.com/en/glossary/gelu-activation
> Updated: 2026-05-13T20:58:37.599Z
> Type: glossary
> Category: deep-learning

**TLDR:** A modern activation function that smoothly gates inputs by the Gaussian CDF rather than cutting them off at a hard threshold.

GELU (Gaussian Error Linear Unit) is a modern activation function that has become especially common in Transformer-based models. Rather than applying a hard threshold like ReLU, it gates each input by the probability that a standard Gaussian random variable falls below it: GELU(x) = x · Φ(x), where Φ is the standard normal CDF. This smooth weighting can lead to more stable learning behavior in some architectures. GELU is frequently found in large language models and other attention-based systems. Although slightly more expensive to compute than ReLU, it is often preferred for its empirical performance benefits.
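
To make the definition concrete, here is a minimal NumPy sketch of both the exact erf-based form and the tanh approximation from the original GELU paper (Hendrycks & Gimpel). The function names are illustrative, not taken from any particular framework.

```python
import numpy as np
from scipy.special import erf  # error function, used to build the Gaussian CDF

def gelu_exact(x: np.ndarray) -> np.ndarray:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return x * 0.5 * (1.0 + erf(x / np.sqrt(2.0)))

def gelu_tanh(x: np.ndarray) -> np.ndarray:
    """Common tanh approximation, cheaper than evaluating erf directly."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

x = np.linspace(-3.0, 3.0, 7)
print(gelu_exact(x))  # large negatives suppressed toward 0, positives pass nearly unchanged
print(gelu_tanh(x))   # closely tracks the exact form across this range
```

Running this shows the "probabilistic smoothness" in practice: unlike ReLU's hard zero cutoff, small negative inputs still contribute slightly, while strongly positive inputs pass through almost as the identity.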