GELU Activation
A smooth activation function that weights each input by the probability that a standard Gaussian variable falls below it, rather than applying ReLU's hard cutoff at zero.
GELU (Gaussian Error Linear Unit) is defined as GELU(x) = x · Φ(x), where Φ is the cumulative distribution function of the standard normal distribution. Rather than zeroing out all negative inputs the way ReLU does, GELU scales each input by the probability of it being "kept" under a Gaussian, producing a smooth curve near zero. This smoothness can lead to more stable training behavior in some architectures, and GELU is the default activation in many Transformer-based models, including BERT and the GPT family. It is slightly more expensive to compute than ReLU, so implementations often substitute a fast tanh-based approximation, but it is frequently preferred for its accuracy benefits.
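As a minimal sketch, the exact definition GELU(x) = x · Φ(x) and the widely used tanh approximation can both be written in a few lines of plain Python (function names here are illustrative, not from any particular library):

```python
import math

def gelu(x: float) -> float:
    # Exact GELU: x * Phi(x), where Phi is the standard normal CDF,
    # computed here via the error function: Phi(x) = 0.5 * (1 + erf(x / sqrt(2))).
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: float) -> float:
    # Common tanh-based approximation used in many Transformer implementations.
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    ))

# Example: GELU passes large positive inputs through almost unchanged,
# maps 0 to 0, and shrinks (rather than zeroes) negative inputs.
print(gelu(2.0), gelu(0.0), gelu(-1.0))
```

Note how gelu(-1.0) is a small negative number rather than 0: unlike ReLU, GELU lets a little gradient flow through negative inputs, which is part of why it behaves more smoothly during training.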
