Sigmoid Activation

A classical activation function that squashes any real-valued input into the open interval (0, 1).

The sigmoid activation, defined as σ(x) = 1 / (1 + e^(−x)), has historically been used heavily, especially in output layers where a binary probability interpretation is needed. Because it maps inputs smoothly into (0, 1), its output can be read directly as a probability-like value, for example the predicted probability of the positive class in binary classification. However, its derivative σ'(x) = σ(x)(1 − σ(x)) peaks at only 0.25 at x = 0 and approaches zero for large |x|; when these small factors are multiplied across many layers during backpropagation, they contribute to the vanishing-gradient problem in deep networks. For that reason, it is now used mainly in output layers and in specialized probabilistic settings such as gating mechanisms.
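A minimal NumPy sketch illustrating both behaviors described above (the function and helper names are illustrative, not from any particular library; the two-branch form is a common trick to avoid overflow in exp for large |x|):

```python
import numpy as np

def sigmoid(x):
    # Numerically stable sigmoid: use 1 / (1 + exp(-x)) for x >= 0
    # and the equivalent exp(x) / (1 + exp(x)) for x < 0, so exp()
    # is never called on a large positive argument.
    out = np.empty_like(x, dtype=float)
    pos = x >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    exp_x = np.exp(x[~pos])
    out[~pos] = exp_x / (1.0 + exp_x)
    return out

def sigmoid_grad(x):
    # Derivative sigma'(x) = sigma(x) * (1 - sigma(x)),
    # maximal at 0.25 for x = 0 and near zero in the saturated tails.
    s = sigmoid(x)
    return s * (1.0 - s)

xs = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
print(sigmoid(xs))       # outputs squashed into (0, 1)
print(sigmoid_grad(xs))  # gradients collapse toward 0 at both ends
```

Evaluating the gradient at x = ±10 gives values around 4.5e-5, which makes the saturation effect behind the vanishing-gradient problem concrete.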