Skip to content

Chain Rule

A rule that computes the derivative of a composite function through the derivatives of its inner and outer functions.

The chain rule lies at the mathematical heart of deep learning. When a function is built as a composition of other functions, the chain rule is used to calculate how changes propagate through the entire structure. Since each layer in a neural network depends on the output of the previous one, backpropagation relies directly on this rule. For that reason, the chain rule is not just a calculus concept; it is one of the key mechanisms that makes training large models possible. To understand deep networks, one must understand the chain rule.