Learn Before
Formula
Gradient of Objective Function with Respect to Output Layer Variable
To compute the gradient of the objective function with respect to the output layer variable , the chain rule is applied through the loss term . Because the gradient of with respect to is , the formula simplifies directly to the partial derivative of the loss with respect to the output:
0
1
Updated 2026-05-06
Tags
D2L
Dive into Deep Learning @ D2L