Formula

Gradient of Objective Function with Respect to Output Layer Variable

To compute the gradient of the objective function $J$ with respect to the output-layer variable $\mathbf{o} \in \mathbb{R}^q$, the chain rule is applied through the loss term $L$. Because the gradient of $J$ with respect to $L$ is $1$, the formula simplifies directly to the partial derivative of the loss with respect to the output:

$$\frac{\partial J}{\partial \mathbf{o}} = \textrm{prod}\left(\frac{\partial J}{\partial L}, \frac{\partial L}{\partial \mathbf{o}}\right) = \frac{\partial L}{\partial \mathbf{o}}$$
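A minimal sketch of this identity, assuming a squared-error loss $L(\mathbf{o}, \mathbf{y}) = \tfrac{1}{2}\|\mathbf{o} - \mathbf{y}\|^2$ and an objective $J = L + s$ where the regularization term $s$ does not depend on $\mathbf{o}$ (both choices are illustrative, not taken from the source). Since $s$ is constant in $\mathbf{o}$ and $\partial J / \partial L = 1$, the analytic gradient $\partial J / \partial \mathbf{o} = \partial L / \partial \mathbf{o} = \mathbf{o} - \mathbf{y}$, which a finite-difference check on $J$ confirms:

```python
import numpy as np

def loss(o, y):
    """Squared-error loss L = 0.5 * ||o - y||^2 (assumed for illustration)."""
    return 0.5 * np.sum((o - y) ** 2)

def objective(o, y, w, lam=0.01):
    """J = L + s, with an L2 weight penalty s that does not depend on o."""
    return loss(o, y) + 0.5 * lam * np.sum(w ** 2)

def grad_J_wrt_o(o, y):
    # dJ/do = (dJ/dL) * (dL/do) = 1 * (o - y) for squared-error loss.
    return o - y

rng = np.random.default_rng(0)
o, y = rng.normal(size=4), rng.normal(size=4)
w = rng.normal(size=3)

# Central-difference approximation of dJ/do, component by component.
eps = 1e-6
numeric = np.array([
    (objective(o + eps * e, y, w) - objective(o - eps * e, y, w)) / (2 * eps)
    for e in np.eye(4)
])
print(np.allclose(numeric, grad_J_wrt_o(o, y), atol=1e-5))
```

The numerical gradient of the full objective $J$ matches the analytic gradient of $L$ alone, illustrating that the regularization term drops out of $\partial J / \partial \mathbf{o}$.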

Updated 2026-05-06

Dive into Deep Learning @ D2L