Formula

Gradient of Objective Function with Respect to Hidden Layer Output

To continue backpropagation toward the input layer, we calculate the gradient of the objective function $J$ with respect to the hidden layer output vector $\mathbf{h} \in \mathbb{R}^h$. Applying the chain rule through the output layer variable $\mathbf{o}$ yields:

$$\frac{\partial J}{\partial \mathbf{h}} = \textrm{prod}\left(\frac{\partial J}{\partial \mathbf{o}}, \frac{\partial \mathbf{o}}{\partial \mathbf{h}}\right) = {\mathbf{W}^{(2)}}^\top \frac{\partial J}{\partial \mathbf{o}}$$

Since $\mathbf{o} = \mathbf{W}^{(2)} \mathbf{h}$, the Jacobian $\partial \mathbf{o} / \partial \mathbf{h}$ is simply $\mathbf{W}^{(2)}$, so this operation propagates the error gradient one layer backward by multiplying it by the transpose of the output layer's weight matrix.
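To make the shape bookkeeping and the role of the transpose concrete, here is a minimal NumPy sketch of this step; the dimensions, variable names, and random values are illustrative assumptions, not from the source.

```python
import numpy as np

# Illustrative sizes: h = 4 hidden units, q = 3 outputs (assumed, not from the source).
h, q = 4, 3

W2 = np.random.randn(q, h)   # output-layer weights, so o = W2 @ h_vec
dJ_do = np.random.randn(q)   # upstream gradient dJ/do, assumed already computed

# Since o = W2 @ h_vec, the Jacobian do/dh is W2 itself; the chain rule
# therefore gives dJ/dh = W2.T @ dJ/do, mapping the gradient back one layer.
dJ_dh = W2.T @ dJ_do

print(dJ_dh.shape)  # (4,) -- matches the hidden layer's dimensionality
```

Note how the transpose converts a length-$q$ gradient at the output into a length-$h$ gradient at the hidden layer, which is exactly what the next backward step needs.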
