Formula
Gradient of RNN Objective with Respect to Output Weights
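For a recurrent neural network, the gradient of the objective function $L$ with respect to the output layer weight parameter $\mathbf{W}_{qh}$ is calculated by summing the gradients across all time steps $t = 1, \ldots, T$. Because the objective function depends on $\mathbf{W}_{qh}$ via the sequence of outputs $\mathbf{o}_1, \ldots, \mathbf{o}_T$, we apply the chain rule to obtain:
$$\frac{\partial L}{\partial \mathbf{W}_{qh}} = \sum_{t=1}^{T} \frac{\partial L}{\partial \mathbf{o}_t} \mathbf{h}_t^\top$$
where $\mathbf{h}_t^\top$ is the transpose of the hidden state at time step $t$, and $\partial L / \partial \mathbf{o}_t$ is the gradient of the objective with respect to the model output at that time step.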
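The sum of per-step outer products can be checked numerically. The sketch below is illustrative only and rests on several assumptions not stated in the text: a bias-free linear output layer (each output is the output weight matrix times the hidden state), a squared-error loss per time step averaged over the sequence, and hidden states treated as fixed inputs. All function names and the toy values are hypothetical.

```python
def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def loss(W, hs, ys):
    """Assumed objective: mean over time steps of 0.5 * ||o_t - y_t||^2."""
    total = 0.0
    for h, y in zip(hs, ys):
        o = matvec(W, h)
        total += 0.5 * sum((oi - yi) ** 2 for oi, yi in zip(o, y))
    return total / len(hs)


def grad_output_weights(W, hs, ys):
    """Sum over time steps of the outer product (dL/do_t) h_t^T."""
    T = len(hs)
    q, n = len(W), len(W[0])
    grad = [[0.0] * n for _ in range(q)]
    for h, y in zip(hs, ys):
        o = matvec(W, h)
        # Gradient of the assumed objective with respect to the output o_t.
        dL_do = [(oi - yi) / T for oi, yi in zip(o, y)]
        for i in range(q):
            for j in range(n):
                grad[i][j] += dL_do[i] * h[j]
    return grad


def numerical_grad(W, hs, ys, eps=1e-6):
    """Central finite differences, used only as an independent check."""
    q, n = len(W), len(W[0])
    grad = [[0.0] * n for _ in range(q)]
    for i in range(q):
        for j in range(n):
            W_plus = [row[:] for row in W]
            W_minus = [row[:] for row in W]
            W_plus[i][j] += eps
            W_minus[i][j] -= eps
            grad[i][j] = (loss(W_plus, hs, ys) - loss(W_minus, hs, ys)) / (2 * eps)
    return grad


# Toy data: 3 time steps, hidden size 3, output size 2 (arbitrary values).
W = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
hs = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5], [-0.3, 0.2, 0.9]]
ys = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]

analytic = grad_output_weights(W, hs, ys)
numeric = numerical_grad(W, hs, ys)
max_diff = max(
    abs(a - b)
    for row_a, row_n in zip(analytic, numeric)
    for a, b in zip(row_a, row_n)
)
print("max abs difference:", max_diff)
```

The resulting gradient has the same shape as the output weight matrix, and under these assumptions the analytic and finite-difference values should agree up to floating-point error.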
Updated 2026-05-14
Tags
Data Science
D2L
Dive into Deep Learning @ D2L