Learn Before
Concept
Gradients of Tied Parameters
When parameters are tied across multiple layers in a neural network, they share the exact same underlying tensor. During backpropagation, the gradients computed for each specific instance of the shared layer are summed together. Because the single parameter tensor must account for the error propagated through all the distinct layers where it was applied during the forward pass, these individual layer gradients are added to correctly determine the final gradient for the tied parameters.
0
1
Updated 2026-05-08
Tags
D2L
Dive into Deep Learning @ D2L