Learn Before
Concept

Gradients of Tied Parameters

When parameters are tied across multiple layers in a neural network, they share the exact same underlying tensor. During backpropagation, the gradients computed for each specific instance of the shared layer are summed together. Because the single parameter tensor must account for the error propagated through all the distinct layers where it was applied during the forward pass, these individual layer gradients are added to correctly determine the final gradient for the tied parameters.

0

1

Updated 2026-05-08

Contributors are:

Who are from:

Tags

D2L

Dive into Deep Learning @ D2L