Concept

Initial State of Parameter Gradients Before Backpropagation

In deep learning frameworks, a parameter object exposes its gradient alongside its underlying numerical value. The gradient is computed only when backpropagation is invoked, so reading it beforehand returns the gradient's initial state. How that state is represented depends on the framework: in PyTorch, the .grad attribute is None, while in MXNet, calling .grad() returns an array of zeros.
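The PyTorch case can be checked with a minimal sketch like the one below. The network shape, the variable names (net, weight, grad_before, grad_after), and the choice of a summed output as the loss are assumptions made for illustration only; they are not taken from the original text.

```python
import torch
from torch import nn

# Hypothetical small network, used only to obtain a parameter to inspect.
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
weight = net[2].weight  # weight of the last linear layer, shape (1, 8)

# No backward pass has run yet, so PyTorch has not allocated a gradient.
grad_before = weight.grad
print(grad_before is None)  # True

# Run a forward pass on random input and backpropagate a scalar.
X = torch.rand(2, 4)
loss = net(X).sum()
loss.backward()

# After backpropagation, the gradient is a tensor shaped like the parameter.
grad_after = weight.grad
print(grad_after.shape)  # torch.Size([1, 8])
```

In practice this means PyTorch code that reads .grad before the first backward pass should be prepared to handle None rather than a tensor.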

Updated 2026-05-08

Tags

D2L

Dive into Deep Learning @ D2L