Concept

Vector-Jacobian Product via the Gradient Argument

Deep learning frameworks differ in how they handle gradients of non-scalar tensors. In PyTorch, invoking automatic differentiation on a non-scalar output raises an error unless a reduction vector $\mathbf{v}$ is provided. This vector is passed through the `gradient` argument, instructing the framework to compute the vector-Jacobian product $\mathbf{v}^\top \partial_{\mathbf{x}} \mathbf{y}$ rather than the full Jacobian matrix $\partial_{\mathbf{x}} \mathbf{y}$.
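As a minimal sketch of this behavior, the snippet below (an illustrative example, not from the original post) calls `backward()` on a non-scalar output with an all-ones vector passed as the `gradient` argument:

```python
import torch

x = torch.arange(4.0, requires_grad=True)
y = x * x  # non-scalar output: calling y.backward() with no argument raises
           # "RuntimeError: grad can be implicitly created only for scalar outputs"

v = torch.ones_like(y)   # the reduction vector v
y.backward(gradient=v)   # computes v^T (dy/dx), not the full Jacobian

# For elementwise y = x^2 the Jacobian is diag(2x), so with v = 1
# the vector-Jacobian product reduces to 2x.
print(x.grad)  # tensor([0., 2., 4., 6.])
```

Choosing `v = torch.ones_like(y)` makes the result equal to the gradient of `y.sum()`, which is the most common use of the `gradient` argument in practice.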

Updated 2026-05-02


Dive into Deep Learning @ D2L