Formula
Gaussian Attention Kernel
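The Gaussian kernel for attention pooling is defined by the formula $\alpha(\mathbf{q}, \mathbf{k}) = \exp\left(-\frac{1}{2} \|\mathbf{q} - \mathbf{k}\|^2\right)$, where $\mathbf{q}$ is the query and $\mathbf{k}$ is a key. It is translation and rotation invariant because it depends only on the distance between the query and the key. Written as a function of the difference $\mathbf{q} - \mathbf{k}$, it assigns weights that decay smoothly with distance from the origin, so observations whose keys lie closer to the query receive larger weights.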
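To make the definition concrete, here is a minimal NumPy sketch. It assumes the kernel takes the common form exp(-||q - k||^2 / 2) and that the raw kernel values are normalized to sum to 1 before averaging the values, as in Nadaraya-Watson style attention pooling. The function names `gaussian_kernel` and `attention_pool` and the toy data are illustrative choices, not part of the original entry.

```python
import numpy as np

def gaussian_kernel(q, k):
    """Assumed form: exp(-||q - k||^2 / 2), with vectors along the last axis."""
    diff = np.asarray(q, dtype=float) - np.asarray(k, dtype=float)
    return np.exp(-0.5 * np.sum(diff ** 2, axis=-1))

def attention_pool(query, keys, values):
    """Weighted average of values; weights are normalized kernel scores."""
    weights = gaussian_kernel(query, keys)   # one score per key, shape (n,)
    weights = weights / weights.sum()        # normalize so the weights sum to 1
    return weights @ values, weights

# Toy data (hypothetical): three 1-D keys with scalar values.
keys = np.array([[0.0], [1.0], [2.0]])
values = np.array([10.0, 20.0, 30.0])
query = np.array([1.0])

pooled, weights = attention_pool(query, keys, values)
print(weights)  # the key at distance 0 gets the largest weight
print(pooled)
```

Because the kernel depends only on the difference between query and key, shifting both by the same offset leaves the weights unchanged, which is the translation invariance mentioned above.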
Updated 2026-05-14
Tags
D2L
Dive into Deep Learning @ D2L