Visualization Tensor Shape for Attention Weights

To visualize attention weights for several sets of queries and keys at once, the weights are arranged in a four-dimensional tensor of shape (number of rows for display, number of columns for display, number of queries, number of keys). The first two axes select a panel in the figure grid. The last two axes hold that panel's attention matrix, with one row per query and one column per key. Each panel is drawn as a heatmap, so several query-key weight matrices (for example, one per attention head) can be compared side by side in a single figure.
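The shape convention can be illustrated with a minimal sketch, assuming NumPy and Matplotlib are available. The helper name `plot_attention_grid` and its parameters are hypothetical, chosen for illustration only, and the input weights are synthetic.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def plot_attention_grid(weights, xlabel="Keys", ylabel="Queries", cmap="Reds"):
    """Hypothetical helper: draw one heatmap per (row, column) panel.

    `weights` must have shape
    (display rows, display columns, number of queries, number of keys).
    """
    weights = np.asarray(weights)
    if weights.ndim != 4:
        raise ValueError("expected a 4-D tensor of attention weights")
    num_rows, num_cols, _, _ = weights.shape

    # squeeze=False keeps `axes` two-dimensional even for a 1 x 1 grid.
    fig, axes = plt.subplots(num_rows, num_cols, sharex=True, sharey=True,
                             squeeze=False)
    for i in range(num_rows):
        for j in range(num_cols):
            # weights[i, j] is a (queries, keys) matrix: rows are queries.
            image = axes[i, j].imshow(weights[i, j], cmap=cmap)
            if i == num_rows - 1:
                axes[i, j].set_xlabel(xlabel)
            if j == 0:
                axes[i, j].set_ylabel(ylabel)
    fig.colorbar(image, ax=axes, shrink=0.6)
    return fig, axes


# A single panel: 10 queries that each attend only to the matching key.
identity_weights = np.eye(10).reshape((1, 1, 10, 10))
fig_single, axes_single = plot_attention_grid(identity_weights)

# A 2 x 3 grid of panels, each with 4 queries and 6 keys (synthetic data).
rng = np.random.default_rng(0)
scores = rng.normal(size=(2, 3, 4, 6))
grid_weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
fig_grid, axes_grid = plot_attention_grid(grid_weights)
```

A single attention matrix still needs the two leading display axes, which is why the identity example is reshaped to (1, 1, 10, 10) before plotting.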

Updated 2026-05-14

Tags

D2L

Dive into Deep Learning @ D2L