Example
Identity Matrix as Attention Weights Visualization
As a basic sanity check for an attention visualization, an identity matrix can stand in for the attention weights. In this scenario, the attention weight is exactly one only when the query and the key share the same index, and zero otherwise, which represents a perfect one-to-one focus. Because the expected heatmap is a single bright diagonal, this idealized case verifies that the visualization maps each query-key pair to the correct position in the heatmap layout.
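The check can be sketched in a few lines of plain Python. This is a minimal illustration, not the plotting code of any particular library: the helper names `identity_attention` and `render_text_heatmap` are hypothetical, and a text grid stands in for a graphical heatmap, with rows assumed to be queries and columns assumed to be keys.

```python
def identity_attention(n):
    """Return an n x n identity matrix as nested lists (rows = queries, columns = keys)."""
    return [[1.0 if q == k else 0.0 for k in range(n)] for q in range(n)]


def render_text_heatmap(weights, on="#", off="."):
    """Render a weight matrix as text: one line per query, one character per key."""
    return "\n".join(
        "".join(on if w > 0.5 else off for w in row) for row in weights
    )


weights = identity_attention(4)

# Each row is a valid attention distribution: its weights sum to 1.
assert all(abs(sum(row) - 1.0) < 1e-9 for row in weights)

heatmap = render_text_heatmap(weights)
print(heatmap)
# Prints:
# #...
# .#..
# ..#.
# ...#
```

If the marks do not fall on the diagonal from top-left to bottom-right, the rendering has likely swapped or flipped an axis. The same matrix could be passed to a graphical heatmap routine (for example matplotlib's `imshow`) to apply the same check to a real plot.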
Updated 2026-05-14
Tags
D2L
Dive into Deep Learning @ D2L