Learn Before
Attention Weight Matrix (α)
The attention weight matrix, denoted as α, contains the weights that determine the importance of each value vector for a given query. This matrix is derived from the query (Q) and key (K) matrices and has dimensions m × m, where m is the number of items in the input sequence.
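The definition above can be sketched in NumPy. This is a minimal illustration, assuming the standard scaled dot-product formulation α = softmax(QKᵀ/√d_k); the function name is illustrative, not part of the course material:

```python
import numpy as np

def attention_weights(Q, K):
    """Compute the attention weight matrix alpha = softmax(Q K^T / sqrt(d_k)).

    Q, K: (m, d_k) query and key matrices for a sequence of m items.
    Returns an (m, m) matrix whose row i holds the weights that query i
    assigns to every item in the sequence (each row sums to 1).
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (m, m) raw compatibility scores
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    exp = np.exp(scores)
    return exp / exp.sum(axis=-1, keepdims=True)

m, d_k = 5, 8
rng = np.random.default_rng(0)
Q = rng.normal(size=(m, d_k))
K = rng.normal(size=(m, d_k))
alpha = attention_weights(Q, K)
assert alpha.shape == (m, m)                  # one row of weights per query
assert np.allclose(alpha.sum(axis=-1), 1.0)   # softmax normalizes each row
```

The two assertions confirm the properties stated above: the matrix is m × m, and each row is a valid weight distribution over the sequence.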
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Attention Weight Matrix (α)
Sparse Attention
Self-attention layers' first approach
In a general attention mechanism, the output is calculated as a weighted sum of the Value vectors, where the weights are determined by the interaction between Query and Key vectors. The standard formula is: Attention(Q, K, V) = softmax(QKᵀ/√d_k)·V. Consider a scenario where this formula is mistakenly altered. What is the most significant consequence of this modification?
Dimensional Analysis of the Attention Formula
Applying the Attention Mechanism Roles
Self-Attention Output Formula for a Single Query
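The general attention formula referenced in the questions above can be illustrated with a short NumPy sketch, assuming the standard scaled dot-product form output = softmax(QKᵀ/√d_k)·V; names and dimensions here are illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable row-wise softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Output = softmax(Q K^T / sqrt(d_k)) V.
    # Each output row is a weighted sum of the rows of V, with weights
    # determined by the query-key interaction.
    d_k = Q.shape[-1]
    alpha = softmax(Q @ K.T / np.sqrt(d_k))   # (m, m) attention weights
    return alpha @ V                          # (m, d_v) weighted value sums

m, d_k, d_v = 4, 8, 6
rng = np.random.default_rng(1)
Q = rng.normal(size=(m, d_k))
K = rng.normal(size=(m, d_k))
V = rng.normal(size=(m, d_v))
out = attention(Q, K, V)
assert out.shape == (m, d_v)   # one output vector per query
```

Note that Q and K must share the key dimension d_k for the dot product, while V may have a different dimension d_v; the output inherits V's dimension.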
Learn After
Causal Attention Weight Matrix Calculation
An attention mechanism processes the input sequence ['The', 'robot', 'grasped', 'the', 'wrench']. The attention weight matrix is calculated to determine the contextual importance of each word. The row in the matrix corresponding to the word 'grasped' has the highest weight value in the column corresponding to the word 'wrench'. What does this high weight signify?

Interpreting an Attention Weight Matrix
In an attention mechanism processing a sequence of m items, an m × m attention weight matrix is generated. What does the i-th row of this matrix fundamentally represent?

Query-Key-Value Attention Output Matrix Product
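Causal attention, mentioned in the items above, can be sketched by masking future positions before the softmax. This is a minimal NumPy illustration assuming the common masking approach (set future scores to −∞ so their weights become zero); the token values and random Q/K are placeholders:

```python
import numpy as np

def causal_attention_weights(Q, K):
    # Mask scores above the diagonal so position i can only attend to
    # positions j <= i, then apply a row-wise softmax.
    m, d_k = Q.shape
    scores = Q @ K.T / np.sqrt(d_k)
    future = np.triu(np.ones((m, m), dtype=bool), k=1)  # strictly upper triangle
    scores = np.where(future, -np.inf, scores)          # forbid attending ahead
    scores -= scores.max(axis=-1, keepdims=True)        # stability; exp(-inf) -> 0
    e = np.exp(scores)
    return e / e.sum(axis=-1, keepdims=True)

tokens = ['The', 'robot', 'grasped', 'the', 'wrench']
m = len(tokens)
rng = np.random.default_rng(2)
Q = rng.normal(size=(m, 8))
K = rng.normal(size=(m, 8))
alpha = causal_attention_weights(Q, K)
assert np.allclose(np.triu(alpha, k=1), 0.0)  # no weight on future tokens
assert np.allclose(alpha.sum(axis=-1), 1.0)   # each row still sums to 1
```

With this mask, row i of α distributes its weight only over tokens 0..i, which is why the first row always assigns weight 1 to the first token.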