Definition

Attention Weight Matrix (α)

The attention weight matrix, denoted $\alpha(\mathbf{Q}, \mathbf{K})$, contains the weights that determine the importance of each value vector for a given query. This matrix is derived from the query ($\mathbf{Q}$) and key ($\mathbf{K}$) matrices and has dimensions $m \times m$, where $m$ is the number of items in the input sequence.
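As a concrete illustration, here is a minimal NumPy sketch of computing such a weight matrix. It assumes the standard scaled dot-product formulation, $\alpha(\mathbf{Q}, \mathbf{K}) = \mathrm{softmax}(\mathbf{Q}\mathbf{K}^\top / \sqrt{d})$, which the definition above does not spell out; the function name and the row-wise softmax convention (each query's weights sum to 1) are assumptions for illustration.

```python
import numpy as np

def attention_weights(Q, K):
    # Q, K: (m, d) query and key matrices for m sequence items.
    # Returns the (m, m) attention weight matrix alpha(Q, K),
    # assuming scaled dot-product scores with a row-wise softmax.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                  # (m, m) similarity scores
    scores -= scores.max(axis=-1, keepdims=True)   # subtract row max for numerical stability
    exp = np.exp(scores)
    return exp / exp.sum(axis=-1, keepdims=True)   # each row sums to 1

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
alpha = attention_weights(Q, K)
print(alpha.shape)        # (4, 4), i.e. m x m
print(alpha.sum(axis=1))  # each row sums to 1
```

Row $i$ of the result holds the weights that query $i$ assigns across all $m$ items; multiplying this matrix by a value matrix would produce the attention output.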

Updated 2026-04-22

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences