Short Answer

Dimensional Analysis of the Attention Formula

An attention mechanism operates on a Query matrix $\textbf{Q}$ with dimensions $10 \times 64$, a Key matrix $\textbf{K}$ with dimensions $20 \times 64$, and a Value matrix $\textbf{V}$ with dimensions $20 \times 128$. According to the general formula $Att(\textbf{Q}, \textbf{K}, \textbf{V}) = \alpha(\textbf{Q}, \textbf{K})\textbf{V}$, what will be the dimensions of the final output matrix? Explain the steps to arrive at your answer.
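One way to check an answer is to track only the shapes through the formula. The sketch below is a minimal illustration, not part of the original exercise. It assumes that $\alpha(\textbf{Q}, \textbf{K})$ takes the common scaled dot-product form, a row-wise softmax over $\textbf{Q}\textbf{K}^{\top}$, so the weight matrix has one row per query and one column per key. The helper names `matmul_shape` and `attention_output_shape` are hypothetical.

```python
def matmul_shape(a, b):
    """Return the shape of A @ B for 2-D shapes a = (m, k) and b = (k, n)."""
    if a[1] != b[0]:
        raise ValueError(f"inner dimensions differ: {a} vs {b}")
    return (a[0], b[1])


def attention_output_shape(q, k, v):
    """Track shapes through Att(Q, K, V) = alpha(Q, K) V.

    Assumption: alpha(Q, K) is a row-wise softmax of Q K^T (possibly scaled).
    Softmax and scaling are element-wise or row-wise, so they leave the
    shape of Q K^T unchanged.
    """
    k_transposed = (k[1], k[0])              # K^T swaps the two axes of K
    weights = matmul_shape(q, k_transposed)  # shape of alpha(Q, K)
    output = matmul_shape(weights, v)        # shape of alpha(Q, K) V
    return weights, output


# Shapes taken from the exercise statement.
weights_shape, output_shape = attention_output_shape((10, 64), (20, 64), (20, 128))
print("alpha(Q, K):", weights_shape)
print("Att(Q, K, V):", output_shape)
```

Under this assumption, the number of output rows follows the number of queries and the number of output columns follows the width of $\textbf{V}$. The shared dimension of $\textbf{Q}$ and $\textbf{K}$ and the number of keys both cancel in the products.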

Updated 2025-10-04

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Ch.5 Inference - Foundations of Large Language Models

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science