Short Answer

Dimensional Analysis of the Attention Formula

An attention mechanism operates on a Query matrix $\textbf{Q}$ with dimensions $10 \times 64$, a Key matrix $\textbf{K}$ with dimensions $20 \times 64$, and a Value matrix $\textbf{V}$ with dimensions $20 \times 128$. According to the general formula $\mathrm{Att}(\textbf{Q}, \textbf{K}, \textbf{V}) = \alpha(\textbf{Q}, \textbf{K})\textbf{V}$, what will be the dimensions of the final output matrix? Explain the steps to arrive at your answer.
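The dimensions follow from two matrix products: $\alpha(\textbf{Q}, \textbf{K})$ compares each of the 10 queries against each of the 20 keys, giving a $10 \times 20$ score matrix, and multiplying that by the $20 \times 128$ matrix $\textbf{V}$ yields a $10 \times 128$ output. A minimal NumPy sketch of this check, assuming $\alpha$ is the usual scaled dot-product form $\mathrm{softmax}(\textbf{Q}\textbf{K}^\top / \sqrt{d_k})$ (the question leaves $\alpha$ abstract, so that choice is an assumption; any row-wise weighting gives the same shapes):

```python
import numpy as np

# Dimensions from the question: Q is 10x64, K is 20x64, V is 20x128.
Q = np.random.randn(10, 64)
K = np.random.randn(20, 64)
V = np.random.randn(20, 128)

# Step 1: alpha(Q, K), assumed here to be softmax(Q K^T / sqrt(d_k)).
# (10x64) @ (64x20) -> a 10x20 matrix of attention weights.
scores = Q @ K.T / np.sqrt(K.shape[1])
alpha = np.exp(scores - scores.max(axis=1, keepdims=True))
alpha /= alpha.sum(axis=1, keepdims=True)

# Step 2: weight the values: (10x20) @ (20x128) -> 10x128.
out = alpha @ V
print(out.shape)  # (10, 128)
```

Note that the inner dimensions must match at each step: the shared feature dimension 64 of $\textbf{Q}$ and $\textbf{K}$ cancels in step 1, and the shared sequence dimension 20 of $\alpha$ and $\textbf{V}$ cancels in step 2.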


Updated 2025-10-04


Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Ch.5 Inference - Foundations of Large Language Models

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science