Multiple Choice

In a general attention mechanism, the output is calculated as a weighted sum of the Value vectors, where the weights are determined by the interaction between Query and Key vectors. The standard formula is: $\mathrm{Att}(\textbf{Q}, \textbf{K}, \textbf{V}) = \alpha(\textbf{Q}, \textbf{K})\textbf{V}$. Consider a scenario where this formula is mistakenly altered to be: $\mathrm{Att}_{modified} = \alpha(\textbf{Q}, \textbf{K})\textbf{K}$. What is the most significant consequence of this modification?
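The difference between the two formulas can be sketched in NumPy. This is a minimal illustration, assuming $\alpha(\textbf{Q}, \textbf{K})$ is the usual scaled dot-product softmax (an assumption; the question leaves $\alpha$ general). It shows that the modified version ignores $\textbf{V}$ entirely, so the output carries no Value information:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Standard attention: weights from Q-K interaction, applied to V.
    d = Q.shape[-1]
    alpha = softmax(Q @ K.T / np.sqrt(d))  # attention weights alpha(Q, K)
    return alpha @ V                       # weighted sum of Value vectors

def attention_modified(Q, K):
    # Mistaken variant: the same weights applied to K instead of V.
    d = Q.shape[-1]
    alpha = softmax(Q @ K.T / np.sqrt(d))
    return alpha @ K                       # V never enters the computation

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))   # 3 queries, dimension 4
K = rng.normal(size=(5, 4))   # 5 keys
V = rng.normal(size=(5, 4))   # 5 values

out = attention(Q, K, V)
out_mod = attention_modified(Q, K)

# The modified output is completely independent of V: replacing V with
# anything else (here 2*V in the standard formula) changes the standard
# output but cannot change the modified one, since V is unused.
assert not np.allclose(attention(Q, K, 2 * V), out)
assert np.allclose(attention_modified(Q, K), out_mod)
```

In other words, the modified mechanism can only re-mix the Key vectors themselves, discarding the content stored in the Values.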


Updated 2025-10-02


Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Ch.5 Inference - Foundations of Large Language Models

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science