1Cademy - In a general attention mechanism, the output is calculated as a weighted sum of the Value vectors, where the weights are determined by the interaction between Query and Key vectors. The standard formula is: $Att(\textbf{Q}, \textbf{K}, \textbf{V}) = \alpha(\textbf{Q}, \textbf{K})\textbf{V}$. Consider a scenario where this formula is mistakenly altered to be: $Att_{modified} = \alpha(\textbf{Q}, \textbf{K})\textbf{K}$. What is the most significant consequence of this modification?

Learn Before

General Attention Formula

Multiple Choice

In a general attention mechanism, the output is calculated as a weighted sum of the Value vectors, where the weights are determined by the interaction between Query and Key vectors. The standard formula is: $Att(\textbf{Q}, \textbf{K}, \textbf{V}) = \alpha(\textbf{Q}, \textbf{K})\textbf{V}$ . Consider a scenario where this formula is mistakenly altered to be: $Att_{modified} = \alpha(\textbf{Q}, \textbf{K})\textbf{K}$ . What is the most significant consequence of this modification?

Updated 2025-10-02

Contributors are:

Who are from:

Learn Before

Related