Learn Before
In a simplified attention mechanism, the final output is a weighted average of content-carrying vectors from the input sequence. Suppose an input sequence has three items with the following content vectors and their corresponding calculated attention weights:
- Item 1: Vector = [2, 10], Weight = 0.2
- Item 2: Vector = [4, 0], Weight = 0.7
- Item 3: Vector = [8, 5], Weight = 0.1
What is the resulting output vector from this attention calculation?
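The weighted average described above can be verified with a short sketch; the vector and weight values are taken directly from the question.

```python
# Simplified attention output: a weighted average of content vectors.
vectors = [[2, 10], [4, 0], [8, 5]]
weights = [0.2, 0.7, 0.1]

# Each output component is the weighted sum of that component across items,
# rounded to suppress floating-point noise.
output = [round(sum(w * v[i] for w, v in zip(weights, vectors)), 10)
          for i in range(len(vectors[0]))]
print(output)  # → [4.0, 2.5]
```

Component by component: 0.2·2 + 0.7·4 + 0.1·8 = 4.0, and 0.2·10 + 0.7·0 + 0.1·5 = 2.5.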
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Distinguishing Vector Roles in Attention
An engineer is debugging a self-attention layer in a language model. They observe that the attention scores (the weights assigned to each input) are computed correctly. As an experiment, they keep the Query and Key vectors for every input token unchanged but replace every Value vector with a vector of all zeros. Which of the following outcomes is the most certain result of this specific change?
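The effect of the experiment can be illustrated with a minimal toy attention step (the `attend` function, dot-product scores, and the specific query/key values below are illustrative assumptions, not the model's actual layer):

```python
import math

def attend(query, keys, values):
    """Toy single-query attention: softmax(query·keys) weighting of values."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    # Softmax with max-subtraction for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

query = [1.0, 0.5]
keys = [[2, 10], [4, 0], [8, 5]]
zero_values = [[0.0, 0.0]] * 3

# The attention weights are still perfectly well-defined (queries and keys
# are untouched), but a weighted average of zero vectors is always zero.
print(attend(query, keys, zero_values))  # → [0.0, 0.0]
```

This shows why the outcome is certain: the output is a weighted average of the Value vectors, so if every Value vector is zero, the output is zero regardless of how the weights are distributed.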