Learn Before
Multiple Choice

A causal model is calculating the output for the token at position i=3. The model's attention mechanism is optimized to only consider a subset of previous positions. The set of contributing indices is G = {0, 2}. The attention weights for these indices are α_3,0 = 0.6 and α_3,2 = 0.4. The value vectors for the relevant positions are: v_0 = [1, 0], v_1 = [2, 2], and v_2 = [0, 3]. Based on this information, what is the final output vector for position 3?

0

1

Updated 2025-09-26

Contributors are:

Who are from:

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science