Learn Before
Function of a Sequence of Averaged Vectors
The expression s(ȳ₁...ȳₖ₋₁) represents a function, denoted by s, that takes a sequence of averaged vectors as its input. The sequence runs from the first vector, ȳ₁, up to the (k-1)-th vector, ȳₖ₋₁.
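A minimal sketch of this notation follows, under several assumptions. The input sequence is taken to end at the (k-1)-th averaged vector, as in the form s(ȳ₁...ȳₖ₋₁) used in the Learn After items on this page. Each averaged vector is taken to be the element-wise mean of a group of vectors, since the card does not say what is averaged. The body of `s` is a hypothetical stand-in (a plain sum), and the helper names `averaged` and `outcome_at_step` are illustrative only.

```python
from typing import List, Sequence

Vector = List[float]


def averaged(group: Sequence[Vector]) -> Vector:
    """Element-wise mean of a group of vectors (assumed meaning of the overline)."""
    n = len(group)
    return [sum(column) / n for column in zip(*group)]


def s(prefix: Sequence[Vector]) -> Vector:
    """Hypothetical stand-in for s: element-wise sum of the averaged vectors it receives."""
    return [sum(column) for column in zip(*prefix)]


def outcome_at_step(y_bars: Sequence[Vector], k: int) -> Vector:
    """Evaluate s on the averaged vectors 1 through k-1 (1-indexed)."""
    if not 2 <= k <= len(y_bars) + 1:
        raise ValueError("k must satisfy 2 <= k <= len(y_bars) + 1")
    # Python slices are 0-indexed and end-exclusive, so [: k - 1]
    # selects exactly the first k-1 averaged vectors.
    return s(y_bars[: k - 1])


# Three groups of 2-dimensional vectors, each reduced to one averaged vector.
groups = [
    [[1.0, 2.0], [3.0, 4.0]],  # first averaged vector:  [2.0, 3.0]
    [[0.0, 0.0], [2.0, 2.0]],  # second averaged vector: [1.0, 1.0]
    [[5.0, 1.0], [5.0, 3.0]],  # third averaged vector:  [5.0, 2.0]
]
y_bars = [averaged(g) for g in groups]

step_3 = outcome_at_step(y_bars, k=3)  # uses the first two averaged vectors
step_4 = outcome_at_step(y_bars, k=4)  # uses all three averaged vectors
```

With these inputs, `step_3` is `[3.0, 4.0]` and `step_4` is `[8.0, 6.0]`: the result at step k draws on every averaged vector before position k, and on nothing at or after it.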

Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Set of Indexed Key-Value Pairs
Set of Superscript-Indexed Vectors
Set of Key-Value Pairs
Function of a Sequence of Overlined Variables
Function of a Sequence of Averaged Vectors
Vector Slice Notation for a Sequence Window
Set of Sequential Vectors Notation
Vector Sequence Window Notation
Consider an autoregressive model generating a sequence of tokens one by one. At each step i, the model calculates attention using the query from the current token and the keys and values from all tokens generated so far (from position 1 to i). To optimize this process, the model maintains a growing set of all previously computed key and value vectors. What is the primary computational advantage of this strategy?
State of an Autoregressive Cache
An autoregressive language model with τ parallel computational units (e.g., attention heads) is generating a sequence of tokens. After computing the output for the 3rd token, the model stores the key and value vectors from all tokens processed so far to use in subsequent steps. Which of the following notations correctly represents the complete set of these stored key-value pairs at this specific moment?
Learn After
Input for Next-Token Prediction
Consider the expression s(ȳ₁...ȳₖ₋₁), where 's' is a function that operates on a sequence of averaged vectors. What does this notation imply about the information used to compute an outcome for a given step 'k'?
Consider a process where the outcome at step k is determined by a function s(ȳ₁...ȳₖ₋₁). This function takes a sequence of averaged vectors from step 1 to k-1 as input. Based on this definition, the outcome at step 10 is completely independent of the vector ȳ₂.