Learn Before
Value Matrix from a Sliding Window
The notation V_{i-nc+1:i} represents a matrix created by vertically stacking a sequence of value vectors within a sliding window. Given a context window of size nc and a current processing index i, this matrix contains the value vectors from index i - nc + 1 to the current index i, capturing the nc most recent values. It is formally defined as:

V_{i-nc+1:i} = [ v_{i-nc+1}
                 v_{i-nc+2}
                 ...
                 v_i ]

where each row is one value vector, so the matrix has shape nc × d for d-dimensional value vectors.
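As a minimal sketch of this construction (assuming NumPy; the function name and 1-based indexing convention are illustrative, not from the source), the sliding-window value matrix is just a slice of the full value array, with the last nc value vectors stacked as rows:

```python
import numpy as np

def sliding_window_values(values: np.ndarray, i: int, nc: int) -> np.ndarray:
    """Return the value vectors for 1-based indices i - nc + 1 .. i,
    stacked vertically. `values` has shape (seq_len, d); row j-1 is v_j."""
    start = max(0, i - nc)  # clamp the window start at the beginning of the sequence
    return values[start:i]  # shape: (min(nc, i), d)

# Toy example: 6 value vectors, each 4-dimensional
values = np.random.rand(6, 4)
window = sliding_window_values(values, i=6, nc=3)
print(window.shape)  # (3, 4): the nc most recent value vectors
```

Because the result is a plain slice, no copying or concatenation is needed; the window matrix simply views the last nc rows of the stored values.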

Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Single-Query Attention Computation with Multiplicative Scaling
Scaled Dot-Product Attention
General Attention Formula
Value Matrix for Causal Attention (V_≤i)
Value Matrix from a Sliding Window
An attention mechanism processes an input sequence of 20 tokens, where each token is represented by a 256-dimensional vector. A Value matrix (V) is generated as part of this process. Which of the following statements most accurately describes the properties and role of this V matrix?
Determining Value Matrix Dimensions
Debugging an Attention Mechanism
Key Matrix from a Sliding Window
Value Matrix from a Sliding Window
An engineer is optimizing a language model that processes long documents using an attention mechanism that considers a fixed-size window of the most recent tokens. If the engineer decides to significantly increase the size of this window, what is the primary trade-off they will encounter?
Determining the Context Window
Diagnosing Long-Range Dependency Failures
Learn After
Formula for Fixed-Size Window Memory
Consider a sequence of 2-dimensional value vectors, where:
- v_1 = [10, 11]
- v_2 = [12, 13]
- v_3 = [14, 15]
- v_4 = [16, 17]
- v_5 = [18, 19]
Given a current processing index of i = 5 and a context window size of nc = 3, which matrix below correctly represents the structure formed by vertically stacking the value vectors from the corresponding sliding window, from index i - nc + 1 to i?
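This worked example can be checked with a short NumPy sketch (variable names are illustrative): with i = 5 and nc = 3, the window covers indices i - nc + 1 = 3 through i = 5, so the stacked matrix has rows v_3, v_4, v_5.

```python
import numpy as np

# Value vectors v_1 .. v_5 from the question, one per row.
V_all = np.array([[10, 11], [12, 13], [14, 15], [16, 17], [18, 19]])

i, nc = 5, 3
window = V_all[i - nc : i]  # 0-based slice covering 1-based indices i-nc+1 .. i

print(window)
# [[14 15]
#  [16 17]
#  [18 19]]
```

The resulting matrix has shape nc × d = 3 × 2, matching the window size and the value-vector dimension.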
Inferring Window Parameters
Determining Matrix Dimensions from a Sliding Window