Learn Before
Memory State as an Average of Keys and Values
The memory state, denoted $\mathrm{Mem}$, is represented as a tuple. The first element of the tuple is the arithmetic mean of the key vectors ($\mathbf{k}_j$) and the second element is the arithmetic mean of the value vectors ($\mathbf{v}_j$), both summed from index $j = 0$ to $i$. The formula is:

$$\mathrm{Mem} = \left( \frac{\sum_{j=0}^{i} \mathbf{k}_j}{i+1},\ \frac{\sum_{j=0}^{i} \mathbf{v}_j}{i+1} \right)$$

This calculation summarizes the information contained in the sequence of key-value pairs up to step $i$.
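The averaging described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names `mean_vector` and `memory_state` are hypothetical, vectors are plain lists of floats, indexing is assumed to start at 0, and the input numbers are made up for the example.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def mean_vector(vectors: Sequence[Sequence[float]]) -> Vector:
    """Element-wise arithmetic mean of a non-empty list of equally sized vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    n = len(vectors)
    # zip(*vectors) groups the first components together, then the second, etc.
    return [sum(component) / n for component in zip(*vectors)]


def memory_state(
    keys: Sequence[Sequence[float]],
    values: Sequence[Sequence[float]],
) -> Tuple[Vector, Vector]:
    """Return (mean of keys, mean of values) over all pairs seen so far."""
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    return mean_vector(keys), mean_vector(values)


# Illustrative (made-up) inputs: two key-value pairs, i.e. i = 1.
keys = [[1.0, 0.0], [3.0, 4.0]]
values = [[2.0, 2.0], [0.0, 6.0]]
print(memory_state(keys, values))  # ([2.0, 2.0], [1.0, 4.0])
```

However long the sequence is, the result is always two fixed-size vectors, which is what makes this a compact summary of the key-value pairs up to step $i$.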

Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Linear Algebra with Applications
Linear Algebra - Matrices
Transpose
Matrix Multiplication
Moore-Penrose Pseudoinverse
Using the Moore-Penrose Pseudoinverse to Solve Linear Equations
Linear Algebra (Trace)
Linear Algebra (Determinant)
Linear Algebra - Diagonal Matrices
Linear Algebra - Unit Vector
Linear Algebra - orthogonal
Linear Algebra - orthonormal
Linear Algebra - orthogonal matrix
Linear Algebra - eigenvector
Linear Algebra - eigenvalue
Linear Algebra - eigendecomposition
Singular value decomposition (SVD)
Linear Algebra - Dot Product and Multiplication Rules
Linear Algebra - Identity and Inverse Matrices
Linear dependence and span
Linear Algebra - Norm
Standard Basis Vector
Notation for a Tuple of Identical Elements
Memory State as an Average of Keys and Values
Notation for a Sequence of Variables
Tensor
Matrix
Element-wise Product
Broadcasting Mechanism
Vector
Scalars
Symmetric Matrix
Learn After
Consider a sequence of three key-value pairs. The key vectors are k₀ = [1, 2], k₁ = [2, 3], and k₂ = [3, 1]. The corresponding value vectors are v₀ = [3, 0], v₁ = [1, 5], and v₂ = [2, 1]. If a memory state is defined as a tuple containing the arithmetic mean of the key vectors and the arithmetic mean of the value vectors, what is the resulting memory state after processing all three pairs?
Deriving a Missing Input from a Memory State
Evaluating a Simple Memory Mechanism