Modeling Relative Position Offset via RoPE's Inner Product
The inner product between two RoPE-encoded tokens at positions t and s is a function of the term e^(i(t-s)θ). This term depends only on the relative offset t - s between the tokens, demonstrating how RoPE's formulation inherently captures relative positional information.
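A minimal numerical sketch of this fact, using hypothetical complex embeddings x' and y' and an arbitrary choice of θ (neither comes from the source): the rotated inner product equals the closed form (x'ȳ')e^(i(t-s)θ), so it depends on positions only through t - s.

```python
import numpy as np

theta = 0.1                  # assumed rotation frequency (arbitrary choice)
x = complex(0.8, 0.3)        # hypothetical complex embedding x'
y = complex(0.5, -0.2)       # hypothetical complex embedding y'

def rope_inner(t, s):
    """Rotate x and y by their positions, then take the Hermitian inner product."""
    return (x * np.exp(1j * t * theta)) * np.conjugate(y * np.exp(1j * s * theta))

# Closed form (x' * conj(y')) * e^{i(t-s)theta} for positions t=4, s=1:
closed_form = x * np.conjugate(y) * np.exp(1j * (4 - 1) * theta)
print(np.isclose(rope_inner(4, 1), closed_form))  # True
```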
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Computing Sciences
Related
Modeling Relative Position Offset via RoPE's Inner Product
The inner product of two token embeddings, x and y, at positions t and s respectively, is calculated after a rotational transformation using the formula ⟨C(x, tθ), C(y, sθ)⟩ = (x'ȳ')e^(i(t-s)θ). In this formula, x' and ȳ' are complex-number representations of the original embeddings. If both tokens are shifted by a constant amount k to new positions t+k and s+k, how does the inner product change?

Deconstructing the RoPE Inner Product Formula
The formula for the inner product of two RoPE-encoded tokens is given by
⟨C(x, tθ), C(y, sθ)⟩ = (x'ȳ')e^(i(t-s)θ). Match each component of this formula to its correct description, analyzing its specific role in the overall calculation.
Learn After
Consider two token embeddings, x and y, encoded with a rotational position scheme at positions 10 and 7, respectively. Their resulting inner product is calculated. If these same two tokens are instead placed at positions 25 and 22, how would the new inner product compare to the original one?
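A quick numerical check of this question (the embeddings and θ below are hypothetical stand-ins, not values from the source): the position pairs (10, 7) and (25, 22) share the same offset t - s = 3, so their rotated inner products coincide.

```python
import numpy as np

theta = 0.05                 # assumed rotation frequency (arbitrary choice)
x = complex(1.0, 0.4)        # hypothetical complex embedding x'
y = complex(-0.3, 0.7)       # hypothetical complex embedding y'

def rope_inner(t, s):
    # rotate each embedding by its position, then take the Hermitian inner product
    return (x * np.exp(1j * t * theta)) * np.conjugate(y * np.exp(1j * s * theta))

# (10, 7) and (25, 22) both have offset 3, so the inner products are identical
print(np.isclose(rope_inner(10, 7), rope_inner(25, 22)))  # True
```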
Invariance in Rotational Position Encodings
Evaluating a Positional Encoding Implementation