Dot Product of RoPE-Encoded Vectors as a Function of Relative Position
When analyzing Rotary Positional Embeddings in 2D Euclidean space, the dot product of two rotated vectors, Ro(x, tθ) and Ro(y, sθ), is shown to be a function of the relative position term (t − s)θ. This demonstrates that, similar to the inner product in complex space, the dot product in Euclidean space also inherently models the positional offset between tokens.
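This relative-position property can be checked numerically. Below is a minimal NumPy sketch (the function name `rope_2d` and the base angle θ are illustrative choices, not from the source): two fixed 2D vectors are rotated by position-dependent angles, and the dot product comes out the same whenever the positional offset t − s is the same.

```python
import numpy as np

def rope_2d(vec, pos, theta=0.1):
    """Apply 2D RoPE: rotate `vec` by the angle pos * theta."""
    angle = pos * theta
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, -s],
                         [s,  c]])
    return rotation @ vec

x = np.array([1.0, 2.0])   # query-side vector
y = np.array([0.5, -1.0])  # key-side vector

# Same pair of vectors, two different absolute positions,
# but the same relative offset of 3 in both cases.
d1 = rope_2d(x, 5) @ rope_2d(y, 8)    # positions 5 and 8
d2 = rope_2d(x, 12) @ rope_2d(y, 15)  # positions 12 and 15

# Because R(t*theta)^T R(s*theta) = R((s - t)*theta),
# the dot product depends only on (t - s), so d1 == d2.
assert np.isclose(d1, d2)
```

The key step is that the product of the two rotation matrices collapses to a single rotation by the angle difference, so the absolute positions cancel out.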

Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
In a model where token positions are encoded by rotating their vector representations, the inner product is calculated between the transformed representations of token 'A' and token 'B'. In Scenario 1, token 'A' is at position 5 and token 'B' is at position 8. In Scenario 2, the same tokens 'A' and 'B' are at positions 12 and 15, respectively. Based on the fundamental property of this encoding method, what is the expected relationship between the inner product value from Scenario 1 and the value from Scenario 2?
Analysis of Rotational Embedding Properties
Formula for the Inner Product of RoPE-Encoded Tokens in Complex Space
Explaining Positional Invariance in Rotational Embeddings
Learn After
Derivation of the Dot Product for RoPE-Encoded Vectors
Implicit Relative Position Modeling in Self-Attention with RoPE
A language model uses a positional encoding scheme with a specific mathematical property: the dot product between the encoded representations of any two tokens is a function solely of the difference between their positions in the sequence. Which of the following statements most accurately analyzes the primary advantage of this property for processing language?
In a system that encodes token positions by rotating their vector representations, the dot product between the encoded vector for a token at position t and another at position s is found to depend only on their relative displacement (t − s). Based on this property, the dot product calculated for a pair of tokens at positions 5 and 8 would be identical to the dot product for the same pair of tokens if they were located at positions 15 and 18.
Diagnosing a Positional Encoding Flaw