Learn Before
Uniqueness of RoPE-based Embeddings
A language model generates a final, position-aware embedding, x′, by applying a rotational transformation to a token's initial embedding, x, based on its position, p. The process is described by the function x′ = f(x, p). If two different tokens (with distinct initial embeddings x₁ and x₂) are located at the same position p, is it possible for them to have identical final embeddings (i.e., f(x₁, p) = f(x₂, p))? Explain your reasoning based on the properties of a rotational transformation.
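The key property is that a rotation matrix is orthogonal and therefore invertible, so distinct embeddings at the same position can never be mapped to the same output. A minimal 2D sketch of this reasoning, assuming a single hypothetical frequency theta = 0.5 (actual RoPE uses a vector of frequencies across 2×2 blocks):

```python
import numpy as np

def rope_rotate(x, pos, theta=0.5):
    """Rotate a 2D embedding x by the position-dependent angle pos * theta."""
    angle = pos * theta
    R = np.array([[np.cos(angle), -np.sin(angle)],
                  [np.sin(angle),  np.cos(angle)]])
    return R @ x

p = 7
x1 = np.array([1.0, 0.0])   # two distinct initial embeddings
x2 = np.array([0.0, 1.0])
y1 = rope_rotate(x1, p)     # final embeddings at the same position p
y2 = rope_rotate(x2, p)

# The rotation is invertible (orthogonal matrix), so distinct inputs stay distinct:
assert not np.allclose(y1, y2)
# Rotating back by -p recovers the original embedding exactly:
assert np.allclose(rope_rotate(y1, -p), x1)
```

Because f(·, p) has an inverse (rotate by the opposite angle), f(x₁, p) = f(x₂, p) would force x₁ = x₂, contradicting the assumption that the embeddings are distinct.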
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.3 Prompting - Foundations of Large Language Models
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Application of RoPE Rotation to a 2D Vector
RoPE Frequency Parameters
Definition of the 2x2 RoPE Rotation Matrix Block
RoPE Parameter Vector Definition
Definition of RoPE Parameter Vector (θ)
A language model encodes token positions by applying a unique, position-dependent rotational transformation to each token's initial embedding. The final, position-aware embedding for a token is the result of this transformation. If the exact same token (e.g., 'model') appears at position 4 and later at position 12 in a sequence, which statement best describes the relationship between their final embeddings, f(x, 4) and f(x, 12)?
RoPE 2D Vector Rotation Formula
Formula for RoPE-Encoded Token Embedding
Uniqueness of RoPE-based Embeddings
Debugging a RoPE Implementation
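The related question about the same token at positions 4 and 12 can be sketched numerically as well: the two final embeddings keep the original vector's norm (rotations are length-preserving) but point in different directions whenever the position offset does not correspond to a full multiple of 2π. A hypothetical 2D example with an assumed single frequency theta = 0.5:

```python
import numpy as np

def rope_rotate(x, pos, theta=0.5):
    """Rotate a 2D embedding x by the position-dependent angle pos * theta."""
    angle = pos * theta
    c, s = np.cos(angle), np.sin(angle)
    return np.array([c * x[0] - s * x[1],
                     s * x[0] + c * x[1]])

x = np.array([0.8, 0.6])       # hypothetical initial embedding of the token 'model'
e4  = rope_rotate(x, 4)        # same token at position 4 ...
e12 = rope_rotate(x, 12)       # ... and at position 12

# Different positions give different rotation angles, so the embeddings differ:
assert not np.allclose(e4, e12)
# But a rotation never changes the norm, so both keep the original length:
assert np.isclose(np.linalg.norm(e4), np.linalg.norm(x))
```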