Learn Before
Application of RoPE to d-dimensional Embeddings
To apply Rotary Positional Embeddings (RoPE) to a d-dimensional token embedding, the vector is reinterpreted as a complex vector with d/2 components. This is achieved by grouping consecutive pairs of elements from the original vector, where each pair forms a complex number. The rotational transformation is then applied to each of these d/2 complex numbers independently: each complex component is rotated by an angle proportional to the token's position, with a frequency that depends on the component's index.
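As a concrete illustration, here is a minimal NumPy sketch of this pairing-and-rotation step. The helper name `apply_rope` is hypothetical, and the frequency schedule theta_k = base^(-2k/d) with base 10000 is assumed from the standard RoPE formulation rather than stated in the text above.

```python
import numpy as np

def apply_rope(x: np.ndarray, pos: int, base: float = 10000.0) -> np.ndarray:
    """Rotate a d-dimensional embedding by position-dependent angles.

    The vector is viewed as d/2 complex numbers built from consecutive
    element pairs; complex component k is rotated by pos * theta_k.
    """
    d = x.shape[-1]
    assert d % 2 == 0, "embedding dimension must be even"
    # Group consecutive pairs into complex numbers: (x0 + i*x1), (x2 + i*x3), ...
    z = x[0::2] + 1j * x[1::2]                    # shape (d/2,)
    # One rotation frequency per complex component (assumed schedule).
    theta = base ** (-2.0 * np.arange(d // 2) / d)
    rotated = z * np.exp(1j * pos * theta)        # rotate each pair independently
    # Interleave real and imaginary parts back into a real d-vector.
    out = np.empty(d)
    out[0::2] = rotated.real
    out[1::2] = rotated.imag
    return out
```

For instance, `apply_rope(np.array([1.0, 2.5, -0.5, 4.0]), pos=0)` returns the vector unchanged, since rotating every complex component by an angle of zero is the identity.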
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Comparison of Rotary and Sinusoidal Embeddings
Conceptual Illustration of RoPE's Rotational Mechanism
Example of RoPE Capturing Relative Positional Information
Application of RoPE to d-dimensional Embeddings
Application of RoPE to Token Embeddings
RoPE as a Linear Combination of Periodic Functions
Consider two distinct methods for encoding a token's position within a sequence. Method A calculates a unique positional vector and adds it to the token's embedding. Method B applies a rotational transformation to the token's embedding, with the angle of rotation determined by the token's position. Based on these descriptions, which statement best analyzes a fundamental difference in how these two methods integrate positional context?
Positional Information in Vector Transformations
Analyzing Relative Positional Information
Selecting a Positional Strategy for a Long-Context Retrofit
Diagnosing Long-Context Failures Across Positional Schemes
Choosing and Justifying a Positional Retrofit Under Long-Context and Latency Constraints
Long-Context Retrofit Decision: RoPE Base Scaling vs ALiBi vs T5 Relative Bias
Post-Retrofit Regression: Separating Positional-Method Effects from Scaling Choices
Root-Cause Analysis of Long-Context Degradation After a Positional-Encoding Retrofit
You are reviewing a proposal to extend a productio...
You’re reviewing three proposed positional mechani...
Your team is extending a pretrained Transformer fr...
You’re debugging a long-context retrofit of a pret...
Advantage of Rotary over Sinusoidal Embeddings for Long Sequences
Formula for Multiplicative Positional Embeddings
Angle Preservation in Rotary Embeddings
Learn After
Definition of RoPE-Encoded Token Embedding Notation (ei)
A 6-dimensional token embedding vector, represented as v = [v1, v2, v3, v4, v5, v6], is being prepared for a rotational transformation to encode its position. Which of the following correctly describes how this vector is reinterpreted for the transformation process?
Vector Reinterpretation for Rotational Transformation
To prepare a 4-dimensional token embedding vector v = [1.0, 2.5, -0.5, 4.0] for a rotational transformation, it is first reinterpreted as a 2-dimensional vector of complex numbers. The first complex number in this new vector is 1.0 + 2.5i. The second complex number is ____.
When applying a rotational transformation to a 128-dimensional token embedding, the transformation applied to the complex number formed by the 1st and 2nd elements is dependent on the transformation applied to the complex number formed by the 3rd and 4th elements.
Formula for Multi-dimensional RoPE in Complex Space