Learn Before
In the implementation of Rotary Positional Embeddings (RoPE), the vector of frequency parameters, θ = [θ₁, ..., θ_{d/2}], ensures that for any given token position, all pairs of dimensions within its embedding are rotated by the same amount.
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Definition of the 2x2 RoPE Rotation Matrix Block
Calculation of RoPE Frequency Parameters
Exponential Form of RoPE Frequency Parameters
Effect of Modifying RoPE Frequency Parameters
In a transformer model using Rotary Positional Embeddings, the transformation for each token depends on its position and a vector of frequency parameters, θ = [θ₁, ..., θ_{d/2}], where each component θ_k corresponds to a different 2-dimensional rotation. A researcher proposes a modification where all components of this vector are set to the same value (i.e., θ₁ = θ₂ = ... = θ_{d/2}). What is the most likely consequence of this change on the model's ability to represent positional information?
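To make the question above concrete, here is a minimal sketch in plain Python of the standard RoPE frequency schedule θ_k = base^(−2(k−1)/d) with base 10000, and of applying the per-pair rotations at a given position. The helper names (`rope_frequencies`, `apply_rope`) are illustrative, not from any particular library; a real implementation would operate on batched tensors.

```python
import math

def rope_frequencies(d, base=10000.0):
    """Frequency vector theta = [theta_1, ..., theta_{d/2}] with the
    standard schedule theta_k = base^(-2(k-1)/d)."""
    return [base ** (-2 * (k - 1) / d) for k in range(1, d // 2 + 1)]

def apply_rope(x, pos, thetas):
    """Rotate each consecutive pair (x[2k], x[2k+1]) of the embedding
    by the angle pos * theta_k. Each pair gets its own frequency, so
    different pairs are rotated by different amounts."""
    out = []
    for k, theta in enumerate(thetas):
        a, b = x[2 * k], x[2 * k + 1]
        angle = pos * theta
        c, s = math.cos(angle), math.sin(angle)
        out.extend([a * c - b * s, a * s + b * c])
    return out

# The frequencies decay geometrically across pairs: the first pair
# rotates fastest with position, the last pair slowest. Collapsing all
# theta_k to one value (the researcher's modification) would make every
# pair rotate by the same angle at a given position, removing the
# multi-scale structure RoPE relies on.
thetas = rope_frequencies(8)
```

Note that at position 0 the rotation is the identity for every pair, regardless of the frequency schedule, so the modification only changes behavior for nonzero positions.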