Diagnosing RoPE Implementation Anomaly
Given the formula for the period of the k-th sine and cosine components, T_k = 2π * b^(2(k-1)/d), with a fixed base b > 1, what is the most plausible error in an engineer's implementation that would cause the period T_k to consistently decrease as the component index k increases?
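One way to reason about the question is to compare the formula as written against a variant with a negated exponent. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the base value 10000 is a commonly used choice rather than something given in the question, and the sign flip is just one plausible bug (for example, using the rotation frequency b^(-2(k-1)/d) where the period was intended), not necessarily the one the engineer made.

```python
import math

def rope_periods(d, base=10000.0):
    """Periods T_k = 2*pi * base**(2*(k-1)/d) for k = 1 .. d/2 (formula from the question)."""
    return [2 * math.pi * base ** (2 * (k - 1) / d) for k in range(1, d // 2 + 1)]

def rope_periods_sign_flipped(d, base=10000.0):
    """Hypothetical buggy variant: the exponent's sign is flipped."""
    return [2 * math.pi * base ** (-2 * (k - 1) / d) for k in range(1, d // 2 + 1)]

def is_strictly_increasing(xs):
    """True if every element is smaller than the one after it."""
    return all(a < b for a, b in zip(xs, xs[1:]))

correct = rope_periods(d=8)
buggy = rope_periods_sign_flipped(d=8)

# With base > 1 the correct periods grow with k; the flipped version shrinks.
print(is_strictly_increasing(correct))      # True
print(is_strictly_increasing(buggy[::-1]))  # True (buggy list is decreasing)
```

Because b > 1, the exponent 2(k-1)/d grows with k and so does T_k; any implementation whose periods shrink with k has effectively negated that exponent or inverted the base.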
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A language model architect is adjusting the dimensionality d of a model's embeddings, which affects the periods of the sine and cosine functions used for positional information. The period T_k for the k-th component is calculated using the formula: T_k = 2π * b^(2(k-1)/d), where b is a fixed base greater than 1. If the architect increases the dimensionality d while keeping the component index k (where k > 1) and base b constant, how will the period T_k be affected?
Calculating RoPE Component Period