Learn Before
When the base b used to calculate frequency parameters in Rotary Positional Embeddings is multiplied by a scaling factor, the periods associated with all dimensions of the embedding are scaled by an identical, uniform amount.
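A minimal sketch for checking this statement numerically. It assumes the common RoPE convention in which the i-th rotary pair (zero-based) has frequency theta_i = b^(-2i/d) and period 2*pi/theta_i. The source does not spell out this formula, and the function names, base, scale, and dimension below are illustrative choices rather than values from the course.

```python
import math

def rope_periods(base, dim):
    """Periods 2*pi/theta_i for each rotary pair, assuming theta_i = base ** (-2*i/dim)."""
    return [2 * math.pi * base ** (2 * i / dim) for i in range(dim // 2)]

def period_ratios(base, scale, dim):
    """Ratio of each pair's period after the base is multiplied by `scale` to its period before."""
    before = rope_periods(base, dim)
    after = rope_periods(base * scale, dim)
    return [a / b for a, b in zip(after, before)]

# Illustrative values only (assumed, not taken from the source).
ratios = period_ratios(base=10000.0, scale=8.0, dim=8)
for i, r in enumerate(ratios):
    print(f"pair {i}: period ratio = {r:.4f}")
```

Under the assumed convention, the ratio for pair i works out to scale^(2i/d), so comparing the printed ratios across pairs shows directly whether the scaling is uniform.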
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A developer is adapting a pre-trained language model that uses rotational position embeddings to handle much longer input sequences. They achieve this by applying a scaling factor to the base b used in the frequency calculations for the embeddings. Which statement best analyzes the impact of this change on the periods of the rotational frequencies across the different embedding dimensions?
When the base b used to calculate frequency parameters in Rotary Positional Embeddings is multiplied by a scaling factor, the periods associated with all dimensions of the embedding are scaled by an identical, uniform amount.
Analysis of Period Scaling in Positional Embeddings