Learn Before
A team of engineers is adapting a pre-trained language model to handle much longer text sequences. They decide to use a method that involves scaling the base value used in the model's rotational position embeddings. To select the appropriate scaling factor, they must adhere to a specific guiding principle. Which of the following best describes this principle?
0
1
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Period Matching Equation for RoPE Base Scaling
Equation for Matching Periods in RoPE Base Scaling
A team of engineers is adapting a pre-trained language model to handle much longer text sequences. They decide to use a method that involves scaling the base value used in the model's rotational position embeddings. To select the appropriate scaling factor, they must adhere to a specific guiding principle. Which of the following best describes this principle?
Selecting a RoPE Base Scaling Factor
When adapting a language model for longer sequences by scaling its rotational position embedding base, the guiding constraint is to match the period of the lowest frequency dimension in the scaled model to the period of a model using linear interpolation.