1Cademy - An engineer is adapting a language model to process sequences twice as long as its original design (i.e., `m = 2 * m_l`, where `m_l` is the original sequence length and `m` is the new one). They use a method where the period of the lowest-frequency (longest-period) component in the new model is set equal to that of a linearly scaled model. This relationship is captured by the equation: $$2\pi \cdot (\lambda b)^{\frac{d-2}{d}} = \frac{m}{m_l} \cdot 2\pi \cdot b^{\frac{d-2}{d}}$$ Given that the embedding dimensionality `d` is greater than 2 and the original base `b` is a positive constant, what value must the scaling factor `λ` take to satisfy this constraint for the new, longer sequence length?

Learn Before

Equation for Matching Periods in RoPE Base Scaling

Multiple Choice

An engineer is adapting a language model to process sequences twice as long as its original design (i.e., `m = 2 * m_l`, where `m_l` is the original sequence length and `m` is the new one). They use a method where the period of the lowest-frequency (longest-period) component in the new model is set equal to that of a linearly scaled model. This relationship is captured by the equation: $2\pi \cdot (\lambda b)^{\frac{d-2}{d}} = \frac{m}{m_l} \cdot 2\pi \cdot b^{\frac{d-2}{d}}$ Given that the embedding dimensionality `d` is greater than 2 and the original base `b` is a positive constant, what value must the scaling factor `λ` take to satisfy this constraint for the new, longer sequence length?
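One way to work through the constraint: dividing both sides by $2\pi \cdot b^{\frac{d-2}{d}}$ leaves $\lambda^{\frac{d-2}{d}} = \frac{m}{m_l}$, so $\lambda = \left(\frac{m}{m_l}\right)^{\frac{d}{d-2}}$. With `m = 2 * m_l` this gives $\lambda = 2^{\frac{d}{d-2}}$, which is slightly larger than 2 and approaches 2 as `d` grows. The base `b` cancels out. The sketch below checks this numerically. The function name `rope_base_scale` and the sample values for `d`, `b`, and `m_l` are illustrative assumptions and do not come from the question.

```python
import math


def rope_base_scale(m: float, m_l: float, d: int) -> float:
    """Solve (lam * b)^((d-2)/d) = (m / m_l) * b^((d-2)/d) for lam.

    Hypothetical helper: the base b cancels, leaving
    lam = (m / m_l) ** (d / (d - 2)).
    """
    if d <= 2:
        raise ValueError("d must be greater than 2")
    return (m / m_l) ** (d / (d - 2))


# Illustrative values (assumptions, not taken from the question).
d = 128          # embedding dimensionality
b = 10000.0      # original base
m_l = 2048       # original sequence length
m = 2 * m_l      # new sequence length, twice the original

lam = rope_base_scale(m, m_l, d)

# Check both sides of the period-matching equation.
exponent = (d - 2) / d
lhs = 2 * math.pi * (lam * b) ** exponent
rhs = (m / m_l) * 2 * math.pi * b ** exponent

print(f"lambda = {lam:.6f}")
print("constraint satisfied:", math.isclose(lhs, rhs))
```

Changing `b` in the sketch leaves `lam` unchanged, which matches the algebra: only the length ratio `m / m_l` and the dimensionality `d` determine the scaling factor.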

Updated 2025-10-03
