Case Study

Impact of Embedding Dimensionality on RoPE Scaling

An AI research team is extending the context window for two different language models, Model A and Model B. Both models need to be scaled from an original length of 2048 tokens to a new length of 8192 tokens. The only difference between them is their embedding dimensionality (d):

  • Model A: d = 64
  • Model B: d = 512

The team will use the formula below to calculate the required scaling factor (λ), where m is the new context length (8192 tokens) and m_l is the original context length (2048 tokens):

\lambda = \left(\frac{m}{m_l}\right)^{\frac{d}{d-2}}

Without performing the full calculation, predict which model will require the larger scaling factor (λ). Justify your prediction by analyzing how the embedding dimensionality (d) influences the exponent d/(d-2) in the formula.
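For checking the answer afterward, the formula can also be evaluated directly. A minimal sketch (the function name is illustrative; m, m_l, and d follow the definitions in the case study):

```python
def rope_scaling_factor(m: int, m_l: int, d: int) -> float:
    """Evaluate lambda = (m / m_l) ** (d / (d - 2))."""
    return (m / m_l) ** (d / (d - 2))

# Both models scale from m_l = 2048 to m = 8192 tokens.
lam_a = rope_scaling_factor(8192, 2048, 64)   # Model A: d = 64
lam_b = rope_scaling_factor(8192, 2048, 512)  # Model B: d = 512

print(f"Model A (d=64):  lambda = {lam_a:.3f}")   # ≈ 4.183
print(f"Model B (d=512): lambda = {lam_b:.3f}")   # ≈ 4.022
```

Note that as d grows, the exponent d/(d-2) approaches 1, so λ approaches the raw length ratio m/m_l = 4; the smaller d is, the further the exponent sits above 1 and the larger λ becomes.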

Updated 2025-10-03

Tags

Ch.3 Prompting - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science