Short Answer

Analyzing Period Scaling Effects

A language model developer is adapting a model originally trained with a maximum sequence length of 4096 tokens to now work with a maximum sequence length of only 2048 tokens. They use the standard period scaling formula for position interpolation: $T'_k = \frac{m}{m_l} \cdot T_k$, where $m$ is the new sequence length and $m_l$ is the original maximum length. Analyze the effect of this change on the periods of the positional encoding functions. Will the periods increase or decrease, and by what factor? Explain your reasoning based on the formula.
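To make the formula concrete, here is a minimal Python sketch that applies $T'_k = \frac{m}{m_l} \cdot T_k$ to a few periods. The function name `scale_period`, the base periods, and the sequence lengths (1024 extended to 4096) are all hypothetical values chosen for illustration; they are deliberately different from the lengths in the question so that the analysis is left to the reader.

```python
def scale_period(period: float, new_len: int, orig_len: int) -> float:
    """Return the scaled period T'_k = (m / m_l) * T_k.

    new_len  plays the role of m   (the new sequence length).
    orig_len plays the role of m_l (the original maximum length).
    """
    if new_len <= 0 or orig_len <= 0:
        raise ValueError("sequence lengths must be positive")
    return (new_len / orig_len) * period


# Hypothetical base periods T_k (illustrative values, not taken from any model).
base_periods = [10.0, 100.0, 1000.0]

# Hypothetical setting: original maximum length 1024, new length 4096.
scaled = [scale_period(t, new_len=4096, orig_len=1024) for t in base_periods]
print(scaled)  # every period is multiplied by the same ratio m / m_l
```

Every period is multiplied by the same ratio $m / m_l$, so substituting the question's own lengths into `new_len` and `orig_len` shows directly whether that ratio is above or below 1.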

Updated 2025-10-04

Tags

Ch.3 Prompting - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science