A language model is trained on sequences up to a maximum length of L. During inference, it encounters a sequence of length 2L. Match each strategy for handling the unseen positions (L to 2L-1) with its corresponding classification.
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Sinusoidal Positional Encoding
Extrapolation and Interpolation Methods for Positional Embeddings
Example of Extrapolation in Sequence Models
Comparison of Generalizing vs. Non-Generalizing Positional Encodings
Example of Interpolation in Sequence Models
A language model was trained exclusively on text sequences with a maximum length of 1024 tokens. When the model is presented with a 2048-token sequence, two different approaches are considered for generating positional information for the new, unseen positions (1024 to 2047).
Approach X: The mechanism generates values for the new positions by continuing the mathematical pattern it learned from the original 0-1023 positions.
Approach Y: The mechanism rescales the positional indices of the entire 2048-token sequence so that they all map to values within the original 0-1023 range.
Which statement correctly categorizes these two approaches?
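To make the contrast concrete, here is a minimal NumPy sketch; it is not taken from the text, and the sinusoidal formulation, function name, and the 1024/2048 lengths are assumed purely for illustration. Approach X evaluates the same encoding formula at the unseen positions, while Approach Y rescales every index back into the trained range before encoding.

```python
import numpy as np

def sinusoidal_pe(positions, d_model=64):
    """Standard sinusoidal positional encoding for the given (possibly fractional) positions."""
    positions = np.asarray(positions, dtype=np.float64)[:, None]         # (seq_len, 1)
    dims = np.arange(d_model)[None, :]                                   # (1, d_model)
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)   # per-dimension frequencies
    angles = positions * angle_rates
    pe = np.zeros_like(angles)
    pe[:, 0::2] = np.sin(angles[:, 0::2])   # even dimensions use sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])   # odd dimensions use cosine
    return pe

train_len, test_len = 1024, 2048   # illustrative lengths matching the scenario above

# Approach X (extrapolation): keep the same pattern and simply evaluate it
# at the unseen positions 1024..2047.
pe_extrapolated = sinusoidal_pe(np.arange(test_len))

# Approach Y (interpolation): rescale all 2048 indices so they fall back
# inside the original 0..1023 range, then evaluate the same pattern there.
rescaled_positions = np.arange(test_len) * (train_len / test_len)   # 0, 0.5, 1.0, ...
pe_interpolated = sinusoidal_pe(rescaled_positions)

print(pe_extrapolated.shape, pe_interpolated.shape)   # (2048, 64) (2048, 64)
```

Either way, every token in the 2048-token input receives an encoding; the difference is whether the values extend past anything produced during training (extrapolation) or stay inside the trained range at a finer granularity (interpolation).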
Choosing a Positional Embedding Generalization Strategy