Generalization Issues of Learnable Positional Embeddings

While learned positional embeddings perform well when training and inference sequences have similar lengths, they face a significant practical challenge. To manage computational costs, models are typically trained on sequences with a fixed maximum length. This creates a generalization issue during inference when the model must process sequences longer than any it encountered during training, as it lacks pre-trained embeddings for these unseen positions.
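The failure mode can be made concrete with a minimal sketch: a learned positional embedding is just a trainable `(max_len, d_model)` matrix indexed by position, so a sequence longer than `max_len` has no embedding row to look up. The NumPy code below is an illustrative assumption, not any specific model's implementation; random initialization stands in for trained values.

```python
import numpy as np

def make_positional_table(max_len: int, d_model: int, seed: int = 0) -> np.ndarray:
    # A learned positional embedding is a trainable (max_len, d_model) matrix;
    # random values stand in for what training would produce.
    rng = np.random.default_rng(seed)
    return rng.normal(size=(max_len, d_model))

def add_positions(x: np.ndarray, table: np.ndarray) -> np.ndarray:
    # x: (seq_len, d_model) token embeddings. Positions beyond the trained
    # max_len simply have no row in the table, so longer inputs must fail.
    seq_len = x.shape[0]
    if seq_len > table.shape[0]:
        raise ValueError(
            f"sequence length {seq_len} exceeds trained maximum {table.shape[0]}"
        )
    return x + table[:seq_len]

table = make_positional_table(max_len=512, d_model=64)
add_positions(np.zeros((512, 64)), table)      # within the trained range: fine
try:
    add_positions(np.zeros((513, 64)), table)  # one token longer: no embedding exists
except ValueError as e:
    print("generalization failure:", e)
```

This is exactly why fixed-size learned tables cannot extrapolate: unlike a positional *function* (e.g. sinusoidal encodings), there is nothing to evaluate at an unseen position, only a missing table row.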

Updated 2026-04-23


Ch.2 Generative Models - Foundations of Large Language Models
