Concept

Scaling Up via Long Sequence Adaptation

While scaling up Large Language Models typically means adding data and compute, it can also mean adapting a model to process significantly longer sequences. For example, a model pre-trained on texts of standard length can subsequently be applied to token sequences far longer than any it encountered during pre-training.
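To make this concrete, one widely used adaptation technique is position interpolation for rotary position embeddings (RoPE): positions in a long input are linearly rescaled at inference time so they fall within the position range the model saw during pre-training. The sketch below is a minimal NumPy illustration under that assumption; the function names, dimensions, and context lengths are hypothetical and not taken from the course material.

```python
# A minimal sketch of one long-sequence adaptation trick:
# position interpolation for rotary position embeddings (RoPE).
# Positions beyond the pre-training length are linearly rescaled
# into the range seen during pre-training, keeping attention
# patterns in-distribution. All names and sizes are illustrative.

import numpy as np

def rope_angles(positions, head_dim, base=10000.0):
    """Rotation angle for each (position, rotary frequency) pair."""
    freqs = base ** (-np.arange(0, head_dim, 2) / head_dim)  # (head_dim/2,)
    return np.outer(positions, freqs)                        # (seq, head_dim/2)

def apply_rope(x, positions):
    """Rotate query/key vectors x of shape (seq, head_dim) by position."""
    angles = rope_angles(positions, x.shape[1])
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]   # interleaved even/odd pairs
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# Pre-trained context length vs. the longer target length.
train_len, target_len = 2048, 8192
scale = train_len / target_len        # e.g. 0.25

positions = np.arange(target_len)
interpolated = positions * scale      # squeeze 0..8191 into 0..2047.75

q = np.random.randn(target_len, 64)   # dummy queries for one head
q_rotated = apply_rope(q, interpolated)
```

Because the rescaled positions never exceed the pre-training range, the model can attend over the longer sequence without seeing position values it was never trained on; in practice this is usually combined with a short fine-tuning pass on long inputs.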

Tags

Foundations of Large Language Models

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences