Learn Before
Fine-tuning on Longer Sequences for Enhanced Length Extrapolation
A targeted and effective way to improve a pre-trained LLM's length extrapolation is to fine-tune it on sequences longer than those seen during pre-training. This directly exposes the model to position indices and relative distances beyond its original context window, letting it adapt its positional representations rather than relying on zero-shot extrapolation.
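As a concrete illustration, below is a minimal sketch of this recipe using the Hugging Face transformers and datasets libraries. The model choice (EleutherAI/pythia-160m), the corpus (wikitext-103), and the 8,192-token fine-tuning length are illustrative assumptions, not part of this card. The sketch also assumes a model with rotary position embeddings, which can mechanically accept positions beyond the pre-training length; a model with learned absolute position embeddings would first need its position-embedding table extended.

```python
# Minimal long-sequence fine-tuning sketch. Model, dataset, and lengths
# are illustrative assumptions, not prescribed by this card.
from itertools import chain

from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

LONG_LEN = 8192  # fine-tuning block length, beyond the model's 2,048-token pre-training context

model_name = "EleutherAI/pythia-160m"  # assumed RoPE-based causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # this tokenizer ships without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Any long-form corpus works; wikitext-103 is just a readily available stand-in.
raw = load_dataset("wikitext", "wikitext-103-raw-v1", split="train[:1%]")
tokenized = raw.map(
    lambda batch: tokenizer(batch["text"]),
    batched=True,
    remove_columns=raw.column_names,
)

def group_into_long_blocks(batch):
    # Concatenate documents and slice into LONG_LEN-token blocks so every
    # training example is longer than the original pre-training context.
    ids = list(chain.from_iterable(batch["input_ids"]))
    total = (len(ids) // LONG_LEN) * LONG_LEN
    return {"input_ids": [ids[i : i + LONG_LEN] for i in range(0, total, LONG_LEN)]}

long_blocks = tokenized.map(
    group_into_long_blocks, batched=True, remove_columns=tokenized.column_names
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="long-context-ft",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-5,
    ),
    train_dataset=long_blocks,
    # mlm=False produces standard next-token (causal LM) labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

In practice this recipe is often combined with RoPE scaling (position interpolation) so the fine-tune converges faster, but plain fine-tuning on longer blocks, as sketched above, is the core idea.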
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Analyzing Model Performance on Long Documents
An AI development team trains a language model exclusively on documents with a maximum length of 4,096 tokens. After deployment, they are surprised to find that the model can coherently summarize documents up to 5,000 tokens long, but its performance degrades significantly on documents longer than 6,000 tokens. Which statement best analyzes this observation?
Explaining Unexpected Model Performance
Learn After
A research team has a language model that was pre-trained exclusively on text segments with a maximum length of 2,048 tokens. The team's goal is to adapt this model to accurately summarize legal documents that are frequently 5,000 tokens long, a task at which the model currently performs poorly. Given this specific goal, which of the following fine-tuning strategies is most likely to be effective?
Diagnosing Fine-Tuning Failure for Long Contexts
Designing a Fine-Tuning Strategy for Long-Context Tasks