1Cademy - An AI development team trains a language model exclusively on documents with a maximum length of 4,096 tokens. After deployment, they are surprised to find that the model can coherently summarize documents up to 5,000 tokens long, but its performance degrades significantly on documents longer than 6,000 tokens. Which statement best analyzes this observation?

Learn Before

Length Extrapolation in LLMs

Multiple Choice

Updated 2025-10-03
