A large language model, trained exclusively on text sequences with a maximum length of 1024 tokens, is later used to process a 3000-token document. The model's positional encoding system simply continues its established pattern, assigning a unique position to every token up to 3000. Observers note a significant drop in performance, especially on tasks requiring an understanding of relationships between distant parts of the text. Which statement best analyzes this performance issue?
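The question does not say which positional encoding the model uses; as an illustration, the minimal sketch below assumes the classic sinusoidal encoding of Vaswani et al. (2017), which can be evaluated at any position (the helper name sinusoidal_encoding and the dimensions are made up for this example). It shows why extrapolation is numerically well-defined yet still problematic: positions beyond the trained maximum produce encoding vectors the attention layers never saw during training.

```python
import numpy as np

def sinusoidal_encoding(positions: np.ndarray, d_model: int) -> np.ndarray:
    """Classic sinusoidal positional encoding (Vaswani et al., 2017)."""
    # Each dimension pair (2i, 2i+1) uses a different wavelength.
    i = np.arange(d_model // 2)
    angles = positions[:, None] / (10000 ** (2 * i / d_model))
    enc = np.zeros((len(positions), d_model))
    enc[:, 0::2] = np.sin(angles)
    enc[:, 1::2] = np.cos(angles)
    return enc

d_model = 64
train_max = 1024   # maximum position seen during training
doc_len = 3000     # document the model must now process

enc = sinusoidal_encoding(np.arange(doc_len), d_model)

# The formula extrapolates without error: every token gets a unique vector.
print(enc.shape)  # (3000, 64)

# But positions 1024..2999 yield vectors that are out-of-distribution for
# the trained attention layers, so long-range relationships degrade.
print(f"{doc_len - train_max} out-of-distribution position vectors")
```

The extrapolated vectors are valid inputs in a mechanical sense; the failure comes from the model never having learned how to use them.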
Tags: Ch.2 Generative Models - Foundations of Large Language Models; Foundations of Large Language Models; Foundations of Large Language Models Course; Computing Sciences; Analysis in Bloom's Taxonomy; Cognitive Psychology; Psychology; Social Science; Empirical Science; Science
Related
Goal of Position Interpolation
A language model was originally trained to understand text sequences with a maximum of 2048 distinct positions. It now needs to process a document that requires 4096 positions. To handle this, a developer implements a technique that rescales the new, larger set of positions (0 to 4095) to fit within the model's original, smaller range (0 to 2047); a sketch of this rescaling appears after this list. Which underlying principle does this technique exemplify?
Adapting Positional Embeddings for Longer Contexts
Extrapolation of Positional Embeddings
Example of Positional Extrapolation
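For the "Goal of Position Interpolation" question above, here is a minimal sketch of the rescaling it describes, assuming simple linear position interpolation (Chen et al., 2023). The helper interpolate_positions is hypothetical, not a library API.

```python
import numpy as np

def interpolate_positions(seq_len: int, trained_max: int) -> np.ndarray:
    """Linearly rescale positions 0..seq_len-1 into the trained range.

    Sketch of linear position interpolation: rather than extrapolating to
    unseen positions, compress the new index range so every (now
    fractional) position lies within [0, trained_max).
    """
    scale = trained_max / seq_len          # e.g. 2048 / 4096 = 0.5
    return np.arange(seq_len) * scale      # fractional, in-range positions

positions = interpolate_positions(seq_len=4096, trained_max=2048)
print(positions[:4])   # [0.  0.5 1.  1.5]
print(positions[-1])   # 2047.5 -- still inside the trained range
```

The trade-off is that neighboring tokens are now squeezed closer together in position space, but every position the model sees stays within the distribution it was trained on.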