Learn Before
Analyzing Model Performance on Unseen Sequence Lengths
A team of developers trains a sequence model to predict the next number in a series. The model is trained exclusively on sequences that follow a simple increasing pattern, with a maximum length of 500 tokens. During testing, the model performs with near-perfect accuracy on new sequences up to 500 tokens long. However, when tested on a sequence of 750 tokens that follows the same increasing pattern, the model's predictions become erratic and incorrect after the 500th token. Based on this scenario, diagnose the specific type of generalization failure the model is exhibiting and explain why its performance degraded.
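The failure described above can be illustrated with a toy simulation (an assumed setup, not a real trained model): the "model" below behaves correctly only at absolute positions it saw during training, so its predictions degrade past position 500 even though the underlying pattern is unchanged. The constant `TRAIN_MAX_LEN` and the garbage formula for out-of-range positions are illustrative choices, not anything from the scenario itself.

```python
# Toy simulation of length-generalization failure: the "model" has only
# learned behavior for absolute positions up to the training length.

TRAIN_MAX_LEN = 500

def toy_model_predict(position: int, prev_value: int) -> int:
    """Predict the next value of a simple +1 increasing sequence."""
    if position <= TRAIN_MAX_LEN:
        # In-distribution positions: the pattern was learned here.
        return prev_value + 1
    # Positions beyond the training length: no training signal, so the
    # output is effectively arbitrary (modeled here as deterministic noise).
    return (prev_value * 31 + position) % 1000

def accuracy_up_to(length: int) -> float:
    """Fraction of correct next-value predictions over the first `length` steps."""
    correct = 0
    value = 0
    for pos in range(1, length + 1):
        pred = toy_model_predict(pos, value)
        value += 1  # ground truth: the sequence keeps increasing by 1
        correct += int(pred == value)
    return correct / length
```

With this setup, `accuracy_up_to(500)` is perfect while `accuracy_up_to(750)` drops, mirroring the erratic behavior after the 500th token described in the scenario.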
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A sequence model is trained to generate numerical sequences. All training examples consist of sequences that follow the simple arithmetic rule: the value at any position is twice the position number (e.g., at position 10, the value is 20). The model is only trained on sequences with a maximum length of 100 positions. After training, the model is evaluated. Which of the following evaluation results provides the strongest evidence that the model can successfully generalize its learned pattern to inputs outside the range of its training data?
Analyzing Model Performance on Unseen Sequence Lengths
A sequence model was trained to replicate a repeating numerical pattern, but only on sequences with a maximum length of 500 steps. The following descriptions outline the model's performance under different testing conditions. Match each performance description to the most appropriate term.
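The doubling-rule question among the related items above can be sketched concretely. The snippet below is a minimal illustration under an assumed setup: instead of the unspecified sequence model, a least-squares linear fit is trained only on positions 1 through 100, then queried at positions outside that range. Correct predictions at those unseen positions are the kind of evidence of generalization the question asks about.

```python
# Assumed stand-in for the sequence model: a least-squares linear fit of
# value = a * position + b, trained only on positions 1..100 where the
# true rule is value = 2 * position.

train_positions = list(range(1, 101))
train_values = [2 * p for p in train_positions]

# Closed-form least-squares solution for slope a and intercept b.
n = len(train_positions)
sx = sum(train_positions)
sy = sum(train_values)
sxx = sum(p * p for p in train_positions)
sxy = sum(p * v for p, v in zip(train_positions, train_values))
a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
b = (sy - a * sx) / n

# Extrapolation check: query positions well outside the training range.
pred_150 = round(a * 150 + b)  # true value: 300
pred_200 = round(a * 200 + b)  # true value: 400
```

Because the training data is exactly linear, the fit recovers the rule (a = 2, b = 0) and extrapolates correctly; a model that had only memorized positions 1 through 100 would not.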