Learn Before
Sufficiency of Learned Features for Future Token Prediction
An area of investigation in long-context language modeling is whether the features a model has learned up to a given position are sufficient for predicting subsequent tokens. In other words, this line of research asks how much foresight the model's internal representations carry: whether the hidden state at one point in a sequence already encodes enough information to anticipate content that appears much later.
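To make the question concrete, below is a minimal probing sketch that is not from the source: it assumes GPT-2 (via Hugging Face Transformers) as a stand-in backbone, a couple of toy sentences as data, and a frozen-feature linear probe that must predict the token several positions ahead of the position whose hidden state it receives. The model name, the look-ahead offset, and the data are all illustrative assumptions.

```python
# Sketch: can the frozen hidden state at position t predict the token at t + K_AHEAD?
# Assumptions (not from the source): GPT-2 backbone, toy sentences, linear probe.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

K_AHEAD = 4          # how far into the future the probe must predict
MODEL_NAME = "gpt2"  # hypothetical stand-in; any causal LM would do

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
backbone = AutoModel.from_pretrained(MODEL_NAME)
backbone.eval()  # the backbone stays frozen; only the probe is trained

texts = [
    "The detective realized the butler had been innocent all along.",
    "Halfway through the novel, the narrator reveals she is the thief.",
]

# Linear probe: hidden state at position t -> token id at position t + K_AHEAD.
probe = nn.Linear(backbone.config.hidden_size, tokenizer.vocab_size)
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(3):
    for text in texts:
        ids = tokenizer(text, return_tensors="pt").input_ids
        with torch.no_grad():
            hidden = backbone(ids).last_hidden_state[0]  # (seq_len, hidden_size)

        feats = hidden[:-K_AHEAD]    # state at position t
        targets = ids[0, K_AHEAD:]   # token at position t + K_AHEAD

        loss = loss_fn(probe(feats), targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: probe loss {loss.item():.3f}")
```

If such a probe reaches non-trivial accuracy well above a unigram baseline, that is evidence the frozen features already carry information about future tokens rather than only summarizing the past.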
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.2 Generative Models - Foundations of Large Language Models
Related
LLMs as Powerful In-Context Compressors
Analysis of Positional Bias in Context Utilization
A research team observes that a large language model's performance on a long-document question-answering task plateaus after the context reaches 16,000 tokens. Even when the correct answer is placed at the 20,000th token, the model frequently fails to retrieve it, performing no better than when the answer is absent. Which of the following hypotheses about long-context utilization is most directly challenged by this finding?
Investigating the Utility of Context Tokens
Learn After
An experiment is conducted on a large language model. The model processes the first half of a novel, and its internal state (the set of learned features) at the halfway point is saved. A separate, simple predictive tool is then trained using only this saved internal state; its task is to predict a major plot twist that occurs in the final chapter of the novel. The tool achieves surprisingly high accuracy. What does this outcome most strongly imply about the model's processing? (A toy sketch of this experimental setup appears at the end of this card.)
Interpreting LLM Feature Sufficiency Experiment
Comparative Analysis of LLM Feature Learning Strategies
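As a companion to the "saved internal state" question above, here is a toy sketch of that setup under stated assumptions: GPT-2 (via Hugging Face Transformers) stands in for the large language model, mean-pooled final-layer hidden states stand in for the saved internal state, a tiny synthetic set of story prefixes with twist labels stands in for the novel, and scikit-learn logistic regression plays the role of the separate, simple predictive tool. Every name and data point here is hypothetical and chosen only for illustration.

```python
# Sketch: train a simple classifier on saved prefix states to predict a later "twist".
# Assumptions (not from the source): GPT-2 backbone, mean-pooled states, toy labels.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "gpt2"  # hypothetical stand-in model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
backbone = AutoModel.from_pretrained(MODEL_NAME)
backbone.eval()

# (first-half text, label of the twist in the final chapter) -- toy data
prefixes = [
    ("The heiress trusted her quiet gardener with every secret.", 0),
    ("The detective's partner kept disappearing on the night of each theft.", 1),
    ("The village doctor always arrived moments after each accident.", 1),
    ("The old captain told the same harmless story every evening.", 0),
]

def saved_state(text):
    """Mean-pool the final-layer hidden states over the prefix tokens."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        hidden = backbone(ids).last_hidden_state[0]  # (seq_len, hidden_size)
    return hidden.mean(dim=0).numpy()

X = [saved_state(text) for text, _ in prefixes]
y = [label for _, label in prefixes]

# The "separate, simple predictive tool": a linear classifier trained on the
# frozen, saved states only -- it never sees the second half of the story.
probe = LogisticRegression(max_iter=1000).fit(X, y)
print("training accuracy of the probe:", probe.score(X, y))
```

A linear classifier is the usual choice for this kind of probe: if it succeeds, the predictive information is linearly readable from the frozen state itself, rather than being learned by the downstream tool.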