Investigating the Utility of Context Tokens
A central research question for long-context language models is whether every token in the input contributes meaningfully to the model's predictions. Design an experiment to test the hypothesis that not all tokens in a long context are equally useful for a given prediction. Your response should clearly outline:
- The specific task the model will perform.
- The methodology you would use to measure the influence of different parts of the context.
- The specific results you would look for to either support or refute the hypothesis.
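One common methodology the outline above points toward is context ablation: remove one chunk of the context at a time and measure how much the model's answer score drops. The sketch below illustrates that loop with a toy scoring function; all names are hypothetical, and `score_answer` is a stand-in for a real LM's log-probability of the correct answer, which you would substitute with an actual model call.

```python
# Hypothetical context-ablation sketch. `score_answer` is a toy proxy
# (keyword overlap) standing in for a real LM's score of the correct
# answer given (context, question); swap in a genuine model call.

def score_answer(context_chunks, question_terms):
    """Toy proxy: count occurrences of question terms in the context."""
    joined = " ".join(context_chunks).lower()
    words = "".join(c for c in joined if c.isalnum() or c.isspace()).split()
    return sum(words.count(t) for t in question_terms)

def chunk_influence(chunks, question_terms):
    """Influence of each chunk = score drop when that chunk is ablated."""
    full = score_answer(chunks, question_terms)
    influence = []
    for i in range(len(chunks)):
        ablated = chunks[:i] + chunks[i + 1:]
        influence.append(full - score_answer(ablated, question_terms))
    return influence

chunks = [
    "The treaty was signed in 1848",      # contains the answer
    "Weather reports for the region",     # filler
    "Unrelated trivia about migration",   # filler
]
infl = chunk_influence(chunks, ["1848", "treaty"])
print(infl)  # the answer-bearing chunk should dominate
```

If the hypothesis holds, influence should concentrate on a small subset of chunks (here, the answer-bearing one), while ablating filler leaves the score essentially unchanged; roughly uniform influence across chunks would refute it.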
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
LLMs as Powerful In-Context Compressors
Sufficiency of Learned Features for Future Token Prediction
Analysis of Positional Bias in Context Utilization
A research team observes that a large language model's performance on a long-document question-answering task plateaus once the context exceeds 16,000 tokens. Even when the correct answer is placed at the 20,000th token, the model frequently fails to retrieve it, performing no better than when the answer is absent from the context entirely. Which of the following hypotheses about long-context utilization is most directly challenged by this finding?