Debugging Context Length Extension
Given the case study below, analyze the developer's flawed approach. Identify the fundamental error in their implementation and describe the correct method for scaling position indices.
A large language model was originally trained with a maximum context window of 2048 tokens. You are now tasked with enabling it to process a sequence of 4096 tokens using a technique that scales the position indices of the longer sequence to fit within the model's original learned range. How should the position index for the token at position 3072 in the 4096-token sequence be handled before being passed to the embedding layer?
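The scaling described above is linear position interpolation: each raw index is multiplied by the ratio of the trained context length to the target length, compressing the longer sequence into the range the model has already learned. A minimal sketch, using the lengths given in the question (2048 trained, 4096 target); the function name is illustrative, not from any particular library:

```python
def scale_position(pos: int, trained_len: int = 2048, target_len: int = 4096) -> float:
    """Linearly compress a raw position index into the model's trained range.

    Each index is scaled by trained_len / target_len, so the full
    0..target_len-1 range maps into 0..trained_len-1.
    """
    return pos * (trained_len / target_len)

# The token at position 3072 in the 4096-token sequence maps to 1536.0,
# well inside the original 0..2047 trained range.
scaled = scale_position(3072)
print(scaled)  # 1536.0
```

Note that with learned absolute position embeddings, a scaled index that lands on a non-integer value cannot be looked up directly; it would need rounding or interpolation between the two adjacent embedding vectors. In this example the factor is exactly 0.5, so 3072 maps cleanly to 1536.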