Diagnosing an Error in Positional Encoding Scaling
Based on the case study below, analyze the primary consequence of the engineer's error on the model's positional encodings when processing the new, longer sequences.
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A language model was originally trained with a maximum sequence length of 2048 tokens. To handle longer documents, its positional encodings are adjusted to accommodate a new sequence length of 8192 tokens by scaling the periods of the encoding functions proportionally to the length increase. If the original period for a particular dimension was 100, what is the adjusted period for that same dimension?
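The arithmetic behind this related question can be checked with a short sketch. This is an illustrative example of linear, position-interpolation-style scaling, where each dimension's period grows by the same factor as the context length; the function name and parameters are hypothetical, not from any specific library:

```python
def scaled_period(old_period: float, old_max_len: int, new_max_len: int) -> float:
    """Scale a sinusoidal encoding dimension's period by the
    context-length extension factor (a hypothetical helper)."""
    scale = new_max_len / old_max_len  # 8192 / 2048 = 4
    return old_period * scale

print(scaled_period(100, 2048, 8192))  # → 400.0
```

With a 4x longer context, a period of 100 becomes 400, so each sinusoid completes the same number of cycles over the new maximum length as it did over the old one.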
Analyzing Period Scaling Effects