Learn Before
Visualizing Positional Embedding Generalization
Positional embedding methods vary in how they handle sequence lengths that exceed their training data. In a visual representation of these methods across a range of positions, positions observed during training (e.g., blue points) can be distinguished from newly observed positions at test time (e.g., red points). An encoding model that strictly memorizes the points seen during training cannot generalize to new positions outside that domain. However, models designed to generalize can successfully process newly observed positions through mechanisms such as extrapolation and interpolation.
0
1
Tags
Foundations of Large Language Models
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences