Learn Before
Relative Positional Representations
These representations focus on the positional relationships between tokens rather than the positions of individual tokens. This works because pairwise positional relationships between input elements (direction and distance) can often be more informative than the absolute positions of the elements themselves. A minimal sketch of the idea is shown below.
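The following is a minimal, illustrative sketch of one way relative positional representations can be realized (in the spirit of learned relative-position biases added to attention logits). The class name, the clipping distance, and the overall structure are assumptions made for illustration, not a specific library's API.

```python
# Sketch: a learned bias indexed by the pairwise offset (j - i) between
# query position i and key position j, shared across absolute positions.
import torch
import torch.nn as nn


class RelativePositionBias(nn.Module):
    def __init__(self, max_distance: int, num_heads: int):
        super().__init__()
        self.max_distance = max_distance
        # One learned value per clipped relative offset, per attention head.
        self.embeddings = nn.Embedding(2 * max_distance + 1, num_heads)

    def forward(self, seq_len: int) -> torch.Tensor:
        positions = torch.arange(seq_len)
        # Pairwise offsets j - i capture both direction (sign) and distance.
        rel = positions[None, :] - positions[:, None]           # (L, L)
        # Clip so offsets beyond max_distance share the boundary embedding.
        rel = rel.clamp(-self.max_distance, self.max_distance)
        rel = rel + self.max_distance                            # shift to >= 0
        # (L, L, num_heads) -> (num_heads, L, L); added to attention logits.
        return self.embeddings(rel).permute(2, 0, 1)


# Usage: the bias depends only on token offsets, not absolute positions,
# so the same table can be reused for sequence lengths unseen in training.
bias = RelativePositionBias(max_distance=16, num_heads=8)
print(bias(seq_len=10).shape)  # torch.Size([8, 10, 10])
```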
Tags
Data Science
Foundations of Large Language Models Course
Computing Sciences
Foundations of Large Language Models
Ch.2 Generative Models - Foundations of Large Language Models
Learn After
A language model is trained exclusively on text sequences with a maximum length of 512 tokens. During evaluation, the model shows a significant drop in performance when processing documents that are 1000 tokens long. The engineers hypothesize the problem is related to how the model incorporates word order information. Which of the following changes to the model's architecture is most likely to resolve this specific issue?
Positional Encoding for Machine Translation
Positional Invariance in Self-Attention
Mechanism of Relative Positional Embedding