A researcher proposes a modified rotational transformation for encoding token positions, defined as $\mathbf{e}_i = \mathrm{Rot}(i\theta)$, where $i$ is the position and $\theta$ is a parameter. This new embedding is intended to replace the standard formulation $\mathbf{e}_i = \mathrm{Rot}(\mathbf{x}_i, i\theta)$, where $\mathbf{x}_i$ is the original token embedding. What is the primary conceptual flaw of using $\mathrm{Rot}(i\theta)$ as the final token representation in a sequence?
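A minimal sketch of the contrast in NumPy, assuming 2-dimensional embeddings; the parameter value THETA and the token vectors are hypothetical, and the position-only form Rot(iθ) is interpreted here as rotating a fixed reference vector, since the question leaves its operand unspecified:

```python
import numpy as np

def rot(theta: float) -> np.ndarray:
    """2D rotation matrix for angle theta (in radians)."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

THETA = 0.5  # hypothetical value of the positional parameter

def standard_embedding(x: np.ndarray, i: int) -> np.ndarray:
    """Standard form e_i = Rot(x_i, i*theta): rotate the token's own embedding."""
    return rot(i * THETA) @ x

def modified_embedding(i: int) -> np.ndarray:
    """Modified form e_i = Rot(i*theta): a function of position alone
    (interpreted here as rotating a fixed reference vector)."""
    return rot(i * THETA) @ np.array([1.0, 0.0])

x_apple  = np.array([0.9, 0.1])  # hypothetical token embeddings
x_banana = np.array([0.2, 0.8])

# Standard form: same position, different tokens, different embeddings.
print(standard_embedding(x_apple, 2))
print(standard_embedding(x_banana, 2))

# Modified form: every token at position 2 receives the identical vector,
# so the token's identity is discarded entirely.
print(modified_embedding(2))
```

The output makes the flaw concrete: under the modified form, 'apple' and 'banana' at the same position map to the same vector, because the representation carries positional information only.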
Tags
Ch.3 Prompting - Foundations of Large Language Models
Computing Sciences
Analysis in Bloom's Taxonomy
Related
In a system that uses a rotational transformation to encode token positions, the final embedding for a token is derived from its original embedding and its position index. Let the original embedding for the token 'apple' be $\mathbf{x}_{\text{apple}}$ and for 'banana' be $\mathbf{x}_{\text{banana}}$. If 'apple' is the second token (position index 2) and 'banana' is the fourth token (position index 4) in a sequence, which of the following correctly represents the final embeddings for 'apple' ($\mathbf{e}_{\text{apple}}$) and 'banana' ($\mathbf{e}_{\text{banana}}$)? Assume the transformation function is $\mathrm{Rot}(\cdot)$ and the positional parameter is $\theta$.
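A minimal sketch of that computation, again assuming 2-dimensional embeddings; the values of THETA, x_apple, and x_banana are hypothetical placeholders:

```python
import numpy as np

def rot(theta: float) -> np.ndarray:
    """2D rotation matrix for angle theta (in radians)."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

THETA = np.pi / 8                  # hypothetical positional parameter
x_apple  = np.array([1.0, 0.0])    # hypothetical original embeddings
x_banana = np.array([0.0, 1.0])

# e_token = Rot(x_token, i*theta): rotate the original embedding by i*theta.
e_apple  = rot(2 * THETA) @ x_apple   # 'apple' at position index 2
e_banana = rot(4 * THETA) @ x_banana  # 'banana' at position index 4

print(e_apple)   # Rot(x_apple, 2*theta)
print(e_banana)  # Rot(x_banana, 4*theta)
```

Under this reading, the answer pattern is $\mathbf{e}_{\text{apple}} = \mathrm{Rot}(\mathbf{x}_{\text{apple}}, 2\theta)$ and $\mathbf{e}_{\text{banana}} = \mathrm{Rot}(\mathbf{x}_{\text{banana}}, 4\theta)$: each token keeps its own embedding and is rotated by an angle proportional to its position.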
Match each symbol from the rotational position encoding formula, $\mathbf{e}_i = \mathrm{Rot}(\mathbf{x}_i, i\theta)$, to its correct description. This formula is used to create a final embedding ($\mathbf{e}_i$) by applying a rotational transformation ($\mathrm{Rot}(\cdot)$) to an initial token embedding ($\mathbf{x}_i$) based on its position ($i$).
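For the matching exercise, one plausible rendering of the formula with every symbol labeled in place (LaTeX, requires amsmath):

```latex
\[
\underbrace{\mathbf{e}_i}_{\text{final embedding}}
  \;=\;
\underbrace{\mathrm{Rot}}_{\substack{\text{rotational}\\\text{transformation}}}
\bigl(
  \underbrace{\mathbf{x}_i}_{\substack{\text{initial token}\\\text{embedding}}},\;
  \underbrace{i}_{\text{position}}\,\theta
\bigr)
\]
```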