Learn Before
Definition of RoPE-Encoded Token Embedding Notation (e_i)
In Rotary Positional Embeddings (RoPE), the final embedding for a token at position i is denoted as e_i. This embedding is the result of applying the RoPE transformation, f(·), to the original token embedding x_i with the positional angle iθ. The parameter set is defined as Θ = {θ_j = 10000^(-2(j-1)/d), j ∈ [1, 2, ..., d/2]}. The formal definition of the embedding is: e_i = f(x_i, iθ).
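A minimal sketch of this definition, assuming a NumPy-based implementation and the consecutive-pair complex rotation described in the related cards below (the function name rope_encode is illustrative, not from the source):

import numpy as np

def rope_encode(x_i, i):
    """Apply the RoPE transformation f to the token embedding x_i at position i."""
    d = x_i.shape[0]                       # embedding dimension (must be even)
    j = np.arange(1, d // 2 + 1)           # j in [1, 2, ..., d/2]
    theta = 10000.0 ** (-2 * (j - 1) / d)  # theta_j = 10000^(-2(j-1)/d)
    angles = i * theta                     # positional angles i * theta_j
    pairs = x_i.reshape(d // 2, 2)         # pair consecutive dimensions
    z = pairs[:, 0] + 1j * pairs[:, 1]     # reinterpret each pair as a complex number
    rotated = z * np.exp(1j * angles)      # rotate each pair by its angle i * theta_j
    return np.stack([rotated.real, rotated.imag], axis=1).reshape(d)  # e_i

# e_2 for a 4-dimensional embedding at position index 2:
e_2 = rope_encode(np.array([1.0, 2.5, -0.5, 4.0]), i=2)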
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Definition of RoPE-Encoded Token Embedding Notation (e_i)
A 6-dimensional token embedding vector, represented as v = [v1, v2, v3, v4, v5, v6], is being prepared for a rotational transformation to encode its position. Which of the following correctly describes how this vector is reinterpreted for the transformation process?
Vector Reinterpretation for Rotational Transformation
To prepare a 4-dimensional token embedding vector v = [1.0, 2.5, -0.5, 4.0] for a rotational transformation, it is first reinterpreted as a 2-dimensional vector of complex numbers. The first complex number in this new vector is 1.0 + 2.5i. The second complex number is ____. (A sketch of this reinterpretation appears after this list.)
When applying a rotational transformation to a 128-dimensional token embedding, the transformation applied to the complex number formed by the 1st and 2nd elements is dependent on the transformation applied to the complex number formed by the 3rd and 4th elements.
Formula for Multi-dimensional RoPE in Complex Space
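The reinterpretation step described in the cards above can be sketched in a few lines of Python (an illustration under the consecutive-pair assumption the cards state, not an official answer key):

import numpy as np

v = np.array([1.0, 2.5, -0.5, 4.0])  # 4-dimensional token embedding
pairs = v.reshape(-1, 2)             # group elements (1st, 2nd), (3rd, 4th), ...
z = pairs[:, 0] + 1j * pairs[:, 1]   # one complex number per pair
print(z)

# Each complex component is then rotated by its own positional angle,
# independently of the other pairs.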
Learn After
In a system that uses a rotational transformation to encode token positions, the final embedding for a token is derived from its original embedding and its position index. Let the original embedding for the token 'apple' be x_apple and for 'banana' be x_banana. If 'apple' is the second token (position index 2) and 'banana' is the fourth token (position index 4) in a sequence, which of the following correctly represents the final embeddings for 'apple' (e_2) and 'banana' (e_4)? Assume the transformation function is f and the positional parameter is θ.
A researcher proposes a modified rotational transformation for encoding token positions, defined as e_i = f(iθ), where i is the position and θ is a parameter. This new embedding is intended to replace the standard formulation e_i = f(x_i, iθ), where x_i is the original token embedding. What is the primary conceptual flaw of using f(iθ) as the final token representation in a sequence?
Match each symbol from the rotational position encoding formula, e_i = f(x_i, iθ), to its correct description. This formula is used to create a final embedding (e_i) by applying a rotational transformation (f) to an initial token embedding (x_i) based on its position (i).
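For reference, the Learn Before definition e_i = f(x_i, iθ) instantiates directly for the first question above; as a worked illustration (not the card's official answer key), 'apple' at position index 2 and 'banana' at position index 4 give e_2 = f(x_apple, 2θ) and e_4 = f(x_banana, 4θ).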