Evaluating a Positional Encoding Implementation
An engineer is analyzing a language model's behavior. They observe that the computed inner product between the encoded token for 'deep' and the encoded token for 'learning' differs when processing the phrase 'a course on deep learning' versus the phrase 'advances in deep learning for science'. In both phrases, 'learning' immediately follows 'deep', so the relative offset between the two tokens is identical; only their absolute positions differ.
Based on this observation, evaluate whether the model's positional encoding scheme correctly implements the principle that the inner product between two rotationally encoded tokens should depend only on their relative offset. Justify your conclusion.
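The scenario above can be checked numerically. The sketch below is a minimal rotary-style (RoPE) encoding, assuming random placeholder embeddings for 'deep' and 'learning' and assumed absolute positions (3–4 in one phrase, 4–5 in the other); a correct implementation makes the two inner products equal because only the offset of 1 matters:

```python
import numpy as np

def rope(x, pos, base=10000.0):
    """Rotary position encoding: rotate each dimension pair (2i, 2i+1)
    of x by the angle pos * base**(-2i/d)."""
    d = x.shape[0]
    assert d % 2 == 0
    i = np.arange(d // 2)
    theta = pos * base ** (-2 * i / d)
    cos, sin = np.cos(theta), np.sin(theta)
    out = np.empty_like(x)
    out[0::2] = x[0::2] * cos - x[1::2] * sin
    out[1::2] = x[0::2] * sin + x[1::2] * cos
    return out

rng = np.random.default_rng(0)
d = 8  # toy embedding dimension
deep, learning = rng.normal(size=d), rng.normal(size=d)

# Assumed absolute positions of 'deep learning' in each phrase;
# the relative offset is 1 in both cases.
ip_a = rope(deep, 3) @ rope(learning, 4)
ip_b = rope(deep, 4) @ rope(learning, 5)
print(np.isclose(ip_a, ip_b))  # True under a correct rotary scheme
```

Because each pairwise rotation satisfies ⟨R(mθ)x, R(nθ)y⟩ = ⟨x, R((n−m)θ)y⟩, the summed inner product depends only on n − m. The engineer's observation (different values for the same offset) therefore indicates the scheme is not correctly implemented.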
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Consider two token embeddings, x and y, encoded with a rotational position scheme at positions 10 and 7, respectively. Their resulting inner product is calculated. If these same two tokens are instead placed at positions 25 and 22, how would the new inner product compare to the original one?
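A minimal numerical sketch of this related question, assuming a toy rotary encoding and random placeholder vectors for x and y (both placements share the same offset of 3, so a correct scheme yields equal inner products):

```python
import numpy as np

def rope(x, pos, base=10000.0):
    # Rotate consecutive dimension pairs of x by angles pos * base**(-2i/d).
    d = x.shape[0]
    i = np.arange(d // 2)
    theta = pos * base ** (-2 * i / d)
    out = np.empty_like(x)
    out[0::2] = x[0::2] * np.cos(theta) - x[1::2] * np.sin(theta)
    out[1::2] = x[0::2] * np.sin(theta) + x[1::2] * np.cos(theta)
    return out

rng = np.random.default_rng(1)
x, y = rng.normal(size=8), rng.normal(size=8)

ip_1 = rope(x, 10) @ rope(y, 7)   # positions 10 and 7: offset 3
ip_2 = rope(x, 25) @ rope(y, 22)  # positions 25 and 22: same offset 3
print(np.isclose(ip_1, ip_2))     # True: the inner product is unchanged
```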
Invariance in Rotational Position Encodings