Short Answer

Equivalence of RoPE Modification Strategies

An LLM developer is experimenting with a new way to handle positional information. They have two potential implementation strategies:

Strategy A: Create a completely new function, Ro_new, which takes a token embedding and a position index. Internally, this function first doubles the position index and then applies the standard rotational transformation.

Strategy B: Use the original, unmodified RoPE function, Ro. However, before passing the position index to this function, they preprocess it by doubling its value.

Based on the principle of general equivalence for modified rotary embeddings, are these two strategies functionally identical? Explain your reasoning.
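To make the comparison concrete, here is a minimal sketch of the two strategies against a plain RoPE rotation. The function names (`rope`, `ro_new`, `strategy_b`) and the pairwise-rotation layout are illustrative assumptions, not the course's exact implementation; the point is only that both strategies reduce to evaluating the unmodified rotation at position 2i.

```python
import math

def rope(x, pos, base=10000.0):
    # Standard RoPE: rotate each consecutive pair of dimensions
    # (x[2k], x[2k+1]) by the angle pos * base**(-2k/d).
    d = len(x)
    out = [0.0] * d
    for k in range(d // 2):
        theta = pos * base ** (-2 * k / d)
        c, s = math.cos(theta), math.sin(theta)
        x1, x2 = x[2 * k], x[2 * k + 1]
        out[2 * k] = x1 * c - x2 * s
        out[2 * k + 1] = x1 * s + x2 * c
    return out

def ro_new(x, pos):
    # Strategy A: a new function that doubles the position index
    # internally, then applies the standard rotation.
    return rope(x, 2 * pos)

def strategy_b(x, pos):
    # Strategy B: preprocess the position (double it), then call
    # the original, unmodified RoPE function.
    doubled = 2 * pos
    return rope(x, doubled)
```

Because both paths evaluate `rope(x, 2 * pos)`, their outputs agree for every embedding and position index, e.g. `ro_new(x, 5) == strategy_b(x, 5)` up to floating-point precision.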

Updated 2025-10-04

Tags

Ch.3 Prompting - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science