1Cademy - Formula for Relative Position Scaled by Sinusoidal Wavelength

Learn Before

Kerple

Formula for Relative Position Scaled by Sinusoidal Wavelength

This formula computes a value from the relative distance $(i - j)$ between two sequence positions $i$ and $j$. The distance is divided by $10000^{2k/d}$, a term that sets the wavelength of each sinusoid in the positional encoding scheme of the original Transformer model. The full formula is $(i-j)/10000^{2k/d}$. Here, $k$ typically denotes the dimension index within the embedding, and $d$ is the total dimensionality of the model's embeddings. Because the denominator grows with $k$, the same offset yields a smaller scaled value (in magnitude) at higher dimension indices. The resulting value is commonly passed to sine and cosine functions to produce a positional encoding vector or bias.
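As a rough illustration of the formula, the sketch below evaluates $(i-j)/10000^{2k/d}$ and then feeds the result to sine and cosine. The function names `scaled_relative_position` and `relative_features` are hypothetical, and two choices are assumptions rather than part of the formula: $k$ runs over $0, \dots, d/2 - 1$, and the sine and cosine values are interleaved.

```python
import math


def scaled_relative_position(i, j, k, d, base=10000.0):
    """Return (i - j) / base**(2k/d) for positions i, j and dimension index k."""
    return (i - j) / (base ** (2 * k / d))


def relative_features(i, j, d, base=10000.0):
    """Sine/cosine features of the scaled offset.

    Assumption: k ranges over 0 .. d/2 - 1 and the output interleaves
    sin and cos, giving a vector of length d (for even d).
    """
    feats = []
    for k in range(d // 2):
        angle = scaled_relative_position(i, j, k, d, base)
        feats.extend([math.sin(angle), math.cos(angle)])
    return feats


if __name__ == "__main__":
    d = 8
    # The same offset (i - j = 3) shrinks as the dimension index k grows.
    for k in range(d // 2):
        print(k, scaled_relative_position(5, 2, k, d))
```

With `d = 8`, the denominator is 1 at `k = 0` and about 100 at `k = 2`, so low dimension indices respond to small changes in offset while high indices change slowly.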

15 days ago

References

Reference of Foundations of Large Language Models Course

Learn After

In the formula (i - j) / 10000^(2k/d), used for calculating a scaled relative position, k represents a specific dimension index within a d-dimensional embedding. Analyze the relationship between the dimension index k and the denominator 10000^(2k/d). What is the effect of this relationship on the resulting positional signal?
Debugging a Positional Encoding Implementation
Analysis of Wavelength Variation
