Learn Before
An engineer is implementing a transformer model with an embedding dimensionality of d = 512. For the positional information, they use a method where frequency parameters θ_k are calculated using the formula: θ_k = 10000^(-2(k-1)/d). What is the correct value for the frequency parameter θ_k where the component index is k = 129?
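One way to check the arithmetic is the minimal sketch below. It assumes k is 1-indexed exactly as the formula is written; the function name `theta` and its `base` parameter are illustrative and not from the source. For k = 129 and d = 512 the exponent is -2(128)/512 = -0.5, so θ_129 = 10000^(-0.5) = 1/100.

```python
import math


def theta(k: int, d: int, base: float = 10000.0) -> float:
    """Frequency parameter theta_k = base^(-2(k-1)/d) for a 1-indexed component k.

    The name `theta` and the `base` argument are hypothetical helpers for
    illustration; only the formula itself comes from the question.
    """
    return base ** (-2 * (k - 1) / d)


d = 512
theta_129 = theta(129, d)

# The exponent is -2 * 128 / 512 = -0.5, so the result should be close to 1/100.
assert math.isclose(theta_129, 0.01)
print(theta_129)
```

Under the same assumption, `theta(1, d)` is 1.0 because the exponent is zero, and the value shrinks as k grows because the exponent becomes more negative.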
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
An engineer is implementing a transformer model with an embedding dimensionality of d = 512. For the positional information, they use a method where frequency parameters θ_k are calculated using the formula: θ_k = 10000^(-2(k-1)/d). What is the correct value for the frequency parameter θ_k where the component index is k = 129?
Consider the formula for calculating frequency parameters in a positional embedding scheme: θ_k = 10000^(-2(k-1)/d), where d is the embedding dimension and k is the component index. According to this formula, as the component index k increases, the value of the frequency parameter θ_k also increases.
Impact of Base Value on Frequency Parameters