Learn Before
A transformer model's architecture specifies that its token embeddings have a dimensionality of 768. This model uses a rotational method to encode positional information, where the transformations are governed by a specific parameter vector. Based on this information, how many components does this parameter vector contain?
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Consequences of Modifying RoPE Parameters
In a model that uses rotational transformations to encode positional information, if the token embeddings have a dimensionality of d, the parameter vector θ that governs these transformations will also have d components. The rotation is built from d/2 distinct frequencies, each applied to one pair of embedding dimensions, so the elementwise angle vector spans all d dimensions; for 768-dimensional embeddings, θ therefore has 768 components.
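Below is a minimal sketch of this parameterization, assuming NumPy and the standard RoPE frequency schedule θ_i = 10000^(−2i/d); the function name rope_angle_vector is illustrative, not from the course material. It shows that for d = 768 the per-position angle vector governing the rotations has 768 components (384 base frequencies, each repeated across its dimension pair).

```python
import numpy as np

def rope_angle_vector(d: int, position: int, base: float = 10000.0) -> np.ndarray:
    """Sketch of the per-position rotation-angle vector used by rotary
    positional embeddings (RoPE), under the standard frequency schedule.

    The rotation defines d/2 base frequencies theta_i = base^(-2i/d);
    each frequency is applied to one pair of embedding dimensions, so
    the elementwise angle vector covers all d dimensions.
    """
    # d/2 distinct frequencies, one per dimension pair
    inv_freq = base ** (-np.arange(0, d, 2) / d)   # shape: (d/2,)
    # duplicate each frequency across its dimension pair -> d components
    angles = position * np.repeat(inv_freq, 2)     # shape: (d,)
    return angles

# For 768-dimensional token embeddings, the governing vector has 768 components.
theta = rope_angle_vector(d=768, position=5)
print(theta.shape)  # (768,)
```

In practice, implementations typically precompute cos/sin tables from these angle vectors and apply them elementwise to the query and key vectors at each position.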