Learn Before
Although the prefilling phase of an autoregressive model and the encoding process of a model like BERT rely on essentially the same mathematical operations (Transformer self-attention layers), their approaches to incorporating contextual information from the input sequence differ fundamentally: prefilling uses causal (masked) attention, so each token's representation draws only on the tokens that precede it, whereas BERT's encoder uses bidirectional attention, so each token's representation draws on the entire sequence.
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Consider two different methods for creating contextualized numerical representations of words in a sentence. Method 1 generates a representation for each word based only on the words that precede it. Method 2 generates a representation for each word based on all other words in the sentence, both preceding and succeeding it. Which statement accurately compares these two methods to the processes found in large-scale language models?
Directionality in Contextual Representations
Although the prefilling phase of an autoregressive model and the encoding process of a model like BERT rely on essentially the same mathematical operations (Transformer self-attention layers), their approaches to incorporating contextual information from the input sequence differ fundamentally: prefilling uses causal (masked) attention, so each token's representation draws only on the tokens that precede it, whereas BERT's encoder uses bidirectional attention, so each token's representation draws on the entire sequence.
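The directionality difference above comes down to the attention mask applied before the softmax. The sketch below, a minimal illustration using NumPy (the function name `attention_weights` and the toy scores are assumptions for this example, not part of any model's API), contrasts a causal mask (prefilling-style, each token attends only to itself and earlier tokens) with no mask (BERT-style, each token attends to the whole sequence):

```python
import numpy as np

def attention_weights(scores, causal):
    """Row-wise softmax over attention scores, optionally with a causal mask."""
    n = scores.shape[0]
    if causal:
        # Mask out the strictly upper triangle: position i may only
        # attend to positions 0..i (itself and earlier tokens).
        mask = np.triu(np.ones((n, n), dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
    # Numerically stable row-wise softmax; exp(-inf) gives weight 0.
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
scores = rng.normal(size=(4, 4))  # toy scores for a 4-token sequence

causal = attention_weights(scores, causal=True)           # prefilling-style
bidirectional = attention_weights(scores, causal=False)   # BERT-style

# The first token in the causal case can only attend to itself.
print(np.allclose(causal[0], [1.0, 0.0, 0.0, 0.0]))  # True
# Bidirectionally, even the first token puts weight on later tokens.
print((bidirectional[0, 1:] > 0).all())  # True
```

Note that the score computation itself is identical in both branches; only the mask differs, which is exactly why the two processes share their mathematics but not their use of context.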