A key characteristic of arbitrary order prediction is that, at any given step, predicting the next token is analogous to a masked language modeling task: the model conditions on the tokens it has already determined, which can lie on both sides of the target position, so the context is bidirectional.
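The bidirectional context can be made concrete with a small sketch. It assumes 1-indexed positions and uses a hypothetical helper, `context_at_each_step`, which is not from any library. For each step of a chosen prediction order, it lists the already-determined positions to the left and to the right of the target. The order used here is an illustrative assumption in which positions 1, 2, and 5 are determined before position 3.

```python
def context_at_each_step(order):
    """For each step of a prediction order, report which already-determined
    positions lie to the left and to the right of the target position."""
    steps = []
    seen = []  # positions determined in earlier steps
    for target in order:
        left = sorted(p for p in seen if p < target)
        right = sorted(p for p in seen if p > target)
        steps.append({"target": target, "left": left, "right": right})
        seen.append(target)
    return steps


# Illustrative order for a 5-token sequence (an assumption, not a fixed rule):
# positions 1, 2 and 5 are determined first, then 3, then 4.
order = [1, 2, 5, 3, 4]
for step in context_at_each_step(order):
    print(step)
```

At the step that targets position 3, the helper reports positions 1 and 2 on the left and position 5 on the right, so the prediction resembles filling a masked slot with context on both sides. In a strict left-to-right order such as `[1, 2, 3, 4, 5]`, the right-hand list would be empty at every step.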
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A language model is generating a sequence of tokens. It has already determined the tokens at original positions 1, 2, and 5, and is now in the process of predicting the token for original position 3. This specific prediction step is analogous to a masked language modeling task. Which statement best analyzes the reason for this analogy?
Analyzing Prediction Context in Arbitrary Order Generation