Identifying Language Modeling Approach
An AI research team is training a language model. During a specific training step, the model's objective is to predict the token at position i in a sequence. To do this, the model is only given access to the tokens from the beginning of the sequence up to position i-1. Information from any tokens at or after position i is deliberately withheld for this prediction. Based on this description of the training objective, which language modeling approach is being implemented? Justify your answer by explaining how the model's access to contextual information defines the approach.
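The access pattern in this scenario can be made concrete with a small sketch. It is only an illustration, not the team's actual training code: the helper names `prediction_examples` and `visibility_mask` are hypothetical, tokens are plain strings, and positions are assumed to be 0-based, so "up to position i-1" corresponds to the slice `tokens[:i]`.

```python
from typing import List, Sequence, Tuple


def prediction_examples(tokens: Sequence[str]) -> List[Tuple[List[str], str]]:
    """Pair each target token with the only context the model is allowed to see.

    Hypothetical helper: for a target at position i, the visible context is
    positions 0 .. i-1. The token at i and everything after it is withheld.
    """
    examples = []
    for i in range(1, len(tokens)):
        visible = list(tokens[:i])  # positions 0 .. i-1
        target = tokens[i]          # position i, never part of the input
        examples.append((visible, target))
    return examples


def visibility_mask(n: int) -> List[List[bool]]:
    """mask[i][j] is True when the prediction for position i may use token j.

    Under the scenario's rule this holds only for j < i, which gives a
    strictly lower-triangular pattern.
    """
    return [[j < i for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    for context, target in prediction_examples(["the", "cat", "sat", "down"]):
        print(f"context={context} -> predict {target!r}")
    for row in visibility_mask(4):
        print(["x" if allowed else "." for allowed in row])
```

Each row of the mask shows which earlier positions feed one prediction. No row ever marks its own position or any later one, which is the property the question asks you to reason about.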
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A language model is being developed specifically for a task that involves generating long, coherent passages of text, such as writing a story from an initial prompt. The model must generate the text sequentially, predicting each new word based only on the words that came before it. Which training approach is inherently structured for this type of task, and what is the key reason?
Match each characteristic to the language modeling approach it describes. The two approaches are 'Causal Language Modeling' and 'General Masked Language Modeling'.