Analyzing a Language Model's Training Process
Based on the training instance described below, is the engineer's conclusion that there is an error correct? Justify your answer by explaining the training principle at play.
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
During a language model's training, a specific token is chosen from an input sequence to be predicted. In a small percentage of cases, the training strategy requires this chosen token to be left as-is, without being replaced. Consider the original sequence:
[CLS] The quick brown fox jumps . [SEP]
If the token 'fox' is selected for prediction but falls under the rule where it remains unchanged, what is the final input sequence fed to the model for this training step?
Analyzing a Language Model's Training Process
Language Model Training Task
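The "left as-is" rule in the related question above can be illustrated with a minimal sketch. It assumes a BERT-style masked language modeling setup in which a selected token is replaced by [MASK] most of the time, replaced by a random token some of the time, and otherwise kept unchanged. The function name `corrupt_selected_token`, the parameters `p_mask` and `p_random`, the default proportions, and the toy vocabulary are all hypothetical choices for illustration; the page itself says only that the unchanged case covers "a small percentage" of selections.

```python
import random

MASK = "[MASK]"


def corrupt_selected_token(tokens, index, vocab, rng, p_mask=0.8, p_random=0.1):
    """Return the model input after applying the selection rule at `index`.

    Assumed (not stated on this page) default proportions:
      - with probability p_mask: replace the token with [MASK]
      - with probability p_random: replace it with a random vocabulary token
      - otherwise: leave the token as-is
    In every branch the model is still trained to predict the ORIGINAL token
    at `index`; only the input differs.
    """
    out = list(tokens)  # copy so the original sequence is not modified
    roll = rng.random()  # uniform in [0, 1)
    if roll < p_mask:
        out[index] = MASK
    elif roll < p_mask + p_random:
        out[index] = rng.choice(vocab)
    # else: unchanged branch, the input token stays exactly as it was
    return out


original = ["[CLS]", "The", "quick", "brown", "fox", "jumps", ".", "[SEP]"]
fox_index = original.index("fox")
toy_vocab = ["cat", "dog", "tree"]  # hypothetical toy vocabulary

# Force the unchanged branch by setting both replacement probabilities to zero.
unchanged = corrupt_selected_token(
    original, fox_index, toy_vocab, random.Random(0), p_mask=0.0, p_random=0.0
)
# Force the [MASK] branch for comparison.
masked = corrupt_selected_token(
    original, fox_index, toy_vocab, random.Random(0), p_mask=1.0, p_random=0.0
)
```

In the unchanged branch the input sequence is identical to the original, yet the position of 'fox' is still a prediction target. Under this assumption, seeing an unmasked token at a predicted position is expected behavior and not necessarily a sign of a bug.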