Learn Before
A language model is being trained to predict the next word in a sequence. The training process aims to minimize a loss value, which measures the difference between the model's predicted probability distribution for the next word and the actual correct word. Consider two separate predictions for the next word after the phrase 'The sun is shining...':
- Prediction A: The model assigns a probability of 0.75 to the correct word, 'brightly'.
- Prediction B: The model assigns a probability of 0.15 to the correct word, 'brightly'.
Which of the following statements accurately analyzes the loss values for these two predictions?
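The comparison can be checked numerically. The sketch below assumes the loss is the standard cross-entropy against a one-hot target, which reduces to the negative natural log of the probability assigned to the correct word. The question does not name the loss function, so this is an assumption, and the helper name `next_word_loss` is hypothetical.

```python
import math


def next_word_loss(p_correct: float) -> float:
    """Negative log-likelihood of the correct word (natural log).

    Assumes a one-hot target, so cross-entropy reduces to -log(p_correct).
    """
    if not 0.0 < p_correct <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log(p_correct)


# Probabilities assigned to the correct word 'brightly' in the question.
loss_a = next_word_loss(0.75)
loss_b = next_word_loss(0.15)

print(f"Prediction A loss: {loss_a:.3f}")  # 0.288
print(f"Prediction B loss: {loss_b:.3f}")  # 1.897
```

Under this assumption, Prediction B incurs the larger loss: the lower the probability assigned to the correct word, the higher the penalty. The loss also grows faster than linearly as that probability approaches zero.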
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Total Loss Calculation for a Token Sequence
Evaluating Model Prediction Quality
Defining the Ground Truth Distribution