1Cademy - A language model is being trained to predict the next word in a sequence. The training process aims to minimize a loss value, which measures the difference between the models predicted probability distribution for the next word and the actual correct word. Consider two separate predictions for the next word after the phrase The sun is shining...:<br><br>- **Prediction A:** The model assigns a probability of 0.75 to the correct word, brightly.<br>- **Prediction B:** The model assigns a probability of

Learn Before

Loss Function for Language Modeling

Multiple Choice

A language model is being trained to predict the next word in a sequence. The training process aims to minimize a loss value, which measures the difference between the model's predicted probability distribution for the next word and the actual correct word. Consider two separate predictions for the next word after the phrase 'The sun is shining...':

Prediction A: The model assigns a probability of 0.75 to the correct word, 'brightly'.
Prediction B: The model assigns a probability of

Updated 2025-09-26

Contributors are:

Who are from:

Learn Before

Related