Interpreting a Model's Training Step
Analyze the following training scenario for a language model and explain the next step in the optimization process.
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
A language model is being trained to predict the next word in a sentence. For the input context 'The sun is shining...', the ideal (target) probability distribution, denoted p, gives a high probability to the word 'brightly'. The model's performance is measured by a loss function that compares the model's predicted probability distribution, q_θ, to the target distribution.
Consider two different sets of model parameters, θ₁ and θ₂:
- With parameters θ₁, the model's distribution predicts 'brightly' with a high probability.
- With parameters θ₂, the model's distribution predicts 'darkly' with a high probability.
Which of the following statements correctly analyzes the relationship between the parameters and the loss function for this specific input?
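The comparison can be made concrete with a small numerical sketch. The scenario does not name the loss function, so cross-entropy is assumed here, along with hypothetical toy distributions for q_θ₁ and q_θ₂; the point is only that parameters which concentrate probability on the target word yield a lower loss.

```python
import math

# Toy vocabulary; the target distribution for "The sun is shining..."
# puts all its probability mass on "brightly" (one-hot).
target = {"brightly": 1.0, "darkly": 0.0, "dimly": 0.0}

def cross_entropy(p, q):
    """H(p, q) = -sum over words w of p(w) * log q(w); lower is better."""
    return -sum(p[w] * math.log(q[w]) for w in p if p[w] > 0)

# Hypothetical predicted distributions for the two parameter sets.
q_theta1 = {"brightly": 0.90, "darkly": 0.05, "dimly": 0.05}  # θ₁
q_theta2 = {"brightly": 0.05, "darkly": 0.90, "dimly": 0.05}  # θ₂

loss1 = cross_entropy(target, q_theta1)  # -log(0.90) ≈ 0.105
loss2 = cross_entropy(target, q_theta2)  # -log(0.05) ≈ 2.996

# θ₁ assigns high probability to the correct word, so its loss is lower;
# a gradient-descent step would move the parameters toward behavior like θ₁.
assert loss1 < loss2
```

Under this assumed loss, an optimizer would update the parameters in the direction that decreases the loss, i.e., away from θ₂-like behavior and toward θ₁-like behavior for this input.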