Learn Before
Model Parameter Adjustment during Training
A model is being trained for a text-labeling task. For the input word 'Paris', the correct label is B-LOC. The model's output layer produces the following probability distribution for this word:
B-LOC: 0.3, B-ORG: 0.6, O: 0.1
Describe the primary goal of the training algorithm when it adjusts the model's internal parameters in response to this specific output.
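To make the training signal concrete, here is a minimal sketch of the negative log-likelihood (cross-entropy) loss for this single prediction, assuming the probabilities shown above; the dictionary and variable names are illustrative, not from any particular framework:

```python
import math

# Hypothetical output distribution from the card; the gold label is B-LOC
probs = {"B-LOC": 0.3, "B-ORG": 0.6, "O": 0.1}
gold = "B-LOC"

# Negative log-likelihood of the correct tag: the quantity training minimizes
loss = -math.log(probs[gold])
print(round(loss, 4))  # 1.204
```

Minimizing this loss pushes the model to raise the probability assigned to B-LOC (and, since the distribution must sum to 1, to lower the probability of B-ORG and O).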
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Negative Log-Likelihood Loss for NER
A model for Named Entity Recognition is being trained. During one step, it processes a sentence and produces the probability distributions below for two of the words. The training process aims to adjust the model's parameters by calculating a loss based on the predicted probability of the correct, ground-truth tag for each word.
Word: 'Anya' (ground-truth tag: I-PER)
B-PER: 0.05, I-PER: 0.85, O: 0.10
Word: 'Berlin' (ground-truth tag: B-LOC)
B-LOC: 0.10, B-ORG: 0.45, O: 0.45
Based on this information, which word's prediction will contribute a larger value to the overall training loss for this step, and why?
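The per-word contributions can be checked numerically. This is a minimal sketch using the distributions from the card; the data structure is illustrative:

```python
import math

# (distribution, gold tag) for each word, as given in the card
preds = {
    "Anya":   ({"B-PER": 0.05, "I-PER": 0.85, "O": 0.10}, "I-PER"),
    "Berlin": ({"B-LOC": 0.10, "B-ORG": 0.45, "O": 0.45}, "B-LOC"),
}

# Negative log-likelihood of the gold tag for each word
losses = {word: -math.log(dist[gold]) for word, (dist, gold) in preds.items()}
for word, loss in losses.items():
    print(f"{word}: {loss:.3f}")
# 'Berlin' dominates: -log(0.10) ≈ 2.303 vs -log(0.85) ≈ 0.163
```

Because the loss is the negative log of the probability assigned to the correct tag, the word whose gold tag received the lower probability ('Berlin', at 0.10) contributes far more to the step's loss than 'Anya' (0.85).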
Model Parameter Adjustment during Training
Consider a model being trained to assign a category tag (e.g., 'Person', 'Location', 'Other') to each word in a sentence. If, for a specific word, the model's output assigns a very high probability (e.g., 0.98) to the correct, ground-truth tag, the training process will make a large adjustment to the model's parameters based on this specific word's prediction.
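Under standard cross-entropy training, the size of the update tied to one prediction can be sketched numerically; this assumes a softmax output layer, and the names below are illustrative:

```python
import math

# Hypothetical confident, correct prediction: p = 0.98 on the gold tag
p_correct = 0.98

# Negative log-likelihood is already close to its minimum of zero
loss = -math.log(p_correct)
# For softmax + cross-entropy, the gradient on the gold-tag logit is p - 1
grad = p_correct - 1.0
print(f"loss={loss:.4f}, grad={grad:.2f}")  # loss=0.0202, grad=-0.02
```

A nearly correct prediction therefore produces a near-zero loss and a tiny gradient, so the update driven by this particular word is small, not large.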