Learn Before
Calculating Model Training Loss
A model is being trained for a token classification task. For a given four-token input sequence, the correct tags have been identified. The model has produced a probability for the correct tag at each of the four positions. Your task is to calculate the average negative log-likelihood loss for this entire sequence based on the data provided below. Use the natural logarithm (ln) for your calculations and show the main steps.
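As a reminder of the quantity being asked for, the average negative log-likelihood over a four-token sequence can be written as below. The symbols are assumptions introduced here for illustration: p_i stands for the probability the model assigned to the correct tag at position i, and N = 4 is the sequence length.

```latex
\mathcal{L} \;=\; -\frac{1}{N}\sum_{i=1}^{N} \ln p_i
          \;=\; -\frac{1}{4}\left(\ln p_1 + \ln p_2 + \ln p_3 + \ln p_4\right)
```

The main steps are therefore: take the natural log of each of the four probabilities, sum them, negate the sum, and divide by 4.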
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Calculating Model Training Loss
A model is being trained for a text labeling task where the goal is to maximize the probability assigned to the correct label for each word. The training loss is calculated as the average of the negative logarithm of these probabilities. Consider the model's performance on one sentence, evaluated by two different sets of parameters (Model A and Model B). The table below shows the probability each model assigned to the correct label for each of the seven words in the sentence.
Word      Model A Probability   Model B Probability
Word 1    0.9                   0.8
Word 2    0.8                   0.6
Word 3    0.7                   0.6
Word 4    0.9                   0.8
Word 5    0.9                   0.8
Word 6    0.1                   0.7
Word 7    0.9                   0.8

Based on this data, which model would have a lower training loss for this specific sentence, and why?
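One way to check the comparison numerically is the minimal Python sketch below. It assumes the loss is exactly the average of -ln(p) over the seven correct-label probabilities, as described above; the function name `average_nll` and the variable names are hypothetical, chosen only for this illustration.

```python
import math


def average_nll(probs):
    """Average negative log-likelihood (natural log) of correct-label probabilities."""
    return -sum(math.log(p) for p in probs) / len(probs)


# Probabilities assigned to the correct label for Words 1-7, copied from the table.
model_a = [0.9, 0.8, 0.7, 0.9, 0.9, 0.1, 0.9]
model_b = [0.8, 0.6, 0.6, 0.8, 0.8, 0.7, 0.8]

loss_a = average_nll(model_a)
loss_b = average_nll(model_b)

print(f"Model A: {loss_a:.4f}")  # prints "Model A: 0.4720"
print(f"Model B: {loss_b:.4f}")  # prints "Model B: 0.3244"
```

Because -ln(p) grows quickly as p approaches 0, a single low probability such as the 0.1 on Word 6 contributes about 2.30 on its own and can dominate the average, even when the other probabilities are high.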
Impact of Model Confidence on Training Loss