Calculating Prediction Loss
A language model is processing the input sequence 'The cat sat on the'. The correct next token is 'mat', and the model assigns it a probability of 0.25. Calculate the loss for this single-token prediction using the negative log-likelihood, with the natural logarithm (ln). Show your calculation and briefly explain what a lower loss value would signify about the model's prediction.
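One way to check the arithmetic is a short Python sketch. It assumes the loss for a single prediction is simply -ln(p), where p is the probability the model gives to the correct token. The function name `negative_log_likelihood` is a hypothetical helper chosen for illustration and is not taken from the course material.

```python
import math


def negative_log_likelihood(p_correct: float) -> float:
    """Loss for one prediction: -ln of the probability given to the correct token.

    Hypothetical helper for illustration; assumes 0 < p_correct <= 1.
    """
    if not 0.0 < p_correct <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log(p_correct)


# Probability assigned to 'mat' in the exercise.
loss = negative_log_likelihood(0.25)
print(f"{loss:.4f}")  # 1.3863
```

Since -ln(0.25) = ln(4), the loss is about 1.386. The function decreases as p grows and reaches 0 at p = 1, so a lower loss means the model placed more probability on the correct token.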
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.3 Prompting - Foundations of Large Language Models
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Selective Gradient Propagation for Sub-sequence Loss
A language model's performance on a single training sample is measured by calculating the negative logarithm of the probability it assigns to the correct target output sub-sequence, given an input sequence. Consider two models, Model A and Model B, being evaluated on the same sample. For this sample, Model A assigns a probability of 0.8 to the correct target sub-sequence, while Model B assigns a probability of 0.2. Based on this information, which statement correctly analyzes the models' performance on this specific sample?
Evaluating Model Performance on Different Samples