Learn Before
Comparing Model Confidence via Probability Normalization
Two language models, Model A and Model B, are tasked with predicting the next word for the same context. They both consider the same set of three candidate words. The unnormalized scores they produce are listed below. Analyze the outputs of both models. After normalizing the scores for each model to create a probability distribution over the candidate set, determine which model is more 'confident' in its top prediction and justify your answer based on the calculated probabilities.
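The comparison described above can be sketched in a few lines of Python. This is a minimal sketch, not a reference solution: it assumes the normalization is a softmax over the candidate set, and the scores for Model A and Model B are hypothetical placeholders chosen only for illustration, since the actual scores are not reproduced here. The names `softmax`, `model_a_scores`, and `model_b_scores` are likewise illustrative.

```python
import math


def softmax(scores):
    """Normalize unnormalized scores into probabilities that sum to 1."""
    # Subtracting the maximum score avoids overflow and does not change the result.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores over the same three candidate words (assumed values).
model_a_scores = [3.0, 1.0, 0.5]
model_b_scores = [1.2, 1.0, 0.9]

probs_a = softmax(model_a_scores)
probs_b = softmax(model_b_scores)

# "Confidence" is taken here to mean the probability of the top candidate.
top_a = max(probs_a)
top_b = max(probs_b)
more_confident = "Model A" if top_a > top_b else "Model B"

print("Model A probabilities:", [round(p, 3) for p in probs_a])
print("Model B probabilities:", [round(p, 3) for p in probs_b])
print("More confident in its top prediction:", more_confident)
```

With these placeholder scores, Model A concentrates most of its probability mass on one word, while Model B spreads it almost evenly, so Model A would be judged more confident. The same procedure applies to whatever scores the exercise supplies.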
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Conditional Probability Formula for Autoregressive Models using Softmax
A language model is predicting the next word in a sequence. After processing the context, it has assigned the following unnormalized scores to a set of four candidate words: 'mat' (score=6.0), 'rug' (score=3.0), 'floor' (score=0.5), and 'chair' (score=0.5). To convert these scores into a valid probability distribution over this set, what is the final probability assigned to the word 'mat'?
A language model is evaluating three candidate tokens (A, B, C) to follow a given context. Initially, their scores are: Token A = 4, Token B = 4, Token C = 2. If the score for Token C is increased to 12, while the scores for Token A and Token B remain unchanged, how does this affect the normalized probabilities of Token A and Token B?
Softmax Function
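
For reference, the softmax normalization that the related items above rely on can be written as follows. The symbols are chosen here for illustration: s_i is the unnormalized score of candidate i, n is the number of candidates, and p_i is the resulting probability.

```latex
p_i = \frac{\exp(s_i)}{\sum_{j=1}^{n} \exp(s_j)},
\qquad
\sum_{i=1}^{n} p_i = 1
```

Because every probability shares the same denominator, raising the score of one candidate lowers the probabilities of all the others, even when their own scores are unchanged.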