Learn Before
A language model has produced a vector of raw, unnormalized scores for all possible next words in its vocabulary. If a data scientist adds a constant value of 10 to every single score in this vector, the final probability assigned to each word will change.
0
1
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Output Probability Calculation in Transformer Language Models
A language model is tasked with predicting the next word for the sequence 'The cat sat on the'. After processing this input, the model's final linear layer produces a vector with 50,257 raw numerical scores, one for each word in its vocabulary. Which statement best characterizes this vector of raw scores, just before any final normalization function (like Softmax) is applied?
A language model has produced a vector of raw, unnormalized scores for all possible next words in its vocabulary. If a data scientist adds a constant value of 10 to every single score in this vector, the final probability assigned to each word will change.
Interpreting Model Output Scores