Calculating Next-Token Probability
An autoregressive model is generating a sequence of text. After processing the context, it must decide on the next token. The model has computed the following unnormalized scores (logits) for a small set of candidate tokens: {'apple': 2.0, 'banana': 3.0, 'cherry': 1.0}. Based on these scores, calculate the conditional probability for the token 'banana'. Show the main components of your calculation.
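The calculation the question asks for is the softmax over the three logits. A minimal Python sketch (the helper name `softmax` and the use of base-e exponentials are illustrative assumptions, not taken from the source):

```python
import math

def softmax(logits):
    """Convert unnormalized logits to a probability distribution."""
    exp_scores = {tok: math.exp(z) for tok, z in logits.items()}
    total = sum(exp_scores.values())
    return {tok: e / total for tok, e in exp_scores.items()}

logits = {'apple': 2.0, 'banana': 3.0, 'cherry': 1.0}
probs = softmax(logits)

# Main components: P(banana) = e^3 / (e^2 + e^3 + e^1)
#                            ≈ 20.09 / 30.19 ≈ 0.665
print(round(probs['banana'], 3))
```

The exponentiated score for 'banana' dominates the normalizing sum, so it receives roughly two thirds of the probability mass.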
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Token Sampling from a Conditional Probability Distribution
Calculating Next-Token Probability
An autoregressive model is generating a sequence and has computed the following unnormalized scores (logits) for three candidate next tokens: Token A (3.0), Token B (1.0), and Token C (0.0). If a constant value of 10.0 is added to each of these three logits before the final probability normalization step, how will the resulting conditional probabilities for the tokens be affected?
An autoregressive language model calculates unnormalized scores (logits) for a set of candidate next tokens. These scores are then transformed into a probability distribution. What is the primary reason for applying an exponential function to each logit before the final normalization step?
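The shift-invariance raised in the first related question can be checked numerically. A minimal sketch, assuming a plain list-based softmax (the helper defined here is for illustration only):

```python
import math

def softmax(logits):
    """Normalize a list of logits into probabilities."""
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

original = [3.0, 1.0, 0.0]          # Token A, Token B, Token C
shifted = [z + 10.0 for z in original]

# Adding a constant c multiplies every exp(z) by e^c; that common
# factor cancels in the normalization, so probabilities are unchanged.
for p, q in zip(softmax(original), softmax(shifted)):
    assert abs(p - q) < 1e-12
```

This same cancellation is why implementations subtract the maximum logit before exponentiating: it changes nothing mathematically but keeps `exp` from overflowing.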