Learn Before
A language model has processed the input sequence 'The sun is shining and the sky is' and must now predict the next word. It computes probabilities for several words in its vocabulary. Given the formula next_word = argmax_{word in Vocabulary} P(word | 'The sun is shining and the sky is') and the following probability outputs, which word will the model select?
- P('blue' | context) = 0.85
- P('green' | context) = 0.05
- P('running' | context) = 0.02
- P('0.85' | context) = 0.08
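The selection above can be sketched in a few lines of Python. This is a minimal illustration (not from the source): the probability outputs listed above are stored in a dictionary, and the argmax is taken over the vocabulary with the built-in `max` function.

```python
# Probability outputs from the question, keyed by candidate word.
# Note that '0.85' here is a vocabulary *token* (the string), not a probability.
probs = {
    "blue": 0.85,
    "green": 0.05,
    "running": 0.02,
    "0.85": 0.08,
}

# argmax over the vocabulary: return the word whose probability is highest.
next_word = max(probs, key=probs.get)
print(next_word)  # -> blue
```

Note that argmax returns the word itself ('blue'), not its probability value (0.85).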
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Iterative Application of Argmax for Next Token Prediction
A language model is predicting the next token for the sequence 'The weather is'. It calculates that the probability for the token 'sunny' is 0.78, which is the highest probability for any token in its vocabulary. The selection process is defined by the formula:
predicted_token = argmax_{token in Vocabulary} P(token | 'The weather is')
Based on this information, the output of the argmax operation is the numerical value 0.78.
Interpreting the Argmax Function in Token Selection
Left-to-Right Token Generation Process