In an accelerated text generation process, a sequence has been extended. The confirmed prefix is 'The cat sat on the', and two subsequent tokens, 'mat and', have just been accepted. The system now needs to generate the very next token. The underlying evaluation model provides the following probabilities for potential next tokens, given the full context 'The cat sat on the mat and':
P('looked') = 0.55, P('slept') = 0.25, P('waited') = 0.15, P('the') = 0.05
According to the principle of selecting the token with the highest probability from the evaluation model's distribution at this step, which token will be chosen next?
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Next Token Selection in an Accelerated Decoding Process
In a speculative decoding process, after a run of n draft tokens has been verified and accepted, the very next token (at position n+1) is generated by selecting the most likely token from the target (evaluation) model's probability distribution at that position, not the draft model's. Here, that token is 'looked', since P('looked') = 0.55 is the highest probability in the distribution.
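The greedy selection step described above can be sketched in a few lines. This is a minimal illustration, not an implementation of a full speculative decoding loop; the probability table is the one given in the question.

```python
# Target (evaluation) model's distribution at position n+1,
# given the context 'The cat sat on the mat and'.
probs = {"looked": 0.55, "slept": 0.25, "waited": 0.15, "the": 0.05}

# Greedy selection: pick the token with the highest probability.
next_token = max(probs, key=probs.get)
print(next_token)  # looked
```

With a real model, `probs` would be the softmax output over the full vocabulary at this step, but the argmax rule is the same.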