Evaluating a Decoding Strategy Proposal
A junior engineer proposes a new method for a language model to generate 20-token summaries. Their method guarantees finding the most probable sequence by scoring every possible 20-token sequence. The model has a vocabulary of 50,000 tokens. Briefly explain the primary computational challenge this proposal faces and why it is not a feasible strategy in practice.
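To get a sense of scale, the size of the search space can be computed directly. The sketch below is only illustrative: the function names and the throughput of one trillion sequences scored per second are hypothetical assumptions, not figures from the question. With a vocabulary of 50,000 tokens and a length of 20, the count is 50,000^20, roughly 9.5 × 10^93 sequences.

```python
def exhaustive_search_size(vocab_size: int, seq_len: int) -> int:
    """Number of distinct sequences when each position can hold any token."""
    return vocab_size ** seq_len


def years_to_enumerate(num_sequences: int, sequences_per_second: float) -> float:
    """Time to score every sequence at a fixed (assumed) throughput, in years."""
    seconds_per_year = 365 * 24 * 60 * 60
    return num_sequences / sequences_per_second / seconds_per_year


# Values taken from the question: 50,000-token vocabulary, 20-token output.
total = exhaustive_search_size(50_000, 20)

# Hypothetical throughput, chosen only for illustration.
assumed_rate = 1e12  # sequences scored per second

print(f"sequences to evaluate: {total:.3e}")
print(f"years at assumed rate: {years_to_enumerate(total, assumed_rate):.3e}")
```

Because the count grows exponentially with sequence length, each extra token multiplies the work by the full vocabulary size, so faster hardware does not change the conclusion.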
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Search Space Pruning in LLM Decoding
A language model with a vocabulary of 30,000 unique tokens is generating a response. If the model were to perform a complete, exhaustive search to find the absolute best possible 5-token sequence, which calculation represents the total number of unique sequences it would need to evaluate?
Decoding Strategy Post-Mortem