Learn Before
Search Scaling (Decoding Scaling)
Search scaling, or decoding scaling, is an inference-time compute scaling strategy that improves large language model performance by expanding the search performed during decoding, so that a better output sequence can be found without changing the model's parameters. This approach involves two primary dimensions: scaling the output length (increasing the number of generated tokens) and scaling the search space (broadening the set of candidate output sequences considered).
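As a rough illustration of the search-space dimension, the sketch below runs beam search over a tiny hand-written probability table. The table `TOY_MODEL`, the function `beam_search`, and the parameters `beam_width` and `max_len` are hypothetical names introduced only for this example; they stand in for a frozen language model and its decoder and are not taken from the course material. A width of 1 is greedy decoding, and widening the beam lets the search keep a prefix that looks worse at first but leads to a more probable sequence.

```python
import math

# Hypothetical toy "model": maps a prefix (tuple of tokens) to
# next-token probabilities. It stands in for a frozen language model.
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.5, "y": 0.5},
    ("b",): {"x": 0.9, "y": 0.1},
}


def beam_search(model, beam_width, max_len):
    """Return the best (sequence, log-probability) found by beam search.

    beam_width controls the size of the search space kept at each step;
    max_len caps the output length. Both are inference-time knobs.
    """
    beams = [((), 0.0)]  # start from the empty prefix with log-prob 0
    for _ in range(max_len):
        candidates = []
        for seq, logp in beams:
            dist = model.get(seq)
            if dist is None:
                # No continuation defined: carry the finished sequence over.
                candidates.append((seq, logp))
                continue
            for tok, p in dist.items():
                candidates.append((seq + (tok,), logp + math.log(p)))
        # Keep only the beam_width most probable candidates.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0]


# Width 1 is greedy decoding; width 2 considers more candidate sequences.
greedy_seq, greedy_logp = beam_search(TOY_MODEL, beam_width=1, max_len=2)
wide_seq, wide_logp = beam_search(TOY_MODEL, beam_width=2, max_len=2)

print(greedy_seq, round(math.exp(greedy_logp), 2))
print(wide_seq, round(math.exp(wide_logp), 2))
```

In this toy table, greedy decoding commits to the locally best first token and ends with a sequence of probability 0.3, while the wider beam finds one with probability 0.36. The cost is that each step scores more candidates, which is the compute side of the trade-off.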
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Context Scaling
Search Scaling (Decoding Scaling)
A company deploys a pre-trained language model for real-time translation. To improve translation quality, they implement a new system where for each input sentence, the model generates three different translation options. A separate, computationally intensive process then runs to score these options and select the best one before it is shown to the user. Which statement best evaluates the most significant trade-off of this new system?
Strategies for Enhancing Code Generation
A development team enhances a language model's summarization capabilities by increasing the number of training epochs and using a larger, more powerful set of GPUs for the training process. This strategy is a clear example of improving model performance by adding computational resources during the inference phase.
Output Ensembling
Generating and Verifying Thinking Paths
Learn After
Benefit of Search Space Expansion in Complex Decoding Tasks
Computational Costs of Search Scaling
Scaling Output Length in Search Scaling
Scaling the Search Space in Search Scaling
An engineer is using a fixed, pre-trained language model to generate a complex travel itinerary. The initial outputs are often functional but fail to find the most optimal route. The engineer cannot alter the model's internal parameters. Which of the following adjustments to the generation process is a direct application of search scaling to find a better itinerary?
Applying Search Scaling Strategies
Analyzing Trade-offs in Inference-Time Search Configuration
Implicit Search Scaling in Search Procedures