Learn Before
Context Scaling
Context scaling is an inference-time compute scaling method that improves large language model performance by extending the input, or context, provided to the model. By incorporating more helpful context at inference time, the model can condition its predictions on that additional information rather than on the query alone. Approaches to context scaling include extending the prompt with input-output examples (few-shot prompting), encouraging intermediate reasoning steps (chain-of-thought prompting), and dynamically incorporating external knowledge from a database (Retrieval-Augmented Generation).
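The three approaches above all work by enlarging the prompt before a single inference call. The following is a minimal Python sketch of how each prompt might be assembled; the `generate` function is a hypothetical stand-in for a real LLM call, not a specific API.

```python
def generate(prompt: str) -> str:
    """Hypothetical stand-in for an LLM inference call."""
    return f"<output conditioned on {len(prompt)} characters of context>"

# 1. Few-shot prompting: prepend input-output examples to the query.
def few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    demos = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{demos}\nInput: {query}\nOutput:"

# 2. Chain-of-thought prompting: ask for intermediate reasoning steps.
def chain_of_thought_prompt(query: str) -> str:
    return f"{query}\nLet's think step by step."

# 3. Retrieval-Augmented Generation: prepend passages fetched from a
#    knowledge source before asking the question.
def rag_prompt(retrieved_passages: list[str], query: str) -> str:
    context = "\n".join(retrieved_passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Each strategy spends extra inference-time compute on a longer prompt.
prompt = few_shot_prompt([("2+2", "4"), ("3+5", "8")], "7+6")
print(generate(prompt))
```

In every case the model weights are unchanged; only the context the model conditions on grows, which is what distinguishes context scaling from training-time scaling.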
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Context Scaling
Search Scaling (Decoding Scaling)
A company deploys a pre-trained language model for real-time translation. To improve translation quality, they implement a new system where for each input sentence, the model generates three different translation options. A separate, computationally intensive process then runs to score these options and select the best one before it is shown to the user. Which statement best evaluates the most significant trade-off of this new system?
Strategies for Enhancing Code Generation
A development team enhances a language model's summarization capabilities by increasing the number of training epochs and using a larger, more powerful set of GPUs for the training process. This strategy is a clear example of improving model performance by adding computational resources during the inference phase.
Output Ensembling
Generating and Verifying Thinking Paths
Learn After
Improving Narrative Coherence in AI-Generated Stories
A developer observes that a language model is generating summaries of long articles that lack detail and miss key points. To address this, they modify the inference process to provide the model with the full, unabridged article text instead of a shorter, pre-processed version. Which statement best analyzes why this modification is likely to improve the quality of the generated summary?
Evaluating Context Expansion for a Chatbot
Few-Shot Learning in Prompting
Chain-of-Thought (CoT) Prompting
Retrieval-Augmented Generation (RAG)