Learn Before
Evaluating Inference-Time Scaling Strategies
A research lab is developing a medical diagnosis assistant using a large language model and is weighing two inference-time strategies:
Strategy A: Use the single largest available model with a very long context window to process the entire patient history at once, aiming for the most comprehensive single analysis.
Strategy B: Use a smaller, faster model but run it multiple times with different prompts that focus on different aspects of the patient's history (e.g., symptoms, lab results, family history), then have a separate aggregation model synthesize the results.
Evaluate which strategy better embodies the broader definition of inference-time scaling, justifying your answer by referencing concepts like robustness and exploration.
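Strategy B can be made concrete with a short sketch. This is a minimal illustration, not a production system: `query_model` is a hypothetical stub standing in for a call to a small, fast LLM, and the aggregation step is simplified to a join rather than a separate synthesis model.

```python
from collections import OrderedDict

# Hypothetical stub for a call to a smaller, faster model.
# A real system would send the prompt to an LLM API here.
def query_model(prompt: str) -> str:
    canned = {
        "symptoms": "findings consistent with condition X",
        "lab results": "elevated markers suggesting condition X",
        "family history": "hereditary risk factors for condition Y",
    }
    for aspect, finding in canned.items():
        if aspect in prompt:
            return finding
    return "no finding"

def strategy_b(patient_history: str, aspects: list[str]) -> dict[str, str]:
    """Run the model once per aspect of the history (exploration),
    producing several focused partial analyses."""
    return {a: query_model(f"Focus on {a}: {patient_history}") for a in aspects}

def aggregate(findings: dict[str, str]) -> str:
    """Placeholder for the separate aggregation model that
    synthesizes the per-aspect findings into one assessment."""
    return "; ".join(f"{a}: {f}" for a, f in findings.items())

history = "55-year-old patient, full chart text here"
report = aggregate(strategy_b(history, ["symptoms", "lab results", "family history"]))
```

The point of the sketch is that each call explores a different slice of the input, and robustness comes from combining several independent partial views rather than trusting one monolithic pass.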
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Integration of Scaling Dimensions in Output Ensembling
A team of engineers is using a language model to generate code for a complex function. Instead of accepting the first output, they prompt the model five separate times with slight variations in the instructions and then use a voting system to select the most reliable and functional code snippet from the five generated options. Which dimension of inference-time performance is this strategy primarily designed to enhance?
Evaluating Inference-Time Scaling Strategies
Match each inference-time strategy with the primary dimension of performance it is designed to enhance, according to a broader definition of scaling.
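The output-ensembling idea in the related code-generation scenario (sample several candidates, then vote) can be sketched in a few lines. This is an illustrative majority-vote selector, assuming exact-match voting over the generated strings; real systems often vote on execution results or use a learned reranker instead.

```python
from collections import Counter

def majority_vote(candidates: list[str]) -> str:
    """Select the most frequently generated output among
    independently sampled candidates."""
    return Counter(candidates).most_common(1)[0][0]

# Five sampled generations for the same task; the correct variant recurs.
samples = [
    "def add(a, b): return a + b",
    "def add(a, b): return a + b",
    "def add(a, b): return a - b",  # an erroneous outlier
    "def add(a, b): return a + b",
    "def add(a, b): return b + a",
]
best = majority_vote(samples)
```

Voting filters out low-frequency failure modes: a single bad sample is outvoted, which is why this strategy primarily targets robustness rather than raw capability.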