Concept

Context Scaling

Context scaling is an inference-time compute scaling method that improves large language model performance by extending the context provided to the model. By incorporating more helpful information during inference, the model can condition its predictions on that prior context. Approaches to context scaling include extending the prompt with input-output examples (few-shot prompting), encouraging intermediate reasoning steps (chain-of-thought prompting), and dynamically incorporating external knowledge from a database (Retrieval-Augmented Generation).
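The three approaches above can be sketched as prompt construction: the question stays fixed while the context around it grows. The snippet below is a minimal, illustrative sketch, not a standard API; the function names (`build_prompt`, `retrieve`) and the toy keyword-overlap retriever are assumptions made for demonstration, standing in for a real embedding-based retriever and LLM call.

```python
# Toy few-shot examples prepended to every prompt (few-shot prompting).
FEW_SHOT_EXAMPLES = [
    ("Q: What is 2 + 2?", "A: 4"),
    ("Q: What is the capital of France?", "A: Paris"),
]

# Toy external knowledge base (stands in for a real document store).
KNOWLEDGE_BASE = [
    "The Eiffel Tower is located in Paris.",
    "Mount Everest is the highest mountain on Earth.",
]

def retrieve(question, docs, k=1):
    """Rank documents by word overlap with the question (toy RAG retriever)."""
    q_words = set(question.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(question, examples=FEW_SHOT_EXAMPLES, docs=KNOWLEDGE_BASE):
    """Extend the input with examples, retrieved context, and a CoT cue."""
    parts = []
    for q, a in examples:                 # few-shot prompting
        parts.append(f"{q}\n{a}")
    for doc in retrieve(question, docs):  # Retrieval-Augmented Generation
        parts.append(f"Context: {doc}")
    # Chain-of-thought cue encourages intermediate reasoning steps.
    parts.append(f"Q: {question}\nA: Let's think step by step.")
    return "\n\n".join(parts)

prompt = build_prompt("What is the highest mountain on Earth?")
print(prompt)
```

Each branch scales the context independently: more examples, more retrieved passages, or longer reasoning traces all trade extra inference-time compute for better-conditioned predictions, without changing model weights.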


Updated 2026-05-06


Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences