Multiple Choice

A large language model is tasked with processing a very long input document. To prepare for generating a response, it computes the Key-Value (KV) cache for the entire document in a single, large forward pass before any new tokens are produced. What is the most significant computational challenge or trade-off inherent to this 'all-at-once' approach?
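To build intuition for the trade-off the question targets, the sketch below estimates two costs of single-pass prefill: the KV-cache memory, which grows linearly with prompt length, and the attention FLOPs of the forward pass, which grow quadratically. The model configuration (32 layers, 32 heads, head dimension 128, fp16 values) is a hypothetical example, not taken from any specific model.

```python
# Sketch: costs of computing the full KV cache in one prefill pass.
# All numbers are illustrative; the config below is a hypothetical
# ~7B-parameter-scale model, not any specific LLM.

def kv_cache_bytes(seq_len, n_layers, n_heads, head_dim, bytes_per_val=2):
    """Memory for the K and V tensors across all layers and heads
    (factor of 2 for K plus V; bytes_per_val=2 assumes fp16)."""
    return 2 * seq_len * n_layers * n_heads * head_dim * bytes_per_val

def prefill_attention_flops(seq_len, n_layers, n_heads, head_dim):
    """Rough FLOPs for the QK^T and attention-output matmuls during
    prefill: each is a (seq_len x head_dim) by (head_dim x seq_len)
    product per head per layer, so cost is quadratic in seq_len."""
    return n_layers * n_heads * 2 * (2 * seq_len * seq_len * head_dim)

cfg = dict(n_layers=32, n_heads=32, head_dim=128)

for n in (4_096, 32_768, 131_072):
    mem_gib = kv_cache_bytes(n, **cfg) / 2**30
    flops = prefill_attention_flops(n, **cfg)
    print(f"{n:>7} tokens: KV cache ~ {mem_gib:5.1f} GiB, "
          f"attention FLOPs ~ {flops:.2e}")
```

Doubling the prompt length doubles the cache footprint but quadruples the attention compute, which is why very long single-pass prefills are dominated by memory capacity and a long latency before the first output token.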


Updated 2025-09-29


Tags: Ch.5 Inference - Foundations of Large Language Models, Foundations of Large Language Models, Foundations of Large Language Models Course, Computing Sciences, Analysis in Bloom's Taxonomy, Cognitive Psychology, Psychology, Social Science, Empirical Science, Science