Learn Before
Multiple Choice

A developer is designing a language model for summarizing very long legal documents, where details mentioned at the beginning can be crucial for the overall summary. To manage memory usage on a constrained hardware setup, the developer implements a sliding-window self-attention mechanism in which each new token attends only to the preceding 1,024 tokens. What is the most significant trade-off for this specific application?
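The mechanism described above can be sketched as a causal sliding-window attention mask. This is a minimal illustration, not the developer's actual implementation: the window is shrunk from 1,024 to 4 tokens so the effect is visible at a glance, and the function name is hypothetical.

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """mask[i][j] is True if query token i may attend to key token j.

    Causal sliding window: token i sees only tokens j with
    i - window < j <= i (itself and up to window-1 predecessors).
    """
    return [
        [i - window < j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]

# Toy window of 4 standing in for the question's 1,024-token window.
mask = sliding_window_mask(seq_len=8, window=4)

# Token 7 can still attend to recent tokens...
assert mask[7][6] is True
# ...but tokens 0-3 have slid out of its window entirely.
assert mask[7][0] is False
```

Each row of the mask has at most `window` True entries, so the key/value cache stays bounded regardless of document length. The trade-off the question probes follows directly: once the window slides past the opening clauses of a long legal document, the model can no longer attend to them when generating the summary.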

Updated 2025-09-26


Tags

Ch.5 Inference - Foundations of Large Language Models
