Multiple Choice

A team is deploying a large language model to generate chapter-length summaries of scientific papers. They observe that the time required to generate a summary grows dramatically with the length of the input paper, and the process often fails with "out of memory" errors on their hardware, even when processing one paper at a time. Which component of the model's architecture is the most direct cause of this specific performance scaling issue?
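The scaling behavior described in the question is characteristic of self-attention: a naive implementation materializes a score matrix of size seq_len × seq_len per head, so memory grows quadratically with input length. A back-of-the-envelope sketch (toy sizes, fp16 assumed; the function name is illustrative, not from any library):

```python
def score_matrix_bytes(seq_len: int, bytes_per_elem: int = 2) -> int:
    # Naive self-attention materializes a seq_len x seq_len score matrix
    # per head; memory therefore scales as O(seq_len**2).
    # bytes_per_elem=2 assumes fp16 activations.
    return seq_len * seq_len * bytes_per_elem

for n in (1_024, 8_192, 65_536):
    gib = score_matrix_bytes(n) / 2**30
    print(f"seq_len={n:>6}: {gib:.3f} GiB per head")
```

At 65,536 tokens the single-head score matrix alone reaches 8 GiB in fp16, which illustrates why chapter-length inputs can exhaust memory even at batch size one.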

Updated 2025-10-02

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science