Multiple Choice

An engineer is debugging an autoregressive language model and observes that as it generates longer sequences, its output progressively loses connection to the initial context. The engineer suspects a flaw in how the attention mechanism utilizes the Key-Value (KV) cache during each generation step. Based on the process where a new query attends to the full, updated cache, which of the following errors is the most probable cause for this specific type of performance degradation?
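The decoding step the question refers to can be sketched as follows. This is a minimal illustration, assuming a single attention head, plain Python lists, and no learned projections; the names `KVCache`, `attend`, and `decode_step` are hypothetical and are not taken from the course material.

```python
import math


def attend(query, keys, values):
    """Scaled dot-product attention of one query over every cached key/value."""
    d = len(query)
    # One score per cached position, including the earliest context tokens.
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    # Numerically stable softmax over all positions.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the cached values.
    dim = len(values[0])
    output = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return output, weights


class KVCache:
    """Hypothetical cache that only ever grows: one key and one value per token."""

    def __init__(self):
        self.keys = []
        self.values = []

    def append(self, key, value):
        self.keys.append(key)
        self.values.append(value)


def decode_step(cache, query, key, value):
    """Append the new token's key/value first, then attend over the full cache."""
    cache.append(key, value)
    return attend(query, cache.keys, cache.values)
```

In this sketch the attention weights returned at step `t` have one entry for each of the `t + 1` cached positions, so the initial context stays reachable at every step as long as the whole cache is passed to `attend`.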

Updated 2025-10-10

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science