Learn Before
In an autoregressive Transformer model, generating a sequence in response to an input prompt involves two distinct phases from the perspective of the Key-Value (KV) cache. Which option correctly distinguishes the computational activities of these two phases?
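The distinction the question is testing can be made concrete with a minimal NumPy sketch (not a real Transformer; the weight matrices and shapes here are illustrative assumptions). During prefill, keys and values for every prompt token are computed in one parallel pass and written into the cache; during decode, each step computes the query, key, and value for a single new token, appends exactly one row to the cache, and attends over all cached positions.

```python
import numpy as np

def attention(q, K, V):
    # Scaled dot-product attention of one query over all cached positions.
    scores = q @ K.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V

d = 4                                   # toy hidden size (assumption)
rng = np.random.default_rng(0)
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))

prompt = rng.normal(size=(5, d))        # embeddings of 5 prompt tokens

# --- Prefill phase: process the whole prompt in parallel, filling the cache ---
K_cache = prompt @ W_k                  # keys for all prompt positions at once
V_cache = prompt @ W_v                  # values for all prompt positions at once

# --- Decode phase: generate one token at a time, appending one K/V row each ---
x = rng.normal(size=(1, d))             # embedding of the last generated token
for _ in range(3):
    q = x @ W_q                         # query only for the new token
    k, v = x @ W_k, x @ W_v
    K_cache = np.vstack([K_cache, k])   # cache grows by exactly one row
    V_cache = np.vstack([V_cache, v])
    x = attention(q, K_cache, V_cache)  # attend to all cached positions

print(K_cache.shape)                    # prompt rows + one row per decode step
```

The key contrast: prefill is one large, parallel matrix multiply over the full prompt, while each decode step does a small amount of compute but must read the entire (growing) cache.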
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Prefilling Phase in Transformer Inference
Computational Cost Comparison: Decoding vs. Prefilling
Decoding Phase in Transformer Inference
Analysis of KV Cache Utilization in Autoregressive Generation
An autoregressive language model receives an input prompt and generates a response. From the perspective of how it uses its internal memory for past context (the Key-Value cache), arrange the following high-level stages of the generation process in the correct chronological order.