Sequence Ordering

An LLM inference system uses a prefix cache with a fixed capacity. The cache is currently full. A new user request arrives with a prefix that is not present in the cache (a "cache miss"). To make space for this new prefix, the system must evict an existing prefix according to the Least Recently Used (LRU) policy. Arrange the following actions in the correct chronological order.
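The eviction flow the question describes maps naturally onto an ordered map keyed by prefix. Below is a minimal, hypothetical sketch in Python built on `collections.OrderedDict`; the `PrefixCache` class, its `capacity` parameter, and the token-tuple keys are illustrative assumptions, not part of any specific inference system.

```python
from collections import OrderedDict


class PrefixCache:
    """Minimal LRU prefix cache sketch (illustrative, not a real system)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        # OrderedDict preserves insertion order; entries are moved to the
        # end on every hit, so the least recently used entry sits at the front.
        self._entries: "OrderedDict[tuple, object]" = OrderedDict()

    def lookup(self, prefix: tuple):
        """Return the cached KV state for `prefix`, or None on a miss."""
        if prefix in self._entries:
            self._entries.move_to_end(prefix)  # mark as most recently used
            return self._entries[prefix]
        return None

    def insert(self, prefix: tuple, kv_state: object) -> None:
        """Insert a new prefix, evicting the LRU entry if the cache is full."""
        if prefix in self._entries:
            self._entries.move_to_end(prefix)
            self._entries[prefix] = kv_state
            return
        if len(self._entries) >= self.capacity:
            # Evict the least recently used entry (front of the OrderedDict).
            self._entries.popitem(last=False)
        self._entries[prefix] = kv_state


# Chronological order of events on a cache miss with a full cache:
cache = PrefixCache(capacity=2)
cache.insert(("a",), "kv_a")            # fills slot 1
cache.insert(("b",), "kv_b")            # fills slot 2; cache is now full
assert cache.lookup(("c",)) is None     # 1. new request arrives -> cache miss
cache.insert(("c",), "kv_c")            # 2. LRU entry ("a",) is evicted
                                        # 3. new prefix is inserted
assert cache.lookup(("a",)) is None     # the evicted entry is gone
assert cache.lookup(("b",)) is not None # the more recently used entry survives
```

The usage block at the bottom traces the sequence the question asks about: the miss is detected first, the LRU victim is identified and evicted, and only then is the new prefix inserted and marked most recently used.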

