Multiple Choice

An inference system for a large language model uses a cache for text prefixes to speed up processing. The cache has a capacity of 3 slots and uses a Least Recently Used (LRU) eviction policy. The cache is currently full, and its state, from most recently used to least recently used, is as follows:

  1. Prefix A: "The capital of France is"
  2. Prefix B: "Translate the following sentence to German:"
  3. Prefix C: "Once upon a time in a land far away,"

Now, a new user request arrives with the prompt: "The capital of France is Paris." This request is a 'hit' for Prefix A. Immediately after, another request arrives with a new, uncached prefix: "Summarize the main points of the article below:". To store this new prefix, one of the existing prefixes must be evicted. Which prefix will be removed from the cache?
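The LRU behavior described above can be sketched with a minimal cache built on Python's `collections.OrderedDict` (the class name `LRUCache` and the key labels are illustrative, not part of the question):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: a hit refreshes an entry's recency;
    inserting into a full cache evicts the least recently used entry."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest (LRU) first, newest (MRU) last

    def access(self, key, value=None):
        if key in self.entries:                  # cache hit: refresh recency
            self.entries.move_to_end(key)
            return self.entries[key]
        if len(self.entries) >= self.capacity:   # full: evict the LRU entry
            self.entries.popitem(last=False)
        self.entries[key] = value
        return value

cache = LRUCache(3)
# Rebuild the state from the question, least recently used first:
cache.access("C", "Once upon a time in a land far away,")
cache.access("B", "Translate the following sentence to German:")
cache.access("A", "The capital of France is")

cache.access("A")  # hit on Prefix A: A stays the most recently used
cache.access("D", "Summarize the main points of the article below:")

print(list(cache.entries))  # → ['B', 'A', 'D']: Prefix C was evicted
```

Because the hit on Prefix A only refreshes A's recency, Prefix C remains the least recently used entry and is the one evicted when the new prefix arrives.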


Updated 2025-09-28


Tags

Ch.5 Inference - Foundations of Large Language Models

Application in Bloom's Taxonomy
