Characterizing a Retrieval-Based Probability Distribution
A language model uses a retrieval mechanism to improve its predictions. To predict the next word for the prefix 'The cat chased the mouse under the...', the model finds the 5 most similar contexts from a large text collection. The words that followed these 5 contexts were: sofa, chair, sofa, sofa, table. Based only on this set of 5 retrieved words, describe the key characteristics of the new probability distribution that would be formed over the vocabulary. Specifically, which words would have high, low, and zero probability?
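One way to make the question concrete is to treat the 5 retrieved words as an empirical sample and normalize their counts. The sketch below assumes plain relative-frequency weighting (every neighbor counts equally, with no distance weighting) and uses a hypothetical five-word vocabulary; the function name `retrieval_distribution` is illustrative only and does not come from the source.

```python
from collections import Counter

def retrieval_distribution(retrieved_words, vocabulary):
    """Relative-frequency distribution over `vocabulary` built only from the retrieved words.

    Assumption: each retrieved neighbor contributes equal weight.
    """
    counts = Counter(retrieved_words)
    total = len(retrieved_words)
    # Counter returns 0 for words that were never retrieved,
    # so those words receive probability exactly 0.
    return {word: counts[word] / total for word in vocabulary}

retrieved = ["sofa", "chair", "sofa", "sofa", "table"]
vocab = ["sofa", "chair", "table", "bed", "rug"]  # hypothetical tiny vocabulary

dist = retrieval_distribution(retrieved, vocab)
for word, prob in dist.items():
    print(f"{word}: {prob:.2f}")
```

Under this assumption, "sofa" receives the highest probability (3 of 5 neighbors), "chair" and "table" each receive a lower probability (1 of 5), and every vocabulary word that was not retrieved receives zero. The resulting distribution is sparse and concentrated on the retrieved words.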
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Aggregated Distance Calculation for k-NN Vocabulary Distribution
Linear Interpolation of k-NN and LLM Distributions
Characterizing a Retrieval-Based Probability Distribution
A k-Nearest Neighbors Language Model (k-NN LM) is generating text and needs to predict the next token. It queries its datastore and retrieves the 5 nearest reference tokens, along with their corresponding distances: {"river": 0.1}, {"stream": 0.2}, {"river": 0.3}, {"ocean": 0.8}, {"river": 0.9}. How are these retrieved tokens and their distances used to construct a new probability distribution over the model's vocabulary?
Evaluating a k-NN LM's Intermediate Output
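
For the related k-NN LM question above (neighbors "river", "stream", and "ocean" with distances), a minimal sketch of one common construction follows. It assumes the weights are a softmax over negative distances, summed per token; the page itself does not state the formula, so this weighting, the `temperature` parameter, and the function name `knn_distribution` are assumptions for illustration.

```python
import math
from collections import defaultdict

def knn_distribution(neighbors, temperature=1.0):
    """Aggregate (token, distance) pairs into a probability distribution.

    Assumption: each neighbor is weighted by exp(-distance / temperature),
    and weights for repeated tokens are summed before normalizing.
    """
    weights = [(token, math.exp(-distance / temperature)) for token, distance in neighbors]
    normalizer = sum(weight for _, weight in weights)
    dist = defaultdict(float)
    for token, weight in weights:
        dist[token] += weight / normalizer
    return dict(dist)

neighbors = [
    ("river", 0.1),
    ("stream", 0.2),
    ("river", 0.3),
    ("ocean", 0.8),
    ("river", 0.9),
]

knn_dist = knn_distribution(neighbors)
for token, prob in sorted(knn_dist.items(), key=lambda item: -item[1]):
    print(f"{token}: {prob:.3f}")
```

With these inputs, "river" gets the most mass because it appears three times and includes the closest neighbor, "stream" comes next, and "ocean" gets the least. Tokens that were not retrieved do not appear in the result, which corresponds to zero probability under the retrieval distribution alone.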