Aggregated Distance Calculation for k-NN Vocabulary Distribution
In determining the retrieval-based distribution for a $k$-nearest neighbors ($k$-NN) language model, a distance metric, $d_j(\cdot)$, is defined relative to the vocabulary, $V$. For a query's hidden state, $\mathbf{h}$, and a retrieved datastore key-value pair, $(\mathbf{k}, v)$, the value $d_j(\mathbf{h}, (\mathbf{k}, v))$ equals the distance between $\mathbf{h}$ and $\mathbf{k}$ if the token $v$ corresponds to the $j$-th entry of the vocabulary $V$. If $v$ does not match the $j$-th entry, $d_j(\mathbf{h}, (\mathbf{k}, v))$ is set to $0$.
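A minimal Python sketch of this definition, assuming a Euclidean distance between the query hidden state and each datastore key; the function and variable names (per_entry_distance, aggregated_distance) are illustrative, not from the card:

```python
import numpy as np

def per_entry_distance(h, key, value, j, vocab):
    """d_j(h, (k, v)): the distance between the query hidden state h and the
    stored key if the stored token matches the j-th vocabulary entry, and 0
    otherwise, following the convention stated above."""
    if value == vocab[j]:
        return float(np.linalg.norm(h - key))  # assumed Euclidean distance metric
    return 0.0

def aggregated_distance(h, neighbors, j, vocab):
    """Aggregated distance for vocabulary entry j: the sum of the per-neighbor
    values over all retrieved key-value pairs (an illustrative aggregation)."""
    return sum(per_entry_distance(h, k, v, j, vocab) for k, v in neighbors)
```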
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Linear Interpolation of k-NN and LLM Distributions
Characterizing a Retrieval-Based Probability Distribution
A k-Nearest Neighbors Language Model (k-NN LM) is generating text and needs to predict the next token. It queries its datastore and retrieves the 5 nearest reference tokens, along with their corresponding distances: {"river": 0.1}, {"stream": 0.2}, {"river": 0.3}, {"ocean": 0.8}, {"river": 0.9}. How are these retrieved tokens and their distances used to construct a new probability distribution over the model's vocabulary?
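One common way to build the distribution from these neighbors, used in the original kNN-LM formulation (Khandelwal et al.) and shown here only as an illustrative sketch rather than this card's exact answer, is to sum $\exp(-d)$ over neighbors that share the same token and then normalize:

```python
import math
from collections import defaultdict

# Retrieved neighbors from the question: (token, distance to the query hidden state)
neighbors = [("river", 0.1), ("stream", 0.2), ("river", 0.3), ("ocean", 0.8), ("river", 0.9)]

# Convert each distance to a score exp(-d) and accumulate per token, so tokens
# retrieved multiple times (like "river") aggregate their evidence.
scores = defaultdict(float)
for token, dist in neighbors:
    scores[token] += math.exp(-dist)

# Normalize into a probability distribution over the retrieved tokens; all other
# vocabulary entries receive probability 0 under this formulation.
total = sum(scores.values())
p_knn = {token: s / total for token, s in scores.items()}
print(p_knn)  # "river" gets the largest share: it is both closest and most frequent
```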
Evaluating a k-NN LM's Intermediate Output
Learn After
Softmax-based k-NN Probability Distribution
Calculating Aggregated Distances from Nearest Neighbors
A k-NN Language Model retrieves the 4 nearest neighbors from its datastore for a given query hidden state. The retrieved neighbors, their corresponding token values, and their distances to the query are listed below:
- Neighbor 1: Value = 'cat', Distance = 0.2
- Neighbor 2: Value = 'dog', Distance = 0.3
- Neighbor 3: Value = 'cat', Distance = 0.5
- Neighbor 4: Value = 'fish', Distance = 0.6
Based on this information, what is the aggregated distance, $d_j$, for the vocabulary token 'cat'?
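A quick sketch of the arithmetic for this example, assuming the convention above (sum the distances of the neighbors whose value matches the token, with non-matching neighbors contributing 0); the variable names are illustrative:

```python
neighbors = [("cat", 0.2), ("dog", 0.3), ("cat", 0.5), ("fish", 0.6)]

# Aggregated distance for 'cat': sum the distances of neighbors whose value is 'cat'.
d_cat = sum(dist for value, dist in neighbors if value == "cat")
print(round(d_cat, 2))  # 0.7, i.e. 0.2 + 0.5 from the two 'cat' neighbors
```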
Analyzing Prediction Outcomes via Neighbor Distances