Analyzing Prediction Outcomes via Neighbor Distances
A language model retrieves the 5 nearest neighbors from its datastore to help predict the next word. The retrieved items, each consisting of a token value and its distance from the current context, are listed below. Based on this information, explain why the model's final prediction is more likely to be 'ocean' than 'sea'. Your explanation must focus on how the aggregated distance for each of these two words is determined from the neighbors, assuming the aggregated distance for a given word is the minimum distance found among the neighbors with that word as their value.
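Since the neighbor list itself is not reproduced here, the sketch below uses hypothetical token values and distances (illustrative assumptions, not the original data) to show how minimum-distance aggregation can make 'ocean' win over 'sea':

```python
import math

# Hypothetical retrieved neighbors: (token value, distance to the query).
# These five entries are assumed for illustration only.
neighbors = [
    ("ocean", 0.1),
    ("sea", 0.4),
    ("ocean", 0.7),
    ("sea", 0.5),
    ("lake", 0.9),
]

# Aggregated distance per word = minimum distance among the neighbors
# whose value is that word.
agg = {}
for value, dist in neighbors:
    agg[value] = min(dist, agg.get(value, float("inf")))

# A smaller aggregated distance means a closer match, so the word with
# the smallest minimum ('ocean' here, 0.1 vs. 0.4) is the likelier
# prediction; a softmax over negative distances makes this concrete.
scores = {w: math.exp(-d) for w, d in agg.items()}
total = sum(scores.values())
probs = {w: s / total for w, s in scores.items()}
print(agg)                           # {'ocean': 0.1, 'sea': 0.4, 'lake': 0.9}
print(probs["ocean"] > probs["sea"])  # True
```

The key point is that a single very close neighbor is enough: even if 'sea' appeared on more neighbors, 'ocean' wins as long as its nearest occurrence is the closest one.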
Calculating Aggregated Distances from Nearest Neighbors
A k-NN Language Model retrieves the 4 nearest neighbors from its datastore for a given query hidden state. The retrieved neighbors, their corresponding token values, and their distances to the query are listed below:
- Neighbor 1: Value = 'cat', Distance = 0.2
- Neighbor 2: Value = 'dog', Distance = 0.3
- Neighbor 3: Value = 'cat', Distance = 0.5
- Neighbor 4: Value = 'fish', Distance = 0.6
Based on this information, what is the aggregated distance for the vocabulary token 'cat'?
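Under the minimum-distance aggregation rule stated above, 'cat' appears on two neighbors with distances 0.2 and 0.5, so its aggregated distance is min(0.2, 0.5) = 0.2. A minimal sketch of the computation:

```python
# Retrieved neighbors from the question: (token value, distance to query).
neighbors = [
    ("cat", 0.2),
    ("dog", 0.3),
    ("cat", 0.5),
    ("fish", 0.6),
]

# Aggregated distance per token = minimum distance among the neighbors
# carrying that token value.
agg = {}
for value, dist in neighbors:
    agg[value] = min(dist, agg.get(value, float("inf")))

print(agg["cat"])  # 0.2 (the smaller of 0.2 and 0.5)
```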