Learn Before
Retrieving Reference Tokens in k-NN LM Inference
During inference in a k-nearest neighbors (k-NN) language model, the process begins with the model's hidden state representation for a given prefix, denoted as h. This representation is used to search the datastore for the k closest matching data items, which take the form of key-value tuples (k_i, v_i). The retrieved values v_i serve as reference tokens, guiding the model's prediction of the subsequent token based on the prefix representation h.
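The retrieval step above can be sketched in a few lines. This is a minimal illustration, not the full kNN-LM implementation: it assumes a small in-memory datastore of context representations (keys) paired with their following tokens (values), and uses squared L2 distance as the similarity measure; real systems use an approximate-nearest-neighbor index over millions of entries. All names (`knn_retrieve`, the toy keys and values) are hypothetical.

```python
import numpy as np

def knn_retrieve(h, keys, values, k=3):
    """Retrieve the k reference tokens whose stored context
    representations (keys k_i) are closest to the query h.

    h:      (d,) hidden-state vector for the current prefix
    keys:   (N, d) array of stored context representations
    values: length-N list of the tokens v_i that followed each context
    """
    # Squared L2 distance from the query to every stored key
    dists = np.sum((keys - h) ** 2, axis=1)
    # Indices of the k nearest keys
    nearest = np.argsort(dists)[:k]
    return [values[i] for i in nearest], dists[nearest]

# Toy datastore: four stored contexts with 2-d representations
keys = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
values = ["discovery", "advance", "finding", "banana"]

# Query with the hidden state of the current prefix
tokens, dists = knn_retrieve(np.array([0.1, 0.1]), keys, values, k=3)
print(tokens)  # the k reference tokens nearest to the query
```

Note that the model never compares tokens directly: selection is driven entirely by distance between the query vector and the stored keys, which is why semantically related continuations tend to be retrieved together.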
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Retrieving Reference Tokens in k-NN LM Inference
A language model architecture is designed to predict the next token by using two parallel computational streams that originate from the same query vector. The first stream uses the immediate, local context to generate a probability distribution over the vocabulary. The second stream uses the query vector to search a large external datastore, find the most similar historical contexts, and generate a second probability distribution based on the tokens that followed those contexts. The two distributions are then combined to produce the final prediction. What is the primary functional distinction between the information provided by these two streams?
Visual Representation of k-NN Language Model Inference
Diagnosing an Error in a Hybrid Language Model
A language model architecture enhances its predictions by combining information from its immediate context with knowledge from a large external repository. Arrange the following steps to accurately describe the data flow during its inference process.
Learn After
Using Reference Tokens to Define a Vocabulary Distribution in k-NN LM
Role of Internal State in Datastore Search
A language model enhanced with a nearest-neighbor mechanism needs to find relevant information from its external datastore to help predict the next word. Arrange the following steps in the correct chronological order to describe how the model retrieves this information.
A language model enhanced with a nearest-neighbor search mechanism is generating text. The model's current internal state, representing the prefix 'The scientist made a groundbreaking...', is used as a query to search an external datastore. The datastore contains pairs of (context representation, associated word). If the search retrieves the three words 'discovery', 'advance', and 'finding' as reference tokens, which statement most accurately describes how these specific words were selected?