Populating a k-NN Datastore for Language Modeling
A crucial step in using a k-NN-based memory model is deciding what key-value pairs to store in its datastore. For standard language modeling, a common strategy is to use the sequence's history as the context, meaning the key and value vectors of all previously processed tokens are added to the datastore.
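The sequential population strategy above can be sketched in a few lines. This is a minimal toy illustration, not a production implementation: the `populate_datastore` and `knn_lookup` helper names are hypothetical, and for simplicity the key and value for each token are both taken to be its context vector.

```python
import numpy as np

def populate_datastore(hidden_states):
    """Sequentially populate a k-NN datastore from a sequence's history.

    `hidden_states[t]` is the context vector produced after processing
    token t. In this toy sketch the key and value for each token are
    both that context vector itself.
    """
    keys, values = [], []
    for h in hidden_states:
        keys.append(np.asarray(h, dtype=float))
        values.append(np.asarray(h, dtype=float))
    return np.stack(keys), np.stack(values)

def knn_lookup(query, keys, values, k=2):
    """Retrieve the k stored values whose keys are closest to the query."""
    dists = np.linalg.norm(keys - query, axis=1)  # Euclidean distance to each key
    nearest = np.argsort(dists)[:k]               # indices of the k closest keys
    return values[nearest]

# Toy example: a 4-token sequence with 3-dimensional context vectors.
states = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
K, V = populate_datastore(states)
neighbors = knn_lookup(np.array([1.0, 0.1, 0.0]), K, V, k=2)
```

In practice the keys and values would come from a trained model's attention layers and the search would use an approximate nearest-neighbor index, but the population logic is the same: each processed token contributes one entry to the datastore.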
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
k-NN Memory Retrieval
Integrating k-NN Memory with Local Memory in Attention
Populating a k-NN Datastore for Language Modeling
Equivalence Between k-NN and Sparse Attention Models
k-NN Language Modeling (k-NN LM)
Vector Database
A language model is designed to be a question-answering assistant for a large corporate knowledge base containing thousands of separate project documents. A user asks a question about 'Project Alpha,' but the most relevant technical detail needed to answer it is located in a document for 'Project Zeta,' a completely unrelated past project. Which statement best explains the unique advantage of using a k-nearest neighbors (k-NN) based external memory system in this scenario?
Analyzing Long-Range Consistency in Language Models
In a k-NN-based external memory system, the datastore of key-value pairs is limited to representing only the context states from the current, single sequence being processed.
Learn After
Extending k-NN Datastore Context with a Training Dataset
A language model equipped with a k-NN-based memory is processing the sentence 'The quick brown fox'. The model processes the sentence one word at a time, from left to right. When the model is about to predict the next word after 'brown', which of the following best describes the contents of its memory datastore according to the standard sequential population strategy?
Implications of Sequential Datastore Population
A language model is equipped with a k-NN-based memory that is populated using the standard strategy of storing the history of the current sequence. The model begins processing a new sequence with an empty datastore. Arrange the following events in the correct chronological order as the model processes the first two tokens.