Learn Before
Next-Token Prediction with External Memory
A language model is processing the sentence: 'The marine biologist cataloged the species found near the hydrothermal ___'. The model's internal representation for this context is used to search a large datastore of previously encountered contexts and their corresponding next words. The search returns the following top three most similar contexts and the words that followed them:
- Context: '...deep-sea submersible approached the volcanic ___' -> Next Word: 'vent'
- Context: '...unusual life forms thrive around the sulfuric ___' -> Next Word: 'vent'
- Context: '...the camera captured footage of the abyssal ___' -> Next Word: 'plain'
Based on its initial training alone, the model's most likely predictions were 'reef', 'area', and 'zone'. How will the information retrieved from the datastore influence the model's final prediction, and what is the underlying principle for this adjustment?
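For concreteness, the adjustment this question points to is usually an interpolation of two distributions: a k-NN distribution built from the retrieved next words (weighted by how similar each stored context is to the current one) and the model's own parametric distribution, i.e. p(w) = λ·p_kNN(w) + (1−λ)·p_LM(w). The sketch below is illustrative only: the probabilities, distances, and λ = 0.3 are hypothetical values chosen to mirror the scenario above, not data from this card.

```python
import math
from collections import defaultdict

def knn_lm_interpolate(base_probs, neighbors, lam=0.3, temperature=1.0):
    """Blend a base LM distribution with a k-NN distribution.

    base_probs: dict mapping word -> probability from the parametric model
    neighbors:  list of (distance, next_word) pairs retrieved from the datastore
    lam:        weight on the k-NN distribution (hypothetical value)
    """
    # Convert distances into a softmax over the retrieved next words:
    # closer contexts contribute more probability mass.
    weights = [math.exp(-d / temperature) for d, _ in neighbors]
    total = sum(weights)
    knn_probs = defaultdict(float)
    for w, (_, word) in zip(weights, neighbors):
        knn_probs[word] += w / total

    # Final prediction: p(w) = lam * p_knn(w) + (1 - lam) * p_lm(w)
    vocab = set(base_probs) | set(knn_probs)
    return {w: lam * knn_probs[w] + (1 - lam) * base_probs.get(w, 0.0)
            for w in vocab}

# Illustrative numbers only: the base model favored 'reef'/'area'/'zone',
# while two of the three retrieved neighbors were followed by 'vent'.
base = {"reef": 0.30, "area": 0.25, "zone": 0.20, "vent": 0.01}
retrieved = [(0.8, "vent"), (1.1, "vent"), (1.5, "plain")]
final = knn_lm_interpolate(base, retrieved, lam=0.3)
print(sorted(final.items(), key=lambda kv: -kv[1])[:3])
# With these numbers, 'vent' gains enough mass from the k-NN component
# to overtake the base model's original top guesses.
```

The design point the question is probing: the datastore lets the model override its parametric prior with evidence from highly similar contexts, without any retraining.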
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Inference Architecture of k-NN Language Models
Next-Token Prediction with External Memory
A language model is enhanced by searching a large datastore of past internal states and their corresponding next words. When the model generates a new word, it finds the 'k' most similar past states from the datastore and uses their associated next words to adjust its prediction. What is the key principle that makes this technique effective?
Foundational Principle of k-NN Language Modeling
A language model is designed to improve its next-word predictions by consulting a large external database of past contexts. Arrange the following steps to accurately describe how this model generates its final output after receiving an input.
A language model is designed to enhance its next-token prediction by referencing a large external datastore of context representations and their corresponding subsequent tokens. During generation, for a given input, the model identifies the 'k' most similar context representations from this datastore. Which of the following best describes how this information is integrated to produce the final prediction?
You're on-call for an internal engineering assista...
You are reviewing two proposed designs for an inte...
Your team is building an internal "Release Notes Q...
You're designing an internal LLM assistant for a c...
Design Review: Choosing Between RAG and k-NN LM for a Regulated Support Assistant
Post-Incident Analysis: Why a RAG Assistant Hallucinated Despite "Having the Docs"
Architecture Decision Memo: Unifying Vector-DB RAG and k-NN LM for a Global Policy Assistant
Case Study: Root-Cause Analysis of "Correct Source, Wrong Answer" in a RAG + k-NN LM Assistant
Case Study: Debugging a RAG Assistant with a Vector DB and a k-NN LM Memory
Case Review: Diagnosing Conflicting Answers in a Hybrid Retrieval System