Case Study

Next-Token Prediction with External Memory

A language model is processing the sentence: 'The marine biologist cataloged the species found near the hydrothermal ___'. The model's internal representation for this context is used to search a large datastore of previously encountered contexts and their corresponding next words. The search returns the following top three most similar contexts and the words that followed them:

  1. Context: '...deep-sea submersible approached the volcanic ___' -> Next Word: 'vent'
  2. Context: '...unusual life forms thrive around the sulfuric ___' -> Next Word: 'vent'
  3. Context: '...the camera captured footage of the abyssal ___' -> Next Word: 'plain'

Based on its parametric training alone, the model's most likely predictions were 'reef', 'area', and 'zone'. How will the information retrieved from the datastore influence the model's final prediction, and what is the underlying principle behind this adjustment?
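The scenario describes a retrieval-augmented setup in the spirit of kNN-LM (Khandelwal et al., 2020): a nonparametric distribution built from the nearest retrieved contexts is interpolated with the parametric model's own distribution, so frequently retrieved continuations like 'vent' can overtake generic priors like 'reef'. The sketch below illustrates this under assumed values: the toy context vectors, the interpolation weight `lam`, the `temperature`, and all probability numbers are hypothetical stand-ins, not values from the case study.

```python
import numpy as np

# A minimal sketch of kNN-LM-style next-token prediction. All vectors,
# words, and probabilities are hypothetical illustrations.

rng = np.random.default_rng(0)
DIM = 8  # toy hidden-state dimensionality

# Datastore: (context representation, next word) pairs from previously
# seen text. Three entries mirror the retrieved examples above.
datastore_keys = rng.normal(size=(3, DIM))
datastore_words = ["vent", "vent", "plain"]

# Query: the model's hidden state for "...near the hydrothermal ___".
# Nudged toward the first key so the toy retrieval behaves sensibly.
query = datastore_keys[0] + 0.1 * rng.normal(size=DIM)

def knn_distribution(query, keys, words, k=3, temperature=1.0):
    """Build a next-word distribution from the k nearest datastore
    entries, weighting closer contexts more heavily (softmax over
    negative distance)."""
    dists = np.linalg.norm(keys - query, axis=1)
    nearest = np.argsort(dists)[:k]
    weights = np.exp(-dists[nearest] / temperature)
    weights /= weights.sum()
    probs = {}
    for idx, w in zip(nearest, weights):
        # 'vent' appears twice, so its retrieval probability accumulates.
        probs[words[idx]] = probs.get(words[idx], 0.0) + w
    return probs

# The parametric model's own (hypothetical) top predictions.
lm_probs = {"reef": 0.30, "area": 0.25, "zone": 0.20, "vent": 0.05}

knn_probs = knn_distribution(query, datastore_keys, datastore_words)

# Interpolate: p(w | context) = lam * p_kNN(w) + (1 - lam) * p_LM(w)
lam = 0.5
vocab = set(lm_probs) | set(knn_probs)
final = {w: lam * knn_probs.get(w, 0.0) + (1 - lam) * lm_probs.get(w, 0.0)
         for w in vocab}

for word, p in sorted(final.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {p:.3f}")
```

Running this, 'vent' receives nearly all of the retrieval mass (two of the three nearest neighbors) and overtakes 'reef' after interpolation, even though the base model assigned it low probability. That is the underlying principle: the external memory supplies evidence from similar past contexts, and the interpolation lets that evidence shift the final distribution toward domain-specific continuations the parametric model underweights.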


Tags: Ch.2 Generative Models - Foundations of Large Language Models; Foundations of Large Language Models Course; Computing Sciences; Application in Bloom's Taxonomy; Cognitive Psychology
