1Cademy - Extending k-NN Datastore Context with a Training Dataset

Learn Before

Populating a k-NN Datastore for Language Modeling

Concept

Extending k-NN Datastore Context with a Training Dataset

Instead of drawing context only from the current sequence, the k-NN datastore can be populated with key-value pairs from a larger collection of sequences, such as an entire training dataset. At prediction time, a Large Language Model can then retrieve neighbors from contexts seen anywhere in that collection, not just earlier in the current input, which gives it a more general context for making predictions.
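As a rough illustration, the sketch below builds such a datastore from a tiny two-sequence "training set" and queries it. It assumes each key is a vector representing a prefix and each value is the token that followed that prefix. The names `encode_context`, `build_datastore`, and `knn_next_tokens`, the toy vocabulary, and the random embedding table are all hypothetical; in a real system the keys would typically come from the language model's own hidden states, not from this stand-in encoder.

```python
import numpy as np

# Toy vocabulary and a fixed random embedding table (assumption: stands in
# for representations that a real language model would produce).
VOCAB = ["the", "cat", "dog", "sat", "ran", "on", "mat"]
TOKEN_ID = {tok: i for i, tok in enumerate(VOCAB)}
EMBED = np.random.default_rng(0).normal(size=(len(VOCAB), 8))


def encode_context(prefix):
    """Hypothetical encoder: recency-weighted sum of token embeddings."""
    vec = np.zeros(EMBED.shape[1])
    for age, tok in enumerate(reversed(prefix)):
        vec += (0.5 ** age) * EMBED[TOKEN_ID[tok]]
    return vec


def build_datastore(sequences):
    """Store one (context vector, next token) pair per position in every sequence."""
    keys, values = [], []
    for seq in sequences:
        for t in range(1, len(seq)):
            keys.append(encode_context(seq[:t]))  # key: representation of the prefix
            values.append(seq[t])                 # value: the token that followed it
    return np.stack(keys), values


def knn_next_tokens(keys, values, prefix, k=3):
    """Return the next tokens stored with the k keys nearest to the query prefix."""
    query = encode_context(prefix)
    dists = np.linalg.norm(keys - query, axis=1)  # Euclidean distance to every key
    nearest = np.argsort(dists)[:k]
    return [values[i] for i in nearest]


# The datastore covers the whole collection, not just the sequence being predicted.
training_set = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "ran", "on", "the", "mat"],
]
keys, values = build_datastore(training_set)
print(knn_next_tokens(keys, values, ["the", "cat"], k=3))
```

Because every position of every sequence contributes a pair, the datastore grows with the total number of tokens in the collection. The brute-force distance computation here is only practical at toy scale; a large datastore would usually need an approximate nearest-neighbor index.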

Updated 2026-04-23
