Concept

Multiple Memory Models in Attention

The architecture of memory-based models can be extended to incorporate more than one memory component, motivated by the observation that both local and long-term context are valuable to attention models. Distinct memories then manage different kinds of information: for example, a short-term memory holds recent local context verbatim, while a second memory stores a compressed summary of long-term historical data.
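As a minimal sketch of this idea, the toy class below keeps two key/value stores: a short-term local memory of recent tokens kept verbatim, and a compressed memory built by mean-pooling tokens evicted from the local memory. A query attends over the concatenation of both. All names, sizes, and the mean-pooling compressor are illustrative assumptions, not a specific published architecture.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class DualMemoryAttention:
    """Toy dual-memory attention: a verbatim local memory plus a
    compressed long-term memory (hypothetical names and sizes)."""

    def __init__(self, d, local_size=4, pool=2):
        self.d = d
        self.local_size = local_size  # tokens kept verbatim
        self.pool = pool              # tokens merged per compressed slot
        self.local_k = np.zeros((0, d)); self.local_v = np.zeros((0, d))
        self.comp_k = np.zeros((0, d));  self.comp_v = np.zeros((0, d))

    def write(self, k, v):
        # Append the new token; on overflow, compress the oldest tokens
        # into the long-term memory by mean pooling (an assumed choice).
        self.local_k = np.vstack([self.local_k, k[None]])
        self.local_v = np.vstack([self.local_v, v[None]])
        while len(self.local_k) > self.local_size:
            old_k, self.local_k = self.local_k[:self.pool], self.local_k[self.pool:]
            old_v, self.local_v = self.local_v[:self.pool], self.local_v[self.pool:]
            self.comp_k = np.vstack([self.comp_k, old_k.mean(0, keepdims=True)])
            self.comp_v = np.vstack([self.comp_v, old_v.mean(0, keepdims=True)])

    def read(self, q):
        # Attend over both memories jointly: compressed slots first,
        # then the verbatim recent tokens.
        K = np.vstack([self.comp_k, self.local_k])
        V = np.vstack([self.comp_v, self.local_v])
        w = softmax(q @ K.T / np.sqrt(self.d))
        return w @ V
```

After writing six tokens with `local_size=4` and `pool=2`, the oldest two tokens have been merged into one compressed slot while the last four remain verbatim, so old context stays reachable at a coarser granularity instead of being discarded.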


Updated 2026-04-23


Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences