Concept

Weighted Moving Average for Memory Component

To give varying levels of importance to past information, a weighted moving average can be used to create summary vectors for the memory component ($\mathrm{Mem}$). This method applies different weights, or coefficients ($\beta_1, \dots, \beta_{n_c}$), to the key and value vectors within the attention window. The specific values for these coefficients can be either learned as model parameters or determined via heuristics.
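The idea can be illustrated with a minimal sketch in plain Python. It assumes the summary is normalized by the sum of the coefficients and that the window holds the most recent $n_c$ key and value vectors, with $\beta_1$ applied to the oldest; neither detail is stated above. The function names `weighted_moving_average` and `summarize_memory` are hypothetical, chosen only for illustration.

```python
def weighted_moving_average(vectors, betas):
    """Weighted average of the last len(betas) vectors.

    Assumption: the result is divided by sum(betas), so the summary
    stays on the same scale as the individual vectors.
    """
    n_c = len(betas)
    if len(vectors) < n_c:
        raise ValueError("need at least n_c vectors in the window")
    total = sum(betas)
    if total == 0:
        raise ValueError("coefficients must not sum to zero")

    window = vectors[-n_c:]  # the n_c most recent vectors, oldest first
    dim = len(window[0])
    return [
        sum(beta * vec[d] for beta, vec in zip(betas, window)) / total
        for d in range(dim)
    ]


def summarize_memory(keys, values, betas):
    """Build a (key summary, value summary) pair for the memory."""
    return (
        weighted_moving_average(keys, betas),
        weighted_moving_average(values, betas),
    )


# Toy window of n_c = 3 two-dimensional key and value vectors.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[2.0, 2.0], [4.0, 0.0], [0.0, 4.0]]

# Heuristic coefficients that favor the most recent position.
betas = [1.0, 1.0, 2.0]

mem_k, mem_v = summarize_memory(keys, values, betas)
print(mem_k, mem_v)
```

With these toy inputs the most recent vector counts twice as much as each older one. Setting every coefficient to the same value reduces the summary to a plain moving average; in a trained model the coefficients could instead be stored as learnable parameters.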

Updated 2026-04-22

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences
