Formula

Formula for Memory as a Weighted Moving Average of Keys and Values

The memory component, $\mathrm{Mem}$, can be computed as a weighted moving average of the last $n_c$ key and value vectors. Each position in the window has its own weight $\beta_1, \dots, \beta_{n_c}$: $\beta_1$ is applied to the oldest pair $(\mathbf{k}_{i-n_c+1}, \mathbf{v}_{i-n_c+1})$ and $\beta_{n_c}$ to the most recent pair $(\mathbf{k}_i, \mathbf{v}_i)$. Both weighted sums are normalized by the total weight, so setting all weights equal recovers the plain moving average. Formally:

$$\mathrm{Mem} = \left( \frac{\sum_{j=i-n_c+1}^{i} \beta_{j-i+n_c}\, \mathbf{k}_{j}}{\sum_{j=1}^{n_c} \beta_j},\ \frac{\sum_{j=i-n_c+1}^{i} \beta_{j-i+n_c}\, \mathbf{v}_{j}}{\sum_{j=1}^{n_c} \beta_j} \right)$$
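The weighted moving average of keys and values can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `weighted_memory`, the array layout (one row per position), and the 1-based position argument `i` are assumptions made here for clarity.

```python
import numpy as np

def weighted_memory(keys, values, beta, i):
    """Weighted moving average of the last n_c keys and values.

    keys, values: arrays of shape (seq_len, d), one row per position.
    beta: weights of shape (n_c,); beta[0] weights the oldest pair in the window.
    i: current position, 1-based as in the formula.
    (All names and conventions here are illustrative assumptions.)
    """
    keys = np.asarray(keys, dtype=float)
    values = np.asarray(values, dtype=float)
    beta = np.asarray(beta, dtype=float)
    n_c = beta.shape[0]
    if i < n_c or i > keys.shape[0]:
        raise ValueError("window [i - n_c + 1, i] must lie inside the sequence")

    # Positions i-n_c+1 .. i (1-based) correspond to the 0-based slice [i-n_c, i).
    k_win = keys[i - n_c:i]
    v_win = values[i - n_c:i]

    # Numerators: sum_j beta_{j-i+n_c} * k_j (and v_j); denominator: sum_j beta_j.
    norm = beta.sum()
    mem_k = beta @ k_win / norm
    mem_v = beta @ v_win / norm
    return mem_k, mem_v

# Example: 4 positions, 2-dimensional vectors, window size n_c = 3.
keys = np.arange(8).reshape(4, 2)
values = keys * 10
mem_k, mem_v = weighted_memory(keys, values, beta=[1.0, 2.0, 3.0], i=4)
```

With uniform weights, for example `beta=[1.0, 1.0, 1.0]`, the result equals the unweighted mean of the windowed rows, which matches the plain moving-average case.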

Updated 2026-04-22

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences