Example

Linear Combination of Local and External Attention

When incorporating both a local memory, $\mathrm{Mem}$, and a retrieved long-term memory, $\mathrm{Mem}_{k\mathrm{nn}}$, into a language model, one architectural approach is to process them in separate attention steps. As exemplified by the model developed by Wu et al. (2021), the outputs from the local attention mechanism and the external $k$-NN attention mechanism can then be linearly combined to produce the final representation.
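The combination described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: standard scaled dot-product attention is run once over the local context and once over the retrieved memory entries, and the two outputs are mixed by a gate `g`. The helper names (`attention`, `combined_attention`) and the use of a single scalar gate are assumptions for illustration; in practice the gate is typically a learned, per-head parameter.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention: softmax(q k^T / sqrt(d)) v
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    return softmax(scores) @ v

def combined_attention(q, local_k, local_v, mem_k, mem_v, g):
    # Attention over the local memory (the current context, Mem)
    local_out = attention(q, local_k, local_v)
    # External attention over retrieved entries (Mem_knn); mem_k/mem_v
    # stand for the top-k key/value pairs returned by k-NN lookup
    mem_out = attention(q, mem_k, mem_v)
    # Linear combination of the two attention outputs via gate g in (0, 1)
    return g * local_out + (1.0 - g) * mem_out
```

With `g = 1.0` the model reduces to purely local attention, and with `g = 0.0` it attends only to the retrieved memory, so the gate interpolates smoothly between the two sources.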


Updated 2026-04-23

Tags

Foundations of Large Language Models

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences