Case Study

Configuring Memory Component Weights

A team is developing a language model that uses a memory component to keep track of recent information. The memory is a pair of summary vectors, each computed as a weighted moving average over the last n_c key (k) or value (v) vectors ending at position i:

$$
\mathrm{Mem} = \left( \frac{\sum_{j=i-n_c+1}^{i} \beta_{j-i+n_c}\, k_j}{\sum_{j=1}^{n_c} \beta_j},\; \frac{\sum_{j=i-n_c+1}^{i} \beta_{j-i+n_c}\, v_j}{\sum_{j=1}^{n_c} \beta_j} \right)
$$

The team's goal is for the model to assign greater importance to the most recent information within its memory window compared to older information. Based on the structure of this formula, describe the characteristic that the weight vector β = [β_1, β_2, ..., β_{n_c}] should have to achieve this goal, and explain your reasoning.
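
To make the index mapping in the formula concrete, here is a minimal sketch in plain Python. The function name weighted_memory, the 1-indexed position argument, and the toy one-dimensional keys and values are all assumptions made for illustration; they are not part of the case study. The sketch only evaluates the formula for a given β and does not prescribe any particular choice of weights.

```python
def weighted_memory(keys, values, beta, i):
    """Evaluate Mem at 1-indexed position i for a window of n_c = len(beta) vectors.

    keys and values are lists of equal-length float vectors (hypothetical toy data).
    """
    n_c = len(beta)
    if i < n_c or i > len(keys) or len(keys) != len(values):
        raise ValueError("window must fit inside the sequence")
    total = sum(beta)
    if total == 0:
        raise ValueError("weights must not sum to zero")

    mem_k = [0.0] * len(keys[0])
    mem_v = [0.0] * len(values[0])
    # j runs over the 1-indexed positions i-n_c+1 .. i, as in the formula.
    for j in range(i - n_c + 1, i + 1):
        # beta_{j-i+n_c} in the formula; subtract 1 for Python's 0-indexing.
        w = beta[j - i + n_c - 1]
        for d, x in enumerate(keys[j - 1]):
            mem_k[d] += w * x
        for d, x in enumerate(values[j - 1]):
            mem_v[d] += w * x
    return [x / total for x in mem_k], [x / total for x in mem_v]


# Toy sequence of four positions with one-dimensional keys and values.
keys = [[1.0], [2.0], [3.0], [4.0]]
values = [[10.0], [20.0], [30.0], [40.0]]

# All weight on beta_1: only the oldest vector in the window (position 2) survives.
oldest_only = weighted_memory(keys, values, [1.0, 0.0, 0.0], i=4)
# All weight on beta_{n_c}: only the most recent vector (position 4) survives.
newest_only = weighted_memory(keys, values, [0.0, 0.0, 1.0], i=4)
# Equal weights: a plain average over positions 2, 3, 4.
uniform = weighted_memory(keys, values, [1.0, 1.0, 1.0], i=4)
```

In this sketch, β_1 multiplies the oldest vector in the window (j = i - n_c + 1) and β_{n_c} multiplies the vector at the current position (j = i), which is the structural fact the question asks you to reason from.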

Updated 2025-10-03

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy
