Learn Before
An attention mechanism uses a neural network to maintain a memory of the information it has processed. Arrange the following events in the correct chronological order for a single update step of this memory component.
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Comprehension in Revised Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Formula for Neural Network Memory Update
A computational model is designed to process a sequence of items one by one. To keep a running summary of the sequence, it uses a specific neural network as a memory component. At each step, this network updates its internal state. Suppose at step t=5, the memory network has just produced an output representing its state, which we'll call Mem_prior. The main model has also processed the fifth item in the sequence, resulting in a current state representation called S_current. To generate the new memory state for the next step, what inputs should be fed into the memory network?
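The update step in this scenario can be pictured with a small sketch. It assumes a simple tanh recurrent cell, which is only one possible choice; the function name memory_update, the weight matrices w_mem and w_state, and the toy values are all hypothetical and do not come from the course material. The point it illustrates is that the new memory state is computed from both the prior memory state and the current state representation.

```python
import math

def memory_update(mem_prior, s_current, w_mem, w_state):
    """Hypothetical update rule: tanh(W_mem @ mem_prior + W_state @ s_current)."""
    new_mem = []
    for row_m, row_s in zip(w_mem, w_state):
        # Contribution of the previous memory state.
        total = sum(w * m for w, m in zip(row_m, mem_prior))
        # Contribution of the current state representation.
        total += sum(w * s for w, s in zip(row_s, s_current))
        new_mem.append(math.tanh(total))
    return new_mem

# Toy 2-dimensional example with made-up weights (assumptions for illustration).
w_mem = [[0.5, 0.0], [0.0, 0.5]]
w_state = [[1.0, 0.0], [0.0, 1.0]]

mem = [0.0, 0.0]  # initial memory state
for s in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]):
    # Each step feeds the prior memory and the current state back into the network.
    mem = memory_update(mem, s, w_mem, w_state)
```

In this sketch, the output of one call becomes the mem_prior argument of the next call, which is how information from earlier items can persist across steps.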
An engineer is developing a text summarization model that processes a document sentence by sentence. The model uses a special neural network as a memory component to keep track of the document's overall context. The engineer observes that the model generates excellent summaries for short articles but produces incoherent summaries for long articles, often forgetting information from the initial paragraphs. The main model components responsible for processing individual sentences are confirmed to be working correctly. Based on this observation, which of the following is the most likely malfunction within the memory component's update process?