Learn Before
A language model architecture is designed to process a query by using two parallel computational streams: one that computes attention over a local memory of recent context, and another that searches an external datastore for relevant information. Match each architectural component with its primary function in this process.
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Gated Combination of Local and k-NN Attention
An advanced language model is designed to act as a conversational partner while also drawing on a vast external knowledge base. When processing a user's query, the model employs a dual-path architecture:
- One path calculates attention over the recent conversational history (the "local context").
- A parallel path performs a similarity search on the external knowledge base to find the most relevant documents and then calculates attention over the content of those documents.

The outputs from both paths are then integrated to form the final response.
What is the primary architectural advantage of processing local context and retrieved knowledge in two separate, parallel streams?
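The dual-path design described above can be sketched in code. The following is a minimal, illustrative NumPy sketch (not the implementation of any specific published model): one path runs scaled dot-product attention over the local context, a parallel path performs a k-NN similarity search over an external datastore and attends only over the retrieved entries, and a learned sigmoid gate blends the two streams. The function names (`attention`, `gated_dual_path`) and the scalar `gate_logit` parameter are hypothetical choices for this sketch.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention for a single query vector q.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def gated_dual_path(q, local_k, local_v, store_k, store_v,
                    gate_logit, top_k=2):
    # Path 1: attention over the local context window.
    local_out = attention(q, local_k, local_v)

    # Path 2: k-NN similarity search over the external datastore,
    # then attention over only the retrieved neighbours.
    sims = store_k @ q                      # similarity of q to every stored key
    nn = np.argsort(-sims)[:top_k]          # indices of the top-k neighbours
    knn_out = attention(q, store_k[nn], store_v[nn])

    # A learned gate (here a single logit for illustration) blends
    # the two streams into one output.
    g = 1.0 / (1.0 + np.exp(-gate_logit))   # sigmoid
    return g * local_out + (1.0 - g) * knn_out
```

Because the two streams stay separate until the final gated sum, the retrieval path can be scaled, updated, or swapped out without retraining the local-attention path, which is one way to frame the "primary architectural advantage" the question asks about.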
Architectural Solution for Long-Term Context