Shifting Perspectives on Recurrent Models
A machine learning practitioner from the early 2010s stated, 'Recurrent models are fundamentally unsuited for tasks requiring an understanding of dependencies across very long sequences.' Explain why this statement was largely considered true at the time, and describe the key change that has made modern variants of these models effective for such tasks.
0
1
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Computing Sciences
Foundations of Large Language Models Course
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
An engineering team is building a system to process and understand the full text of lengthy legal documents, which can be tens of thousands of words long. One engineer argues against using any model with a recurrent structure, citing their historical inability to capture relationships between distant parts of a text. A second engineer suggests that specific, modern variants of recurrent models are well-suited for this exact challenge. Based on the development of these architectures over time, which of the following assessments is most accurate?
Shifting Perspectives on Recurrent Models
The core operational principle of summarizing an input sequence into a fixed-size set of hidden states is the fundamental reason why even the most advanced recurrent model variants remain inherently less effective than other architectural approaches for tasks involving very long-distance dependencies.