A research team is designing a new language model specifically for summarizing entire books, which involves processing extremely long sequences of text. Their primary constraint is a limited computational budget, which restricts both the training time and the memory available on their hardware. Which of the following architectural goals is most critical for the team to pursue to make their project feasible?
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Architectural Trade-offs for Long-Sequence Modeling
Evaluating Efficient Architectures for Long-Document Analysis