Model Selection for Long-Document Summarization
A software development team is building a feature to summarize entire novels, which often contain over 100,000 words. They propose using a standard Transformer-based model for this task. Based on the computational properties of the model's core mechanism, evaluate the feasibility of this approach and justify your conclusion.
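As a starting point for the evaluation, the following minimal sketch estimates how the self-attention score matrix grows with input length. It assumes a standard Transformer whose self-attention compares every token with every other token, so the score matrix has n × n entries for a sequence of n tokens. The function names, the 1.3 tokens-per-word ratio, and the 2-byte (16-bit) entry size are illustrative assumptions, not values from the question, and the sketch ignores every other part of the model.

```python
# Rough, illustrative estimate of self-attention cost versus sequence length.
# All names and constants here are assumptions made for this sketch.

TOKENS_PER_WORD = 1.3  # assumed average; real tokenizers vary by language and text


def attention_score_entries(seq_len: int) -> int:
    """Number of entries in one n x n attention score matrix."""
    return seq_len * seq_len


def score_matrix_bytes(seq_len: int, bytes_per_entry: int = 2) -> int:
    """Memory for one score matrix (one head, one layer), assuming 16-bit entries."""
    return attention_score_entries(seq_len) * bytes_per_entry


def relative_attention_cost(new_len: int, base_len: int) -> float:
    """How many times more score entries new_len needs compared with base_len."""
    return attention_score_entries(new_len) / attention_score_entries(base_len)


# A 100,000-word novel under the assumed tokens-per-word ratio.
novel_tokens = round(100_000 * TOKENS_PER_WORD)

if __name__ == "__main__":
    print("novel tokens:", novel_tokens)
    print("score entries per head per layer:", attention_score_entries(novel_tokens))
    print("score matrix size in GB:", score_matrix_bytes(novel_tokens) / 1e9)
    print("cost relative to 1,000 tokens:", relative_attention_cost(novel_tokens, 1_000))
```

Under these assumptions, a single score matrix for the whole novel already needs tens of gigabytes for one head in one layer, which is the kind of figure the feasibility argument can build on.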
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Language Model Performance Analysis
A developer observes that a standard Transformer-based language model takes approximately 2 seconds to process a text sequence of 500 tokens. Based on the computational properties of the model's core mechanism, what is the most likely processing time if the input sequence length is doubled to 1000 tokens?