Learn Before
A team of engineers is tasked with optimizing a large language model for real-time text summarization of news articles. They observe that the model's processing time is a major bottleneck. To address this, they implement a mechanism that, for each article, dynamically decides to skip processing certain less-informative sentences entirely, thereby reducing the total amount of text fed through the model's most computationally expensive components. Which principle of efficient model inference does this approach best exemplify?
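The mechanism the question describes, pruning less-informative input before it reaches the model's most expensive components, can be sketched as follows. The scoring heuristic, threshold, and `expensive_model` stub are illustrative assumptions, not part of any particular system:

```python
def expensive_model(sentences):
    # Stand-in for the computationally expensive summarization model.
    return " ".join(sentences)

def summarize_efficiently(sentences, score_fn, threshold=0.5):
    """Length-adaptive inference: drop sentences whose informativeness
    score falls below `threshold` so the costly components of the
    model process less text per article."""
    kept = [s for s in sentences if score_fn(s) >= threshold]
    return expensive_model(kept)

# Toy informativeness score: longer sentences treated as more informative.
def score(sentence):
    return min(len(sentence.split()) / 10, 1.0)

article = [
    "Markets fell sharply on Tuesday amid renewed inflation fears.",
    "More below.",
    "The central bank signaled it may raise rates at its next meeting.",
]
print(summarize_efficiently(article, score))
```

Because the amount of skipped text varies per article, total compute adapts to each input rather than being fixed, which is the defining property of this class of dynamic networks.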
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Depth-Adaptive BERT Models
Length-Adaptive BERT Models
Match each description of an efficiency technique for language models with the type of dynamic network it represents.
Optimizing a Language Model for Varied Task Complexity