Learn Before
Applying a Segmentation Strategy for Long-Form Audio
Imagine you are tasked with using a transformer model to generate a transcript for a three-hour-long audio recording. The standard model can only process up to 30 seconds of audio at a time due to memory constraints. Describe a practical, step-by-step method you could implement to transcribe the entire three-hour recording using this constrained model, ensuring the final transcript is coherent.
0
1
Tags
Data Science
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Sequence Parallelism
A team is tasked with using a transformer-based model to summarize an entire book. The standard model architecture cannot process the entire book's text at once due to its length. The team implements a strategy where the book is broken into smaller, manageable chunks, each chunk is processed by the model, and the outputs are then combined. What is the fundamental computational bottleneck in the standard architecture that this segmentation strategy is designed to circumvent?
Analyzing a Hierarchical Transformer for Genomic Data
Applying a Segmentation Strategy for Long-Form Audio