Learn Before
Auto-Regressive Decoding in Machine Translation
Auto-regressive decoding is the process, used in tasks like machine translation, of generating the target-language output one token at a time. Each new token is conditioned on two sources of information: the tokens already generated in the target sequence and the complete source-language input sequence.
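The loop can be sketched in a few lines of Python. Here a toy `next_token` function stands in for a trained translation model (it is a hypothetical word-for-word lexicon, for illustration only); what matters is the shape of the loop, where every step is conditioned on the full source sentence plus the prefix generated so far:

```python
EOS = "<eos>"

def next_token(source_tokens, prefix):
    """Toy stand-in for a trained model: returns the next target token,
    conditioned on the full source sentence and the tokens generated so far.
    A real system would score the whole vocabulary with a neural network."""
    # Hypothetical English-to-French lexicon, for illustration only.
    lexicon = {"The": "Le", "cat": "chat", "sat": "s'assit"}
    if len(prefix) < len(source_tokens):
        return lexicon[source_tokens[len(prefix)]]
    return EOS

def greedy_decode(source_tokens, max_len=20):
    """Auto-regressive loop: each step feeds the source plus the current
    prefix back into the model, stopping at EOS or a length cap."""
    prefix = []
    for _ in range(max_len):
        tok = next_token(source_tokens, prefix)
        if tok == EOS:
            break
        prefix.append(tok)
    return prefix

print(greedy_decode(["The", "cat", "sat"]))  # ['Le', 'chat', "s'assit"]
```

Note the sequential dependency: the token emitted at step *t* becomes part of the input at step *t + 1*, which is why decoding cannot be parallelized across target positions the way encoding can.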
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Encoder
Decoder
Context vector
Encoder-Decoder with Transformers
Multi-lingual Pre-training for Encoder-Decoder Models
Mathematical Formulation of an Encoder-Decoder Model
Seq2seq Models for Text Generation
Applying Encoder-Decoder Architectures to NLP via the Text-to-Text Framework
A sequence-to-sequence model is designed to translate English sentences into French. When given the English input, 'The quick brown fox jumps over the lazy dog,' the model produces the French output, 'Où est la bibliothèque?' ('Where is the library?'). The generated French sentence is grammatically perfect and fluent, but it is completely unrelated to the meaning of the English input. Based on this specific failure, which component of the underlying architecture is most likely the primary source of the error?
Diagnosing an Architectural Flaw in a Summarization Model
Arrange the following events to accurately describe the flow of information in a standard encoder-decoder architecture for a sequence-to-sequence task.
Your team is pretraining an internal T5-style enco...
Your company wants one internal model to support m...
Your team is pretraining an internal T5-style mode...
Your team is building a single internal T5-style t...
Diagnosing a T5-Style Model That Ignores Task Prefixes After Span-Denoising Pretraining
Choosing Between Span-Denoising Pretraining and Task-Specific Fine-Tuning in a T5-Style Text-to-Text System
Designing a Unified Text-to-Text Model and Pretraining Objective for Multiple NLP Features
Root-Cause Analysis of a T5-Style Model Producing Fluent but Unfaithful Outputs
Selecting an Architecture and Pretraining Objective for a Unified Internal NLP Service
Post-Pretraining Data Formatting Bug in a T5-Style Text-to-Text Service
Pre-training Encoder-Decoder Models via Masked Language Modeling
Learn After
A machine translation system generates an output sentence one word at a time, where each new word is chosen based on the original input sentence and the words already generated. When translating a long, complex sentence, the system produces an output that starts accurately but then devolves into a repetitive, nonsensical phrase (e.g., '...the boy is going to the is going to the is going to the...'). What is the most likely flaw in the generation process that would cause this specific type of error?
A machine translation model is translating the English sentence 'The cat sat' into French. It has already generated the tokens 'Le' and 'chat'. Arrange the following actions in the correct order to generate the very next token in the sequence.
Explaining Divergent Translations