An encoder-decoder model is being trained on a denoising task. Its goal is to reconstruct an original sentence from a corrupted version. During one training step, the model must generate the target sentence: 'The quick brown fox jumps.' The model generates the following output, one word at a time: 'The quick brown foxx jumps.' Based on how the training loss is typically computed for this type of task, which statement best describes how the error signal is calculated?
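For context, the loss for this kind of denoising task is typically computed token by token: at each output position, the decoder's predicted distribution is compared against the single correct target token via cross-entropy, so the position where the model produced 'foxx' instead of 'fox' contributes a large per-token term while correct positions contribute small ones. A minimal sketch, using made-up per-token probabilities (not values from the question):

```python
import math

# Hypothetical numbers: the decoder's predicted probability for the
# *correct* target token at each of the five positions. At position 3
# the model put most of its mass on the wrong token "foxx", so the
# probability assigned to the correct token "fox" is low.
target_tokens = ["The", "quick", "brown", "fox", "jumps."]
p_correct = [0.95, 0.90, 0.92, 0.05, 0.88]

# Per-position cross-entropy: -log p(correct token | prefix).
per_token_loss = [-math.log(p) for p in p_correct]

# The error signal is largest at the position of the mistake.
worst = max(range(len(per_token_loss)), key=lambda i: per_token_loss[i])
print(target_tokens[worst])

# The total loss sums (or averages) the per-token terms.
total_loss = sum(per_token_loss)
print(round(total_loss, 3))
```

Note that the single misspelled token dominates the total: one low-probability position outweighs four confident, correct ones.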
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Related
You are training an encoder-decoder model on a denoising task. For a single training example, arrange the following steps in the correct order to describe how the total loss is calculated for the target output sequence.
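The step list itself is not reproduced above, but one plausible ordering can be sketched in code: teacher-forced decoding yields logits at each target position, a softmax turns each into a distribution, the negative log-probability of the gold token gives a per-token loss, and these are summed into the total. The vocabulary and logits below are invented for illustration:

```python
import math

# Toy setup (all values hypothetical): a tiny vocabulary and a
# five-token gold target sequence.
vocab = ["The", "quick", "brown", "fox", "jumps.", "foxx"]
target = ["The", "quick", "brown", "fox", "jumps."]

# Step 1: teacher forcing — each position is predicted from the gold
# prefix, producing one logit vector per target position (made up here).
logits = [
    [5.0, 0.1, 0.1, 0.1, 0.1, 0.1],   # position 0, gold "The"
    [0.1, 5.0, 0.1, 0.1, 0.1, 0.1],   # position 1, gold "quick"
    [0.1, 0.1, 5.0, 0.1, 0.1, 0.1],   # position 2, gold "brown"
    [0.1, 0.1, 0.1, 1.0, 0.1, 2.0],   # position 3: "foxx" outscores "fox"
    [0.1, 0.1, 0.1, 0.1, 5.0, 0.1],   # position 4, gold "jumps."
]

def softmax(xs):
    # Step 2: normalize logits into a probability distribution.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

# Step 3: per-position negative log-likelihood of the gold token.
per_token = []
for pos, gold in enumerate(target):
    probs = softmax(logits[pos])
    per_token.append(-math.log(probs[vocab.index(gold)]))

# Step 4: sum the per-token losses into the total loss for the example.
total_loss = sum(per_token)
```

The key ordering point is that the forward pass and normalization happen before any comparison to the target, and aggregation over positions happens last.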
Analyzing Training Loss in a Sequence Generation Task