Learn Before
Sequence-Level Loss
A sequence-level loss is a type of loss function that computes the total error over an entire data sequence. Instead of evaluating each element of the sequence independently, it aggregates the individual losses from each step—often through summation or averaging—to provide a single measure of the model's performance on the whole sequence.
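The aggregation described above can be sketched in a few lines of Python. This is a minimal illustration, not a library API; the function name `sequence_level_loss` and the `reduction` parameter are assumptions for the example.

```python
def sequence_level_loss(step_losses, reduction="sum"):
    # Aggregate per-step losses into a single value for the whole sequence.
    # reduction="sum" adds the step losses; "mean" averages them.
    total = sum(step_losses)
    if reduction == "mean":
        return total / len(step_losses)
    return total

# Per-step losses for a 4-element sequence
per_step = [0.1, 0.3, 0.2, 0.4]
summed = sequence_level_loss(per_step)
averaged = sequence_level_loss(per_step, reduction="mean")
```

Both reductions yield one scalar measuring performance on the entire sequence, which is what gradient-based training needs to update the model's parameters.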
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Sample-wise Negative Log-Likelihood Loss for a Sub-sequence
Sequence-Level Loss
An engineer is training a model on a large dataset. They are monitoring two metrics:
- Metric A: A value calculated for each individual data sample. This value fluctuates significantly from one sample to the next.
- Metric B: A single, aggregate value calculated after the model has processed the entire training dataset. This value shows a steady, downward trend over multiple passes through the dataset.
Based on the standard terminology for measuring a model's performance, what is the most accurate way to classify these two metrics?
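The distinction the scenario draws can be made concrete in code. This is a hedged sketch, assuming Metric A is a sample-wise loss (one value per example, noisy) and Metric B is its mean over the full dataset (one aggregate value per pass); the helper names below are hypothetical.

```python
import random

random.seed(0)  # make the illustration reproducible

def per_sample_loss(sample):
    # Hypothetical sample-wise loss: varies from one sample to the next (Metric A).
    return abs(sample) + random.uniform(0.0, 0.5)

dataset = [0.2, 1.1, 0.5, 0.9, 0.3]

sample_losses = [per_sample_loss(x) for x in dataset]     # Metric A: one value per sample
dataset_loss = sum(sample_losses) / len(sample_losses)    # Metric B: one aggregate value per pass
```

Metric A fluctuates because each sample is different; Metric B smooths those fluctuations into a single number, so its trend across passes is the clearer signal of overall progress.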
Interpreting Training Metrics
Match each term to its most accurate description regarding how a model's performance is measured during training.
Learn After
Loss Function for RNN
Sample-wise Negative Log-Likelihood Loss for a Sub-sequence
Cross-Entropy Loss for Knowledge Distillation
A language model is being trained to generate the four-word sentence 'The quick brown fox'. The model generates one word at a time, and the error (loss) is calculated at each step:
- Loss for 'The' = 0.1
- Loss for 'quick' = 0.3
- Loss for 'brown' = 0.2
- Loss for 'fox' = 0.4
To update the model's parameters, the training process computes a single, overall loss value for the entire sentence. Which statement best analyzes this method of calculating the overall loss?
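The arithmetic for this scenario is a direct application of the two standard aggregations. A minimal worked example, using the per-word losses given above:

```python
# Per-word losses from the scenario
per_word = {"The": 0.1, "quick": 0.3, "brown": 0.2, "fox": 0.4}

total_loss = sum(per_word.values())        # sequence-level loss via summation
average_loss = total_loss / len(per_word)  # sequence-level loss via averaging

print(round(total_loss, 6))    # 1.0
print(round(average_loss, 6))  # 0.25
```

Summation (total 1.0) weights longer sequences more heavily; averaging (0.25) normalizes by length, which is why the choice of reduction matters when sequences vary in length.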
Total Loss Calculation for a Token Sequence
Calculating Average Sequence-Level Loss
Evaluating Training Strategies for a Translation Model