Learn Before
A language model is evaluated on a sequence of four tokens, (x_0, x_1, x_2, x_3). The model's performance is measured by calculating a loss value at each step of the sequence generation. The individual losses are as follows: the loss for predicting token x_1 is 1.2, the loss for predicting x_2 is 0.5, and the loss for predicting x_3 is 2.3. Based on this information, what is the total loss for the entire token sequence?
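The arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes the total sequence loss is the plain sum of the per-step losses, and that x_0 has no loss of its own because it serves only as the starting context. The names `step_losses` and `total_loss` are made up for this example.

```python
# Per-step losses given in the question (x_0 is context only, so it has no loss term here).
step_losses = {
    "x_1": 1.2,
    "x_2": 0.5,
    "x_3": 2.3,
}

# Assumption: the sequence loss is the unweighted sum of the per-step losses.
total_loss = sum(step_losses.values())

# Format to one decimal place to hide floating-point rounding noise.
print(f"Total loss: {total_loss:.1f}")
```

Under that assumption, the three losses add up to 1.2 + 0.5 + 2.3 = 4.0.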
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.3 Prompting - Foundations of Large Language Models
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Pre-training Objective for Language Models
Example of a Token Sequence
Example of an Indexed Token Sequence
A language model is evaluated on a sequence of four tokens, (x_0, x_1, x_2, x_3). The model's performance is measured by calculating a loss value at each step of the sequence generation. The individual losses are as follows: the loss for predicting token x_1 is 1.2, the loss for predicting x_2 is 0.5, and the loss for predicting x_3 is 2.3. Based on this information, what is the total loss for the entire token sequence?
Comparative Model Performance Analysis
A language model's performance is being evaluated on the token sequence ('The', 'cat', 'sat', 'on'). The total loss for this sequence is calculated by summing the individual losses from each predictive step. Which of the following sets of predictions contributes to this total loss calculation?
Ground-Truth Distribution as a One-Hot Representation