Learn Before
Comparative Model Performance Analysis
Two language models, Model Alpha and Model Beta, are tasked with predicting the next token in a sequence. Their performance is measured on the same five-token input sequence. The individual loss values for each of the four predictions made by the models are recorded below. Based on these values, which model performed better on this specific sequence? Justify your conclusion.
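Since the recorded loss values do not survive in this excerpt, here is a minimal sketch of the comparison using hypothetical per-prediction losses (the numbers below are illustrative placeholders, not the card's actual data). The total sequence loss is the sum of the per-step losses, and the model with the lower total performed better.

```python
# Hypothetical per-prediction losses for the four predictions each model
# makes on the shared five-token sequence (illustrative values only).
alpha_losses = [1.1, 0.4, 0.9, 0.6]
beta_losses = [1.5, 0.7, 1.2, 0.8]

# Total sequence loss is the sum of the per-step losses; lower is better.
total_alpha = sum(alpha_losses)
total_beta = sum(beta_losses)
better = "Model Alpha" if total_alpha < total_beta else "Model Beta"
print(better)
```

With these placeholder values Model Alpha's total is lower, so it would be judged the better performer on this sequence.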
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.3 Prompting - Foundations of Large Language Models
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Pre-training Objective for Language Models
Example of a Token Sequence
Example of an Indexed Token Sequence
A language model is evaluated on a sequence of four tokens,
(x_0, x_1, x_2, x_3). The model's performance is measured by calculating a loss value at each step of the sequence generation. The individual losses are as follows: the loss for predicting token x_1 is 1.2, the loss for predicting x_2 is 0.5, and the loss for predicting x_3 is 2.3. Based on this information, what is the total loss for the entire token sequence?
Comparative Model Performance Analysis
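Assuming the total loss is the plain sum of the per-step losses (as this card's question implies), the arithmetic for the four-token sequence (x_0, x_1, x_2, x_3) works out to 1.2 + 0.5 + 2.3 = 4.0:

```python
# Per-step losses from the card: predicting x_1, x_2, x_3 from their contexts.
# (x_0 is the starting token, so no loss is computed for it.)
step_losses = {"x_1": 1.2, "x_2": 0.5, "x_3": 2.3}

# Total sequence loss is the sum over all predictive steps.
total_loss = sum(step_losses.values())
print(round(total_loss, 6))  # 4.0
```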
A language model's performance is being evaluated on the token sequence
('The', 'cat', 'sat', 'on'). The total loss for this sequence is calculated by summing the individual losses from each predictive step. Which of the following sets of predictions contributes to this total loss calculation?
Ground-Truth Distribution as a One-Hot Representation
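A sketch of which predictive steps contribute to the total loss for the sequence ('The', 'cat', 'sat', 'on'): each step predicts the next token given the preceding context, so a four-token sequence yields three loss-contributing predictions (no model is actually run here; this only enumerates the steps):

```python
tokens = ["The", "cat", "sat", "on"]

# Each predictive step pairs the context seen so far with the next token
# to predict; the first token has no preceding context and incurs no loss.
steps = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]
for context, target in steps:
    print(f"predict {target!r} given {context}")
```

This yields the three predictions P('cat' | 'The'), P('sat' | 'The', 'cat'), and P('on' | 'The', 'cat', 'sat'), whose losses sum to the sequence total.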