1Cademy - A language model is evaluated on a sequence of four tokens, `(x_0, x_1, x_2, x_3)`. The model's performance is measured by calculating a loss value at each step of the sequence generation. The individual losses are as follows: the loss for predicting token `x_1` is 1.2, the loss for predicting `x_2` is 0.5, and the loss for predicting `x_3` is 2.3. Based on this information, what is the total loss for the entire token sequence?

Learn Before

Total Loss Calculation for a Token Sequence

Multiple Choice

A language model is evaluated on a sequence of four tokens, (x_0, x_1, x_2, x_3). The model's performance is measured by calculating a loss value at each step of the sequence generation. The individual losses are as follows: the loss for predicting token x_1 is 1.2, the loss for predicting x_2 is 0.5, and the loss for predicting x_3 is 2.3. Based on this information, what is the total loss for the entire token sequence?
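
The arithmetic can be sketched as follows, assuming that "total loss" here means the plain sum of the per-step losses (not their average) and that no loss is counted for x_0, since the question reports none for it. The variable names are illustrative only.

```python
import math

# Per-step losses as stated in the question. x_0 starts the sequence,
# and the question gives no loss for it.
step_losses = {"x_1": 1.2, "x_2": 0.5, "x_3": 2.3}

# Assumption: total sequence loss = sum of the per-step losses.
total_loss = math.fsum(step_losses.values())

print(f"total loss = {total_loss:.1f}")  # prints "total loss = 4.0"
```

Under that assumption the total is 1.2 + 0.5 + 2.3 = 4.0. If the intended metric were the mean loss per predicted token instead, the value would be 4.0 / 3, so it is worth checking which definition the prerequisite topic uses.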

Updated 2025-09-28
