A language model is being trained with the objective of modeling the joint probability of an input sequence x and an output sequence y, which are treated as a single, concatenated sequence. During a single training step for this combined sequence, how is the model's performance error (loss) calculated?
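The distinction can be made concrete with a small sketch. Under the joint objective over the concatenated sequence [x, y], every token contributes to the loss; under a conditional objective p(y | x), the input tokens are masked out. The probabilities and mask below are hypothetical values for illustration, not outputs of any real model.

```python
import math

# Hypothetical per-token probabilities the model assigns to the correct
# next token at each position of the concatenated sequence [x, y].
# The first three positions belong to the input x, the last two to the output y.
probs = [0.9, 0.8, 0.7, 0.6, 0.5]
is_output = [False, False, False, True, True]  # True marks tokens of y

# Joint objective p(x, y): loss is the summed negative log-likelihood
# over ALL tokens of the combined sequence, x and y alike.
joint_loss = -sum(math.log(p) for p in probs)

# Conditional-style variant: input tokens are masked, so only the
# tokens of y contribute to the loss.
conditional_loss = -sum(math.log(p) for p, m in zip(probs, is_output) if m)

print(f"joint loss:       {joint_loss:.4f}")
print(f"conditional loss: {conditional_loss:.4f}")
```

Because the joint objective accumulates error over the input tokens as well, it is always at least as large as the masked variant on the same sequence.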
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Conditional vs. Joint Probability Objectives in Language Modeling
A language model is being trained with the objective of modeling the joint probability of an input sequence x and an output sequence y, which are treated as a single, concatenated sequence. During a single training step for this combined sequence, how is the model's performance error (loss) calculated?
Evaluating a Training Objective for a Base Model
A language model is being trained with the objective of modeling the joint probability of a combined sequence
[x, y]. For this objective, the model's parameters are updated based only on its ability to correctly predict the tokens in the output sequence y.