Learn Before
General Language Modeling Objective based on Joint Log-Probability
A general and foundational approach to language modeling involves treating an input x and an output y as a single, concatenated sequence. The training objective in this framework is to model the joint log-probability of the sequence, log Prθ(x, y). This is accomplished by minimizing a loss function that is calculated over all tokens in the combined sequence, [x, y], based on the chain rule of probability.
0
1
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Conditional vs. Joint Probability Objectives in Language Modeling
Relationship Between Joint, Conditional, and Marginal Log-Probabilities of Sequences
General Language Modeling Objective based on Joint Log-Probability
A language model is being used to determine the likelihood of a specific sentence. Let the input sequence
xbe 'The sun is' and the output sequenceybe 'shining brightly'. The notationPr([x, y])represents the probability of the model generating the full, combined sequence. Which statement best analyzes what this probability value signifies?Analysis of Sequence Order on Joint Probability
Conditional Log-Probability via Joint and Marginal Log-Probabilities
Model Comparison Using Joint Sequence Probability
Learn After
Conditional vs. Joint Probability Objectives in Language Modeling
A language model is being trained with the objective of modeling the joint probability of an input sequence
xand an output sequencey, which are treated as a single, concatenated sequence. During a single training step for this combined sequence, how is the model's performance error (loss) calculated?Evaluating a Training Objective for a Base Model
A language model is being trained with the objective of modeling the joint probability of a combined sequence
[x, y]. For this objective, the model's parameters are updated based only on its ability to correctly predict the tokens in the output sequencey.