Learn Before
A model is trained using a two-stage process. In the first stage, given an input context c, the model identifies an optimal output sequence, ŷ. In the second stage, the model's parameters are updated to maximize the probability of generating that same sequence ŷ, but this time conditioned on a slightly modified version of the original context, c'. What is the primary reason for using the modified context c' in the second stage instead of the original context c?
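The two-stage procedure described above can be sketched with a toy categorical model. This is a minimal illustration, not an actual LLM training loop: the contexts, vocabulary, and parameter values are all invented for the example, and a single logit vector per context stands in for a real network.

```python
import math

# Toy vocabulary; "c" is the full context, "c_prime" a modified version of it.
# All names and numbers here are illustrative assumptions.
VOCAB = ["a", "b", "c"]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

# One logit vector per context string: a stand-in for a real model.
params = {
    "c":       [2.0, 0.5, 0.1],   # full context: model is already confident
    "c_prime": [0.0, 0.0, 0.0],   # modified context: uniform at first
}

def predict(context):
    """Stage 1: greedy decoding — pick the most probable token as y-hat."""
    probs = softmax(params[context])
    return max(range(len(VOCAB)), key=lambda i: probs[i])

def train_step(context, target, lr=0.5):
    """Stage 2: one gradient-ascent step on log Pr(target | context),
    i.e. one step minimizing the loss -log Pr(target | context)."""
    probs = softmax(params[context])
    for i in range(len(VOCAB)):
        grad = (1.0 if i == target else 0.0) - probs[i]  # d log p / d logit
        params[context][i] += lr * grad

# Stage 1: obtain the model's own best output under the full context c.
y_hat = predict("c")

# Stage 2: push Pr(y_hat | c') up, conditioning on the modified context c'.
before = softmax(params["c_prime"])[y_hat]
for _ in range(20):
    train_step("c_prime", y_hat)
after = softmax(params["c_prime"])[y_hat]
```

After the second stage, `Pr(ŷ | c')` has increased: the model has been taught to reproduce, under the modified context, the output it originally found under the full context.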
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Computing Sciences
Foundations of Large Language Models Course
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Analysis of a Self-Supervised Training Strategy
A model is trained using a two-stage process. In the first stage, given an input context c, the model identifies an optimal output sequence, ŷ. In the second stage, the model's parameters are updated to maximize the probability of generating that same sequence ŷ, but this time conditioned on a slightly modified version of the original context, c'. What is the primary reason for using the modified context c' in the second stage instead of the original context c? Consider a training process where the objective function is defined as
Loss = −log Pr(ŷ | c', z), with ŷ being an optimal prediction generated by the model itself. During training, the model's parameters are updated with the goal of minimizing this specific Loss value.