1Cademy - Using Optimized Predictions as Learning Targets

Learn Before

Prediction via Optimization

Concept

Using Optimized Predictions as Learning Targets

In some training methods, the learning target is generated by the model itself rather than taken from a pre-existing ground-truth label. The model first identifies the output that maximizes an objective function, such as the log-probability it currently assigns to candidate outputs for a given input. That optimized output is then treated as the target in a loss function, and the resulting loss guides the model's parameter updates.
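The idea can be illustrated with a minimal sketch, assuming a toy model whose only parameters are one score (logit) per candidate output for a single input. The function names, the three-candidate setup, and the learning rate are all hypothetical choices made for illustration and do not come from the source. The sketch picks the highest log-probability output as the target, computes the negative log-probability of that target as the loss, and takes one gradient step with the target held fixed.

```python
import math

def log_softmax(logits):
    """Return log-probabilities for a list of unnormalized scores."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]

def optimized_target(logits):
    """Index of the output with the highest log-probability under the current model."""
    log_probs = log_softmax(logits)
    return max(range(len(log_probs)), key=lambda i: log_probs[i])

def self_target_step(logits, lr=0.5):
    """One gradient step on the loss -log p(target), treating the target as fixed."""
    target = optimized_target(logits)
    log_probs = log_softmax(logits)
    loss = -log_probs[target]
    # Gradient of -log softmax(logits)[target] w.r.t. the logits
    # is softmax(logits) - one_hot(target).
    grad = [math.exp(lp) - (1.0 if i == target else 0.0)
            for i, lp in enumerate(log_probs)]
    new_logits = [v - lr * g for v, g in zip(logits, grad)]
    return target, loss, new_logits

# Hypothetical scores for three candidate outputs (illustrative values only).
logits = [0.2, 1.0, -0.5]
target, loss, new_logits = self_target_step(logits)
new_loss = -log_softmax(new_logits)[target]
print(target, loss, new_loss)
```

In this toy setting the update raises the probability of the output the model already preferred, so the same output remains the target on the next step. A real model would search over a much larger output space and share parameters across inputs, which this sketch does not attempt to capture.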

Updated 2026-01-15

References

Reference of Foundations of Large Language Models Course

Learn After

Log-Probability Loss with Model-Generated Target
A research team is training a generative model using a method where the learning target for any given input is the output that the model itself currently calculates as having the highest probability. This self-generated target is then used to update the model's parameters. Which statement best analyzes a key implication of this training approach?
Self-Reinforcing Training Strategy for a Chatbot
Contrasting Learning Target Methodologies
