Loss Function for Conditional Probability Distributions (Loss(p(·|x), p_θ(·|x)))
This formula represents a generic loss function used to train a model. It measures the discrepancy between a target conditional probability distribution, denoted p(·|x), and a parameterized model's predicted distribution, p_θ(·|x), for a given input x. Training typically adjusts the parameters θ to minimize this loss, making the model's distribution p_θ(·|x) as close as possible to the target distribution p(·|x). This framework is common in tasks like knowledge distillation, where a 'student' model (s) learns from a 'teacher' model (t).
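One common instantiation of such a loss (an assumption here; the card does not fix a specific form) is the KL divergence between the target and model distributions. A minimal sketch over a small discrete output vocabulary, with hypothetical probability values:

```python
import math

def kl_loss(target, model):
    """KL divergence D(target || model) between two discrete distributions.

    Both arguments are lists of probabilities over the same outputs;
    the loss is 0 when the distributions match and grows as they diverge.
    """
    return sum(p * math.log(p / q) for p, q in zip(target, model) if p > 0)

# Hypothetical target p(.|x) and two candidate model distributions p_theta(.|x)
target = [0.7, 0.2, 0.1]
close  = [0.6, 0.3, 0.1]   # similar to the target -> small loss
far    = [0.1, 0.2, 0.7]   # very different from the target -> large loss
```

Minimizing this quantity over θ pushes the model distribution toward the target, which is exactly the training goal described above.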

Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
A machine learning team is developing a compact, efficient language model, which we'll call model 's'. The model's behavior is governed by a set of tunable weights, denoted by θ. For a given task, the model receives a simplified context input, c', and a latent variable, z, and then generates a probability distribution over all possible outputs. Which of the following expressions correctly represents this model's output probability distribution?
In the expression Pr_s(·|c′, z; θ), which describes a model's output probability distribution, match each symbol to its correct description.
Applying the Student Model Probability Notation
Learn After
A language model is being trained to predict the next word in a sentence. For the input context 'The sun is shining...', the ideal (target) probability distribution, denoted as p(·|c), gives a high probability to the word 'brightly'. The model's performance is measured by a loss function that compares the model's predicted probability distribution, p_θ(·|c), to the target distribution.
Consider two different sets of model parameters, θ₁ and θ₂:
- With parameters θ₁, the model's distribution predicts 'brightly' with a high probability.
- With parameters θ₂, the model's distribution predicts 'darkly' with a high probability.
Which of the following statements correctly analyzes the relationship between the parameters and the loss function for this specific input?
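The contrast between the two parameter settings can be checked numerically. A minimal sketch, assuming a cross-entropy loss and hypothetical probability values for the distributions under θ₁ and θ₂:

```python
import math

def cross_entropy(target, predicted):
    """Cross-entropy loss: -sum over words w of p(w) * log p_theta(w)."""
    return -sum(p * math.log(predicted[w]) for w, p in target.items() if p > 0)

# Hypothetical target distribution: 'brightly' should follow the context
target = {"brightly": 0.9, "darkly": 0.05, "softly": 0.05}

# Under theta_1 the model favors 'brightly'; under theta_2 it favors 'darkly'
pred_theta1 = {"brightly": 0.8, "darkly": 0.1, "softly": 0.1}
pred_theta2 = {"brightly": 0.1, "darkly": 0.8, "softly": 0.1}

loss_theta1 = cross_entropy(target, pred_theta1)
loss_theta2 = cross_entropy(target, pred_theta2)
# loss_theta1 is lower: the theta_1 distribution agrees with the target
```

Because the θ₁ distribution concentrates mass where the target does, its loss is lower, which is the comparison the question asks you to reason about.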
Interpreting a Model's Training Step
Comparing Model Performance via Loss