Formula

Loss Function for Conditional Probability Distributions: $\mathrm{Loss}(\mathrm{Pr}^t(\cdot|\cdot), \mathrm{Pr}_\theta^s(\cdot|\cdot), \mathbf{x})$

This formula represents a generic loss function used to train a model. It calculates the discrepancy between a target conditional probability distribution, denoted $\mathrm{Pr}^t(\cdot|\cdot)$, and a parameterized model's predicted distribution, $\mathrm{Pr}_\theta^s(\cdot|\cdot)$, for a given input $\mathbf{x}$. The goal of training is typically to adjust the parameters $\theta$ to minimize this loss, thereby making the model's distribution $\mathrm{Pr}_\theta^s$ as close as possible to the target distribution $\mathrm{Pr}^t$. This framework is common in tasks like knowledge distillation, where a 'student' model ($s$) learns from a 'teacher' model ($t$).
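As a minimal sketch of one common instantiation, the loss can be taken to be the KL divergence from the student's distribution to the teacher's. The function name `loss` and the direct passing of probability vectors (rather than conditioning on $\mathbf{x}$ explicitly) are illustrative assumptions, not the book's definition:

```python
import numpy as np

def loss(pr_t, pr_s_theta):
    """One common choice for Loss(Pr^t, Pr_theta^s, x):
    the KL divergence KL(Pr^t || Pr_theta^s) between the
    teacher's and student's conditional distributions for
    a given input x (passed here as probability vectors)."""
    pr_t = np.asarray(pr_t, dtype=float)
    pr_s = np.asarray(pr_s_theta, dtype=float)
    eps = 1e-12  # guard against log(0)
    return float(np.sum(pr_t * (np.log(pr_t + eps) - np.log(pr_s + eps))))

teacher = [0.7, 0.2, 0.1]
student = [0.5, 0.3, 0.2]
print(loss(teacher, teacher))  # identical distributions -> 0.0
print(loss(teacher, student))  # mismatch -> positive loss
```

Minimizing this quantity over $\theta$ (e.g. by gradient descent on the student's parameters) drives $\mathrm{Pr}_\theta^s$ toward $\mathrm{Pr}^t$; cross-entropy is an equally common choice, differing from KL only by a term constant in $\theta$.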


Updated 2025-10-08


Tags

Ch.3 Prompting - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences