Language Model Loss as Negative Expected Utility
In the context of language modeling, the loss function can be defined as the negative expectation of a utility function U(x, y; θ): L(θ) = -E_{(x,y)~D}[U(x, y; θ)]. The objective is to find the parameters that minimize this loss: θ̃ = arg min_θ L(θ). Here, the expectation is taken over pairs of inputs x and outputs y sampled from a dataset or distribution D. Minimizing this loss is equivalent to maximizing the expected utility provided by the model.
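The definition above can be sketched numerically. In this minimal Python sketch, the empirical mean over a sample of (x, y) pairs stands in for the expectation over D; the function names and the toy utility are hypothetical, chosen only to illustrate the sign convention.

```python
def negative_expected_utility(theta, pairs, utility_fn):
    """Loss L(theta) = -E_{(x,y)~D}[ U(x, y; theta) ].

    `pairs` is a finite sample of (x, y) drawn from the dataset D;
    the empirical mean approximates the expectation.
    """
    utilities = [utility_fn(x, y, theta) for (x, y) in pairs]
    return -sum(utilities) / len(utilities)

# Hypothetical utility: rewards outputs y that are close to theta * x
# (maximal utility 0 when the model fits exactly).
def toy_utility(x, y, theta):
    return -(y - theta * x) ** 2

pairs = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

# theta = 2 fits every pair, so expected utility is maximal (0)
# and the loss reaches its minimum (0); theta = 1 does worse.
print(negative_expected_utility(2.0, pairs, toy_utility))
print(negative_expected_utility(1.0, pairs, toy_utility))  # 14/3, a higher loss
```

Because the loss is the negated expected utility, any parameter change that raises the average utility lowers the loss, which is exactly the equivalence stated above.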

Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
A2C Actor Loss Function
Optimal Reward Model Parameter Estimation
Fine-Tuning Objective Function
Denoising Autoencoder Training Objective
Language Model Loss as Negative Expected Utility
MLM Training Objective using Cross-Entropy Loss
Training Objective as Loss Minimization over a Dataset
A machine learning model's performance is evaluated using a loss function, L(θ), where θ represents the model's parameters. A lower loss value indicates better performance. The training objective is to find the optimal parameters, θ̃, using the formula: θ̃ = arg min_θ L(θ). Given the following loss values for different parameter settings: L(θ=1) = 0.8, L(θ=2) = 0.3, L(θ=3) = 0.1, L(θ=4) = 0.5. Which statement correctly interprets the training objective?
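As a worked check of the arg min in the question above, the optimal parameter can be read off directly from the given loss table (the dictionary below simply transcribes those values):

```python
# L(theta) for the four candidate parameter settings from the question.
losses = {1: 0.8, 2: 0.3, 3: 0.1, 4: 0.5}

# theta_tilde = arg min_theta L(theta): the setting with the lowest loss.
theta_tilde = min(losses, key=losses.get)
print(theta_tilde, losses[theta_tilde])  # 3 0.1
```

So the training objective selects θ̃ = 3, the parameter setting with the lowest loss (0.1), not the one with the highest loss value.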
A data scientist trains two models, Model X and Model Y, on the same dataset for the same task. The training objective for each is to find the set of parameters, θ, that minimizes a loss function, L(θ), according to the principle θ̃ = arg min_θ L(θ). After training, the results are as follows:
- For Model X, the lowest achieved loss is 50, using parameters θ_X.
- For Model Y, the lowest achieved loss is 100, using parameters θ_Y.
Based only on this information and the definition of the training objective, what is the most valid conclusion?
Evaluating a Training Conclusion
Learn After
Policy Gradient Utility for Sequence Generation
A research team is training a language model to generate helpful and harmless dialogue responses. They define a utility function for a given input x and a generated response y as: U(x, y) = (0.8 * Helpfulness_Score) - (0.2 * Harmfulness_Score). The team's objective is to find the model parameters, θ, that maximize the average utility across a large dataset of interactions. Which of the following loss functions, L(θ), should the team minimize to achieve this objective?
A machine learning model is being trained with the objective of maximizing a specific utility function, U(x, y; θ), which measures the quality of its outputs. The loss function used for training is defined as L(θ) = E_{(x,y)~D}[U(x, y; θ)]. True or False: Minimizing this loss function L(θ) will successfully train the model to achieve its objective.
Diagnosing a Flawed Training Objective
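Why the sign matters can be seen in a one-dimensional toy example (the utility below is entirely hypothetical, not from the course): minimizing L(θ) = E[U] performs gradient descent on the utility itself and drives it down, while minimizing L(θ) = -E[U] amounts to gradient ascent on the utility.

```python
# Toy utility U(theta) = -(theta - 5)^2, maximized at theta = 5.
def utility(theta):
    return -(theta - 5.0) ** 2

def grad_utility(theta):
    return -2.0 * (theta - 5.0)

flawed = 0.0   # minimizes L = E[U]  -> steps *down* the utility gradient
correct = 0.0  # minimizes L = -E[U] -> steps *up* the utility gradient
for _ in range(100):
    flawed -= 0.1 * grad_utility(flawed)    # utility keeps falling
    correct += 0.1 * grad_utility(correct)  # converges toward theta = 5

print(round(correct, 3))  # 5.0, the utility maximizer
print(utility(flawed) < utility(0.0))  # True: the flawed run made things worse
```

This is the diagnosis the True/False card is after: without the leading minus sign, minimizing L(θ) = E[U] actively reduces the utility, so the model moves away from its objective.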