1Cademy - A research team is training a language model to generate helpful and harmless dialogue responses. They define a utility function for a given input `x` and a generated response `y` as: `U(x, y) = (0.8 * Helpfulness_Score) - (0.2 * Harmfulness_Score)`. The team's objective is to find the model parameters, `θ`, that maximize the average utility across a large dataset of interactions. Which of the following loss functions, `L(θ)`, should the team minimize to achieve this objective?

Learn Before

Language Model Loss as Negative Expected Utility

Multiple Choice

A research team is training a language model to generate helpful and harmless dialogue responses. They define a utility function for a given input x and a generated response y as: U(x, y) = (0.8 * Helpfulness_Score) - (0.2 * Harmfulness_Score). The team's objective is to find the model parameters, θ, that maximize the average utility across a large dataset of interactions. Which of the following loss functions, L(θ), should the team minimize to achieve this objective?
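
The idea named in the prerequisite, loss as negative expected utility, can be illustrated numerically: maximizing the average of U(x, y) is the same as minimizing L(θ) = -(average of U(x, y)). The following is a minimal sketch under stated assumptions, not the team's actual training code. The function names `utility` and `loss`, the score range [0, 1], and the sample score pairs are all hypothetical and chosen only for illustration.

```python
from statistics import mean

def utility(helpfulness: float, harmfulness: float) -> float:
    """U(x, y) as defined in the question, for one scored response."""
    return 0.8 * helpfulness - 0.2 * harmfulness

def loss(scored_responses: list[tuple[float, float]]) -> float:
    """Negative average utility over a batch of (helpfulness, harmfulness) pairs.

    Minimizing this value is equivalent to maximizing the average utility.
    """
    return -mean([utility(h, m) for h, m in scored_responses])

# Hypothetical scores (assumed to lie in [0, 1]) for responses from two models.
batch_a = [(0.9, 0.1), (0.8, 0.0)]  # mostly helpful, rarely harmful
batch_b = [(0.4, 0.6), (0.5, 0.5)]  # less helpful, more harmful

# The batch with higher average utility has the lower loss.
print(loss(batch_a) < loss(batch_b))
```

In an actual training setup the scores would depend on the model parameters θ through the generated responses, so lowering this loss means steering θ toward responses with higher utility.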

Updated 2025-10-02
