Multiple Choice

A research team is training a language model to generate helpful and harmless dialogue responses. They define a utility function for a given input x and a generated response y as: U(x, y) = (0.8 * Helpfulness_Score) - (0.2 * Harmfulness_Score). The team's objective is to find the model parameters, θ, that maximize the average utility across a large dataset of interactions. Which of the following loss functions, L(θ), should the team minimize to achieve this objective?
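One way to read the objective: maximizing the average utility is the same as minimizing its negative, so a natural candidate loss is the negated mean of U(x, y) over the dataset. The sketch below illustrates only that relationship. The function names `utility` and `loss` and the sample scores are hypothetical, and it assumes the helpfulness and harmfulness scores are already available as plain numbers; in actual training they would depend on θ through the responses the model generates.

```python
from statistics import fmean


def utility(helpfulness_score: float, harmfulness_score: float) -> float:
    """U(x, y) as defined in the question: 0.8 * helpfulness - 0.2 * harmfulness."""
    return 0.8 * helpfulness_score - 0.2 * harmfulness_score


def loss(scored_responses: list[tuple[float, float]]) -> float:
    """Negative average utility over (helpfulness, harmfulness) pairs.

    Minimizing this value is equivalent to maximizing the average utility.
    """
    return -fmean([utility(h, r) for h, r in scored_responses])


# Hypothetical scores for two (x, y) interactions, for illustration only.
batch = [(0.9, 0.1), (0.5, 0.5)]
print(loss(batch))  # roughly -0.5: average utility is about 0.5
```

Raising helpfulness or lowering harmfulness increases the average utility and therefore lowers this loss, which is the direction a minimizer would push the parameters.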


Updated 2025-10-02

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science