Multiple Choice

A development team is refining a language model to generate more helpful responses. They have a collection of user prompts but lack a corresponding set of 'gold standard' correct answers. However, they do have an automated system that can assign a numerical 'helpfulness' score to any response the model generates for a given prompt. To improve the model, the team needs to define a loss function for this training phase. Which of the following best describes the principle they should use to formulate this loss function?

0

1

Updated 2025-10-01

Contributors are:

Who are from:

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science