Multiple Choice

A development team aims to align a large language model with the complex value of 'being helpful'. Their strategy is to create a high-quality dataset of 50,000 question-and-answer pairs where the model's response is rated as 'very helpful' by human annotators. They then fine-tune the model with the sole objective of maximizing its ability to reproduce these exact 'very helpful' answers. Which statement best evaluates the fundamental limitation of this data-fitting approach for achieving the team's goal?
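
As a rough illustration of the objective described in the question, fine-tuning to reproduce reference answers is commonly framed as minimizing the negative log-likelihood of the reference tokens. The sketch below is a toy under stated assumptions, not the team's actual training code: the function names `sequence_nll` and `sft_loss` and the per-token probabilities are hypothetical, and the model is stood in for by a list of probabilities it is assumed to assign to each reference token.

```python
import math

def sequence_nll(token_probs):
    """Negative log-likelihood of one reference answer.

    token_probs: probabilities the (hypothetical) model assigns to each
    token of the annotator-approved answer, in order.
    """
    return -sum(math.log(p) for p in token_probs)

def sft_loss(dataset_token_probs):
    """Average negative log-likelihood over all reference answers."""
    total = sum(sequence_nll(tp) for tp in dataset_token_probs)
    return total / len(dataset_token_probs)

# Hypothetical probabilities for two reference answers of three tokens each.
confident = [[0.9, 0.9, 0.9], [0.8, 0.9, 0.95]]
uncertain = [[0.5, 0.4, 0.6], [0.3, 0.5, 0.5]]

print(sft_loss(confident))
print(sft_loss(uncertain))
```

In this sketch the loss is computed only from the probabilities placed on the reference tokens themselves; no separate measure of helpfulness enters the objective.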

Updated 2025-10-05

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science