Multiple Choice

A development team is tasked with aligning a new chatbot to be helpful and harmless. Instead of building a reward model from the ground up, they opt to use a large, state-of-the-art, publicly available language model to score the chatbot's responses. What is the primary reason this 'off-the-shelf' strategy is often highly effective?

Updated 2025-10-03

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science