True/False

In the context of training a large language model to generate creative stories, increasing the frequency of automated reward signals (e.g., a reward for every grammatically correct sentence) is always more effective than providing a single, holistic quality rating from a human expert at the end of the entire story.

0

1

Updated 2025-10-06

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science