Multiple Choice

A language model generates a three-segment response to a user's prompt. A separate reward model evaluates each segment, considering the full context of the prompt and the complete response, and assigns the following scores: Segment 1: 0.8, Segment 2: -0.3, Segment 3: 0.5. According to the principle of aggregating segment-based scores, what is the total reward for the entire generated response?

0

1
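
The arithmetic behind the question can be sketched in a few lines. This is a minimal illustration, assuming that "aggregating segment-based scores" means plain summation of the per-segment rewards; the function name `total_reward` and the variable `segment_scores` are hypothetical and do not come from the source. Under that assumption, the total is 0.8 - 0.3 + 0.5 = 1.0.

```python
# Per-segment scores as stated in the question.
segment_scores = [0.8, -0.3, 0.5]


def total_reward(scores):
    """Aggregate segment-level rewards by plain summation (assumed rule)."""
    return sum(scores)


reward = total_reward(segment_scores)
# Round for display, since floating-point sums can carry tiny errors.
print(round(reward, 6))
```

A negative segment score lowers the total without cancelling the contribution of the other segments, which is why the response as a whole still receives a positive reward.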

Updated 2025-10-02

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science