Short Answer

Analyzing Fine-Tuning Methodologies

A research team is fine-tuning a language model to summarize news articles. The model is trained to first extract key sentences from the article and then generate a summary based on those sentences. The fine-tuning process provides a reward signal based on two criteria: (1) the factual accuracy of the final summary, measured against a human-written reference summary, and (2) whether the intermediate sentences the model extracted match a predefined list of 'golden' key sentences. Based on the principles of fine-tuning, explain why this approach is not a purely outcome-based method.
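To make the two-part reward concrete, here is a minimal sketch of how such a signal might be computed. The question does not say how the two criteria are scored or combined, so the function name `combined_reward`, the `process_weight` parameter, the precision-style match score, and the weighted sum are all illustrative assumptions, not the research team's actual method.

```python
def combined_reward(summary_accuracy, extracted, golden, process_weight=0.5):
    """Hypothetical reward mixing the two criteria from the question.

    summary_accuracy: score in [0, 1] for the final summary (criterion 1).
    extracted: sentences the model selected in its intermediate step.
    golden: the predefined 'golden' key sentences (criterion 2).
    process_weight: assumed mixing weight; not specified in the source.
    """
    golden_set = set(golden)
    if extracted:
        # Fraction of extracted sentences that appear in the golden list.
        matched = sum(1 for sentence in extracted if sentence in golden_set)
        intermediate_score = matched / len(extracted)
    else:
        intermediate_score = 0.0
    # Weighted sum of the final-summary term and the intermediate-step term.
    return (1 - process_weight) * summary_accuracy + process_weight * intermediate_score


golden = ["sentence A", "sentence B"]
reward_full_match = combined_reward(0.9, ["sentence A", "sentence B"], golden)
reward_partial_match = combined_reward(0.9, ["sentence A", "sentence C"], golden)
```

In this sketch the two calls pass the same final-summary score but different extracted sentences, so the second term of the sum is the only thing that differs between them.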

Updated 2025-10-06

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science