Case Study

Selecting a Training Method for a Summarization Model

A data scientist is training a model to generate concise summaries of multi-sentence news articles. The training process involves showing the model a 'damaged' version of an article and training it to reconstruct the original article. The data scientist is considering two different methods for 'damaging' the input articles:

  • Method 1: Randomly reordering the sentences within each article, but keeping the words within each sentence in their original order.
  • Method 2: Keeping the sentences in their original order, but randomly replacing 15% of the individual words throughout the article with a special placeholder symbol.
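The two corruption schemes can be sketched in a few lines of code. This is a minimal illustration, not part of the case study itself: the function names and the `[MASK]` placeholder are chosen for clarity (Method 2's 15% placeholder replacement resembles BERT-style token masking, while Method 1 resembles the sentence-permutation noise used in BART-style pre-training).

```python
import random

def corrupt_by_sentence_shuffle(sentences):
    """Method 1: randomly reorder the sentences of an article,
    keeping the words inside each sentence in their original order."""
    shuffled = list(sentences)
    random.shuffle(shuffled)
    return shuffled

def corrupt_by_token_masking(words, mask_rate=0.15, mask_token="[MASK]"):
    """Method 2: keep sentence order, but replace roughly `mask_rate`
    of the individual words with a placeholder symbol."""
    return [mask_token if random.random() < mask_rate else w for w in words]

# Example: the model would be trained to map the corrupted output
# of either function back to the original, uncorrupted input.
article_sentences = ["The mayor spoke today.", "She announced a new park.", "Construction starts in May."]
article_words = " ".join(article_sentences).split()
```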

Analyze the two methods. Which method is more likely to train a model that is better at the final goal of summarization, and why? Your explanation should compare how each method helps the model learn different aspects of language.

Updated 2025-10-01

Tags

Ch.1 Pre-training - Foundations of Large Language Models