Multiple Choice

A research team is developing a system to generate summaries of scientific articles. They are concerned that the quality of the summary is highly sensitive to the specific phrasing of the instruction given to the language model. They compare two methods to address this sensitivity:

  • Method A: The team manually creates 10 different, high-quality instructions, generates a summary for each, and then averages the results to produce a final summary.
  • Method B: The team uses a model that treats the instruction as a random variable and integrates over the distribution of all possible instructions to produce a single, final summary.

Based on these descriptions, which method is inherently more robust against variations in instruction phrasing, and why?
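
To make the two descriptions concrete, here is a minimal toy sketch, not an implementation of either team's system. It assumes a finite set of 1,000 hypothetical instructions, a made-up prior over them, and a made-up probability that each instruction yields one fixed candidate summary. Averaging text summaries directly is not well defined, so the sketch averages that probability instead. All names (make_toy_model, method_a, method_b) are illustrative only.

```python
import random

def make_toy_model(num_instructions=1000, seed=0):
    """Build a made-up prior p(c) and likelihood p(y | x, c) for a toy instruction space."""
    rng = random.Random(seed)
    weights = [rng.random() for _ in range(num_instructions)]
    total = sum(weights)
    prior = [w / total for w in weights]  # p(c), sums to 1
    likelihood = [rng.random() for _ in range(num_instructions)]  # p(y | x, c)
    return prior, likelihood

def method_a(likelihood, chosen):
    """Method A: uniform average over a small, hand-picked set of instructions."""
    return sum(likelihood[c] for c in chosen) / len(chosen)

def method_b(prior, likelihood):
    """Method B: marginalize over every instruction, weighted by its prior."""
    return sum(p * l for p, l in zip(prior, likelihood))

prior, likelihood = make_toy_model()

# Five different teams each pick their own 10 instructions (assumed random here).
picker = random.Random(1)
estimates_a = [
    method_a(likelihood, picker.sample(range(len(prior)), 10))
    for _ in range(5)
]

# Method B does not depend on any particular choice of instructions.
estimate_b = method_b(prior, likelihood)

print("Method A estimates:", [round(e, 3) for e in estimates_a])
print("Method B estimate: ", round(estimate_b, 3))
```

In this toy setting the Method A value changes with the 10 instructions chosen, while the Method B value is fixed once the prior and the model are fixed. In practice the integral in Method B is rarely computable exactly and would itself be approximated.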

Updated 2025-09-28

Tags

Ch.3 Prompting - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science