Multiple Choice

A research team is trying to find the most effective instruction to guide a large language model in summarizing legal documents. They first try asking the model to rate a list of 100 different candidate instructions on a scale of 1-10 for 'clarity and effectiveness'. They then discover that the model's ratings do not correlate well with which instructions actually produce the best summaries when tested. Furthermore, the process of generating and evaluating summaries for all 100 instructions is taking several days and consuming their entire computation budget. Which statement best analyzes the fundamental difficulties the team is facing?

Updated 2025-09-26

Tags

Ch.3 Prompting - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science