Designing a Performance Metric for Summarization Prompts
Imagine you are building an automated system to find the most effective prompt for a language model that summarizes complex scientific papers for a high school audience. The goal is to produce summaries that are both accurate and easy to understand. Describe a concrete, automatable method your system could use to score the quality of summaries generated by different prompts. What specific, measurable criteria would your scoring method rely on?
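One possible answer can be sketched in code. The sketch below is illustrative, not the canonical method: it scores a summary by combining a keyword-coverage proxy for accuracy (against expert-chosen key terms) with the standard Flesch-Kincaid grade-level formula as a readability check against a high-school target. The function names (`score_summary`, `keyword_coverage`) and the weighting are assumptions for demonstration.

```python
import re

def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level:
    0.39*(words/sentence) + 11.8*(syllables/word) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    def syllables(word):
        # Heuristic: count groups of consecutive vowels.
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))
    total_syllables = sum(syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (total_syllables / len(words)) - 15.59)

def keyword_coverage(summary, key_terms):
    """Fraction of key terms mentioned -- a crude accuracy proxy."""
    text = summary.lower()
    return sum(term.lower() in text for term in key_terms) / len(key_terms)

def score_summary(summary, key_terms, target_grade=10.0):
    """Reward term coverage; penalize distance from the target reading level."""
    coverage = keyword_coverage(summary, key_terms)
    grade_penalty = abs(flesch_kincaid_grade(summary) - target_grade) / 10.0
    return coverage - grade_penalty
```

Both criteria are fully automatable: neither requires a human in the loop, so the system can rank thousands of candidate prompts by the summaries they produce.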
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Creation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Example of Performance Estimation in Prompt Optimization
An engineering team is building an automated system to discover the most effective instructions for a language model to generate Python code. The system generates thousands of instruction variations and needs a way to determine which one is the best. The team observes that the system often selects instructions that produce code that looks syntactically correct but fails to run due to subtle errors. Which part of this automated discovery process is most likely the source of this issue?
Designing a Performance Metric for Summarization Prompts
Critiquing a Flawed Prompt Evaluation Method
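The code-generation question above points at a performance metric that rewards syntactic plausibility without ever running the code. A minimal sketch of the fix is an execution-based scorer: actually execute each candidate and count how many unit tests pass. The function name `execution_score` and the test-case format are illustrative assumptions; a real system would run candidates in an isolated sandbox rather than calling `exec` directly.

```python
def execution_score(generated_code, test_cases):
    """Score candidate code by running it against unit tests,
    rather than only checking that it parses."""
    namespace = {}
    try:
        exec(generated_code, namespace)  # load the candidate (sandbox in practice!)
    except Exception:
        return 0.0  # code that fails to even load scores zero
    passed = 0
    for call, expected in test_cases:
        try:
            if eval(call, namespace) == expected:
                passed += 1
        except Exception:
            pass  # runtime errors count as failed tests
    return passed / len(test_cases)
```

Under this metric, a subtly broken candidate that merely looks correct no longer wins the search, because its score drops with every failing test.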