1Cademy - Evaluating Prompts with Pre-defined Metrics

Learn Before

Evaluation of Candidate Prompts in Prompt Search

Concept

Evaluating Prompts with Pre-defined Metrics

To assess the quality of a candidate prompt, use it to generate an output from a Large Language Model for a given input, then apply a pre-defined metric that scores how well that output performs on the specific task.
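The procedure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: `evaluate_prompt`, `exact_match`, and `toy_model` are hypothetical names introduced here, the model is a hard-coded stand-in for a real LLM call, and exact match is only one of many possible pre-defined metrics.

```python
from typing import Callable


def exact_match(prediction: str, reference: str) -> float:
    """Pre-defined metric (assumed for illustration): 1.0 if the
    normalized strings are equal, otherwise 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())


def evaluate_prompt(
    prompt_template: str,
    model: Callable[[str], str],
    metric: Callable[[str, str], float],
    task_input: str,
    reference: str,
) -> float:
    """Score one candidate prompt on one input."""
    # Step 1: combine the candidate prompt with the given input.
    full_prompt = prompt_template.format(input=task_input)
    # Step 2: generate an output from the model.
    output = model(full_prompt)
    # Step 3: apply the pre-defined metric to that output.
    return metric(output, reference)


def toy_model(full_prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; replace with a real client."""
    return "Paris" if "capital of France" in full_prompt else "unknown"


score = evaluate_prompt(
    prompt_template="Answer briefly: {input}",
    model=toy_model,
    metric=exact_match,
    task_input="What is the capital of France?",
    reference="paris",
)
print(score)
```

In a prompt search, the same function would typically be called for each candidate prompt over a set of inputs, with the scores averaged before candidates are compared.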

Updated 2026-04-30

References

Reference of Foundations of Large Language Models Course

Learn After

A team is developing a prompt for a Large Language Model to summarize medical research papers. Their primary goal is to ensure the generated summaries are factually accurate and do not misrepresent the findings of the original paper. To automate the evaluation of different prompts, they need to choose a single, pre-defined metric. Which of the following metrics would be the most appropriate for the team to use to assess their primary goal?
Critiquing an Automated Prompt Evaluation Setup
You are evaluating a single candidate prompt to see how well it instructs a language model on a specific task. Arrange the following steps into the correct sequence for assessing the prompt's performance on one given input using a pre-defined metric.
