Case Study

Critique and Revision of an Evaluation Prompt

A company uses human evaluators to rate the quality of two different AI-generated responses to a customer's message. Analyze the evaluator instructions provided below. Identify two significant weaknesses in these instructions and propose a specific revision to address each one. Justify why your proposed changes would lead to more reliable and useful evaluations.

Updated 2025-10-04

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science