Essay

Critique of Metric Selection for a Creative LLM

A team is developing a large language model designed to generate creative and original poetry. To evaluate its performance, they are primarily using an 'exact match' accuracy metric, which calculates the percentage of generated poems that are identical, word-for-word, to a pre-written set of reference poems. Critically evaluate the suitability of this metric for this specific application. Justify your reasoning by explaining the potential limitations of this approach and what characteristics a more appropriate metric might have.
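For concreteness, the metric described in the prompt can be sketched as follows. This is a minimal illustration, not the team's actual implementation. The function name `exact_match_accuracy`, the whitespace tokenization, and the reading of "identical to a reference set" as matching any one reference poem are all assumptions made here.

```python
def exact_match_accuracy(generated, references):
    """Percentage of generated poems identical, word for word, to a reference poem.

    Assumptions (not stated in the prompt): poems are plain strings, words are
    separated by whitespace, and a poem counts as a match if it equals any
    poem in the reference set.
    """
    if not generated:
        return 0.0
    # Compare word sequences so that spacing and line-break differences do not matter.
    reference_set = {tuple(poem.split()) for poem in references}
    matches = sum(1 for poem in generated if tuple(poem.split()) in reference_set)
    return 100.0 * matches / len(generated)


# Hypothetical data for illustration only.
references = [
    "the river keeps its silver secrets",
    "morning folds the night away",
]
generated = [
    "the river keeps its silver secrets",   # verbatim copy of a reference
    "the river guards its silver secrets",  # one word changed
    "a lantern hums above the harbor",      # entirely new poem
]

score = exact_match_accuracy(generated, references)
print(f"Exact match accuracy: {score:.1f}%")
```

Under these assumptions, only the verbatim copy counts toward the score. The one-word variation and the entirely new poem both count as failures.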

Updated 2025-10-03

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science