Multiple Choice

A development team uses a 13-billion parameter language model to summarize legal documents. To ensure accuracy, they decide to use a separate, more powerful 70-billion parameter model to act as a verifier. The verifier model is prompted to check if the summary contains all key points from the original document. Which of the following represents the most critical evaluation challenge inherent in this 'LLM-as-verifier' strategy?

0

1

Updated 2025-10-06

Contributors are:

Who are from:

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science