Learn Before
Multiple Choice

A development team creates a prompt for a text summarization task and meticulously refines it for 'Language Model Alpha', achieving a 95% performance score. When they deploy the exact same prompt on 'Language Model Beta', a different model of comparable overall capability, the performance score drops to 70%. What is the most likely reason for this discrepancy?
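The scenario above reflects prompt sensitivity: a prompt tuned for one model's quirks (phrasing, formatting expectations, instruction-following behavior) does not automatically transfer to another model, so prompts generally need re-evaluation and re-tuning per model. A minimal sketch of such a per-model evaluation, using stand-in model functions and a hypothetical format check (no real model APIs involved):

```python
# Hypothetical sketch: the same prompt can score differently across models,
# so it should be re-evaluated (and often re-tuned) for each model.
# The "models" and the pass/fail check below are illustrative stand-ins.

PROMPT = "Summarize the following article in three sentences:\n{article}"

def evaluate(model_fn, prompt, test_cases):
    """Return the fraction of test cases whose output passes its check."""
    passed = 0
    for article, check in test_cases:
        summary = model_fn(prompt.format(article=article))
        if check(summary):
            passed += 1
    return passed / len(test_cases)

# Stand-in models: each responds to the identical prompt differently.
def model_alpha(text):
    # Follows the three-sentence constraint.
    return "First key point. Second key point. Third key point."

def model_beta(text):
    # Ignores the length constraint the prompt was tuned to elicit.
    return "Point one. Point two. Point three. Point four. Point five."

# A crude automatic check: at most three sentences in the summary.
test_cases = [("some article text", lambda s: s.count(".") <= 3)]

score_alpha = evaluate(model_alpha, PROMPT, test_cases)
score_beta = evaluate(model_beta, PROMPT, test_cases)
# Alpha passes the check; Beta fails it, despite receiving the same prompt.
```

The point is not the toy check itself but the workflow: a prompt's score is a property of the (prompt, model) pair, so deployment on a new model calls for a fresh evaluation run.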

Updated 2025-10-02

Tags

Ch.3 Prompting - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science