Multiple Choice

A development team fine-tunes a large language model on a custom-built dataset of 50,000 technical support chat logs to improve its ability to resolve customer issues. The fine-tuned model achieves near-perfect accuracy on a test set composed of 5,000 additional logs from the same original source. However, when deployed to handle live customer chats, which include new and unforeseen types of user problems, the model's performance is significantly worse. Based on this scenario, which challenge associated with this improvement method is the most probable cause for the performance drop?

0

1

Updated 2025-09-26

Contributors are:

Who are from:

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science