Case Study

Evaluating Model Performance on Different Samples

A language model is being fine-tuned on two different training samples. The model's goal is to minimize the negative log-likelihood loss for the correct output sub-sequence. Analyze the two scenarios below and determine on which sample the model performed worse. Justify your answer based on how the loss is calculated.
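As a minimal sketch of how such a comparison could be made, the snippet below computes the negative log-likelihood over the output positions only. The per-token probabilities, the masks, and the helper name `output_nll` are hypothetical values chosen for illustration; they are not the two scenarios of this case study.

```python
import math

def output_nll(token_probs, output_mask):
    """Sum -log p(token) over positions flagged as output (mask == 1).

    Positions with mask == 0 (e.g. the input/prompt part) are ignored.
    """
    return -sum(math.log(p) for p, m in zip(token_probs, output_mask) if m)

# Hypothetical probabilities the model assigns to each correct token.
# The first two positions are input tokens, the last two are output tokens.
sample_a_probs = [0.9, 0.8, 0.5, 0.5]
sample_b_probs = [0.2, 0.1, 0.8, 0.5]
mask = [0, 0, 1, 1]

loss_a = output_nll(sample_a_probs, mask)  # -(ln 0.5 + ln 0.5)
loss_b = output_nll(sample_b_probs, mask)  # -(ln 0.8 + ln 0.5)

# The sample with the higher output-only loss is the one the model did worse on.
worse = "A" if loss_a > loss_b else "B"
print(f"loss A = {loss_a:.3f}, loss B = {loss_b:.3f}, worse on sample {worse}")
```

In this made-up example the model assigns low probabilities to the input tokens of sample B, yet it performs worse on sample A, because only the output positions contribute to the loss.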

Updated 2025-10-08

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Ch.3 Prompting - Foundations of Large Language Models

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science