Case Study

Analyzing Model Training Loss

A language model is being trained on a binary classification task: deciding whether sentence B is the actual sentence that follows sentence A. Consider two different training examples and the probability the model assigns to the correct label in each case. Under the standard negative log-likelihood loss function used for such tasks, which example results in the higher loss value, and why?
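The comparison can be made concrete with a minimal sketch. The two probabilities below are hypothetical placeholders, since the case study's actual prediction values are not reproduced here, and the helper name `nll_loss` is also an assumption made for illustration. For a single example, the negative log-likelihood is the negative natural log of the probability the model assigns to the correct label, so the example with the lower probability gets the higher loss.

```python
import math


def nll_loss(p_correct: float) -> float:
    """Negative log-likelihood for one example.

    p_correct is the probability the model assigns to the correct label.
    """
    if not 0.0 < p_correct <= 1.0:
        raise ValueError("p_correct must be in the interval (0, 1]")
    return -math.log(p_correct)


# Hypothetical values, NOT taken from the case study:
# the model is fairly confident on example 1 and mostly wrong on example 2.
p_example_1 = 0.9
p_example_2 = 0.3

loss_1 = nll_loss(p_example_1)
loss_2 = nll_loss(p_example_2)

print(f"example 1: p = {p_example_1}, loss = {loss_1:.4f}")
print(f"example 2: p = {p_example_2}, loss = {loss_2:.4f}")
print("higher loss:", "example 2" if loss_2 > loss_1 else "example 1")
```

Because the negative log grows without bound as the probability approaches 0, a confident wrong prediction is penalized much more heavily than a mildly uncertain correct one.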

Updated 2025-10-03

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science