Multiple Choice

A language model is being fine-tuned on a dataset of instruction-response pairs. Consider the following training example:

  • Input: What is the capital of France?
  • Correct Response: Paris

The model processes the input and must predict the first token of the response. Below are two potential probability distributions (States A and B) that the model could generate for this first token at different points during training.

  • State A: {'Paris': 0.15, 'London': 0.10, 'The': 0.08, ...}
  • State B: {'Paris': 0.25, 'London': 0.05, 'The': 0.04, ...}

Based on the standard objective for this type of training, which statement provides the most accurate analysis?
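
As a rough illustration of how the two states compare, the sketch below assumes the "standard objective" is token-level cross-entropy, i.e. the negative log-likelihood of the correct response token. The helper name `token_nll` is hypothetical, and only the probabilities listed above are used; the elided "..." entries are left out because they do not affect the loss on 'Paris'.

```python
import math

# Probabilities each state assigns to candidate first tokens.
# Only the entries shown in the question are included; the rest are elided there.
state_a = {"Paris": 0.15, "London": 0.10, "The": 0.08}
state_b = {"Paris": 0.25, "London": 0.05, "The": 0.04}


def token_nll(dist, target):
    """Negative log-likelihood (cross-entropy) of the target token.

    Hypothetical helper: the loss depends only on the probability
    assigned to the correct token, not on how the remaining mass
    is spread across incorrect tokens.
    """
    return -math.log(dist[target])


loss_a = token_nll(state_a, "Paris")
loss_b = token_nll(state_b, "Paris")

print(f"State A loss: {loss_a:.3f}")  # 1.897
print(f"State B loss: {loss_b:.3f}")  # 1.386
```

Under this assumption, the state that assigns higher probability to 'Paris' has the lower loss, regardless of what happens to 'London' or 'The'.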

Updated 2025-09-28

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy
