Case Study

Model Parameter Selection via Likelihood

A language model is being trained on a small dataset consisting of two sequences. You are evaluating two candidate sets of model parameters, $\theta_A$ and $\theta_B$. The log-probability the model assigns to each sequence under each parameter set is shown in the table below. Following the principle of selecting the parameters that maximize the total log-probability over the entire dataset, which parameter set should be chosen? Justify your answer by showing the calculation for each set.
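The selection rule described above can be sketched in a few lines of Python. The values in `log_probs` are hypothetical placeholders, not the values from the table, and the names `total_log_prob` and `select_parameters` are illustrative only. The sketch assumes the two sequences are treated as independent, so the total log-probability of the dataset is the sum of the per-sequence log-probabilities.

```python
# Hypothetical per-sequence log-probabilities for each parameter set.
# These are placeholder numbers, NOT the values from the case study's table.
log_probs = {
    "theta_A": [-2.0, -3.5],  # log Pr(seq 1), log Pr(seq 2) under theta_A
    "theta_B": [-2.5, -2.5],  # log Pr(seq 1), log Pr(seq 2) under theta_B
}


def total_log_prob(seq_log_probs):
    """Sum per-sequence log-probabilities to get the dataset log-probability."""
    return sum(seq_log_probs)


def select_parameters(table):
    """Return the parameter set with the highest total log-probability,
    together with the totals for every candidate."""
    totals = {name: total_log_prob(values) for name, values in table.items()}
    best = max(totals, key=totals.get)
    return best, totals


best, totals = select_parameters(log_probs)
print(totals)  # total log-probability per parameter set
print(best)    # name of the selected parameter set
```

With these placeholder numbers the totals are -5.5 for `theta_A` and -5.0 for `theta_B`, so `theta_B` would be selected because its total is closer to zero. Substituting the table's actual values gives the answer to the question itself.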

Updated 2025-10-04

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Ch.2 Generative Models - Foundations of Large Language Models

Ch.4 Alignment - Foundations of Large Language Models

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science