Learn Before
  • Loss Function for Conditional Probability Distributions (Loss(Pr^t(·|·), Pr_θ^s(·|·), x))

Case Study

Interpreting a Model's Training Step

Analyze the following training scenario for a language model and explain the next step in the optimization process.


Updated 2025-10-04

Contributors:

Gemini AI

Affiliations:

Google

Tags

Ch.3 Prompting - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science

Related
  • A language model is being trained to predict the next word in a sentence. For the input context 'The sun is shining...', the ideal (target) probability distribution, denoted as Pr^t, gives a high probability to the word 'brightly'. The model's performance is measured by a loss function that compares the model's predicted probability distribution, Pr_θ^s, to the target distribution.

    Consider two different sets of model parameters, θ₁ and θ₂:

    • With parameters θ₁, the model's distribution Pr_{θ₁}^s predicts 'brightly' with a high probability.
    • With parameters θ₂, the model's distribution Pr_{θ₂}^s predicts 'darkly' with a high probability.

    Which of the following statements correctly analyzes the relationship between the parameters and the loss function for this specific input?

  • Interpreting a Model's Training Step

  • Comparing Model Performance via Loss
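The comparison the scenario asks for can be sketched numerically. This is a minimal illustration, assuming the loss Loss(Pr^t, Pr_θ^s, x) is instantiated as cross-entropy and using hypothetical probability values for the two parameter settings; the vocabulary and numbers are invented for the example, not taken from the course.

```python
import math

def cross_entropy_loss(target_dist, model_dist):
    """Cross-entropy between the target distribution Pr^t and the
    model's predicted distribution Pr_θ^s over next words."""
    return -sum(p * math.log(model_dist[w])
                for w, p in target_dist.items() if p > 0)

# Target for the context "The sun is shining...": all mass on "brightly".
target = {"brightly": 1.0, "darkly": 0.0}

# Hypothetical model distributions:
# θ₁ puts high probability on "brightly"; θ₂ puts it on "darkly".
model_theta1 = {"brightly": 0.9, "darkly": 0.1}
model_theta2 = {"brightly": 0.1, "darkly": 0.9}

loss1 = cross_entropy_loss(target, model_theta1)  # -ln(0.9) ≈ 0.105
loss2 = cross_entropy_loss(target, model_theta2)  # -ln(0.1) ≈ 2.303

# θ₁ achieves the lower loss, so an optimizer minimizing this loss
# would move θ₂'s parameters toward assigning more mass to "brightly".
print(loss1 < loss2)  # True
```

Because cross-entropy penalizes probability mass assigned away from the target word, gradient descent on this loss pushes the parameters toward distributions like Pr_{θ₁}^s and away from ones like Pr_{θ₂}^s.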
