Multiple Choice

An engineer is training a compact 'student' model to replicate the behavior of a larger 'teacher' model. The training process aims to minimize a loss function that measures the difference between the output probability distributions of the two models for any given input. If the loss value remains high throughout the training, what is the most direct conclusion?

0

1

Updated 2025-10-04

Contributors are:

Who are from:

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Ch.3 Prompting - Foundations of Large Language Models

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science