1Cademy - Evaluating Model Mimicry Performance

Learn Before

Function to Measure Differences Between Models

Case Study

Evaluating Model Mimicry Performance

Based on the provided outputs, which student model (A or B) is currently a better imitation of the teacher model for this specific input? Justify your reasoning by explaining how the dissimilarity function would likely interpret these distributions.
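The comparison this question asks for can be sketched in a few lines. The case study does not name the dissimilarity function, and the outputs it refers to are not reproduced on this page. The sketch below therefore assumes KL divergence from the teacher to the student, as the related knowledge distillation material suggests, and uses made-up 3-class distributions. The function name `kl_divergence` and all probability values are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) in nats for two discrete distributions over the same classes."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of classes")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0.0:  # classes with p_i = 0 contribute nothing to the sum
            # eps guards against division by zero when the student assigns 0 probability
            total += p_i * math.log(p_i / max(q_i, eps))
    return total


# Hypothetical outputs for one input; NOT the values from the case study.
teacher = [0.70, 0.20, 0.10]
student_a = [0.60, 0.25, 0.15]  # same ranking as the teacher, slightly flatter
student_b = [0.10, 0.20, 0.70]  # probability mass placed on the wrong class

score_a = kl_divergence(teacher, student_a)
score_b = kl_divergence(teacher, student_b)

# A lower score means the student's distribution is closer to the teacher's.
better = "A" if score_a < score_b else "B"
print(f"KL(teacher || A) = {score_a:.4f}")
print(f"KL(teacher || B) = {score_b:.4f}")
print(f"Closer imitation for this input: student {better}")
```

Under these assumed values, student A scores far lower than student B. The score penalizes a student most where the teacher assigns high probability and the student assigns little, so agreeing on the top class matters more than matching the tail.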

Updated 2025-10-07

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science

An engineering team is developing a compact, fast model to replicate the predictions of a much larger, more complex model for a 5-category classification task. They use a specific mathematical function to calculate a 'dissimilarity score' between the probability distributions produced by the two models for each input. A lower score indicates the outputs are more similar. After several training epochs, they observe the average dissimilarity score on a validation dataset has significantly decrease
A small, efficient model is being trained to emulate the behavior of a large, powerful model on a 3-category classification task. A mathematical function is used to calculate a 'dissimilarity score' between the probability distributions produced by the two models for a given input, where a higher score indicates a greater difference. For which of the following scenarios would this dissimilarity score be the highest?
Knowledge Distillation Loss using KL Divergence