Learn Before
A research lab has developed a very large, powerful 'teacher' language model that excels at complex, multi-step reasoning tasks. They want to deploy this reasoning capability in a mobile application, which requires a much smaller, faster 'student' model. Using the principles of knowledge distillation, what would be the most effective training objective for the student model to ensure it learns the reasoning process of the teacher, not just the final answers?
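One common way to frame such an objective is token-level distillation over the teacher's full reasoning traces: the student is trained to match the teacher's temperature-softened output distribution at every step of the chain of thought, not only at the final answer token. The sketch below is a minimal, illustrative implementation of that idea; the function names and the temperature value are assumptions for demonstration, not a prescribed recipe.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax; higher temperature flattens the
    distribution and exposes the teacher's relative token preferences."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean token-level KL(teacher || student) over a reasoning trace.

    Both arrays have shape (sequence_length, vocab_size). Averaging the
    KL divergence across *all* positions in the teacher's step-by-step
    trace makes the student imitate the reasoning process itself, rather
    than only reproducing the final-answer token.
    """
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=-1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return float(np.mean(kl)) * temperature**2
```

In practice the student's logits would come from a forward pass of the small model and the teacher's from the large model (or from cached teacher generations), with this loss minimized by gradient descent; the loss is zero exactly when the student's distribution matches the teacher's at every position.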
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Deploying a Computationally-Intensive Reasoning Model
Evaluating a Model Compression Strategy for Reasoning