Multiple Choice

A research team fine-tunes two identical large language models. Model A is fine-tuned exclusively on 100,000 examples of text summarization, each presented as an instruction. Model B is fine-tuned on a dataset of the same total size (100,000 examples), but this dataset is a mix of summarization, translation, and question-answering tasks, all framed as instructions. When tested on a completely new task—sentiment analysis—Model B performs significantly better than Model A, which fails almost completely. What is the most likely reason for Model B's superior ability to generalize to the new task?
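To make the contrast concrete, here is a minimal sketch in Python of how the two fine-tuning datasets might be constructed. The corpora, the `format_example` helper, and the example records are all hypothetical illustrations, not the team's actual setup; the point is that Model B's examples share no task-specific pattern, only the instruction-following format itself.

```python
import random

# Hypothetical raw examples; in practice these would be loaded from task corpora.
SUMMARIZATION = [{"instruction": "Summarize the following article.",
                  "input": "Researchers fine-tuned two identical models...",
                  "output": "Two identical models were fine-tuned on different data."}]
TRANSLATION = [{"instruction": "Translate the following sentence into French.",
                "input": "The weather is nice today.",
                "output": "Il fait beau aujourd'hui."}]
QA = [{"instruction": "Answer the question using the passage.",
       "input": "Passage: The Nile flows north. Question: Which way does it flow?",
       "output": "North."}]

def format_example(ex):
    """Render one record as a single instruction-following training string."""
    return f"Instruction: {ex['instruction']}\nInput: {ex['input']}\nOutput: {ex['output']}"

# Model A: 100,000 examples, all drawn from a single task.
model_a_data = [format_example(random.choice(SUMMARIZATION)) for _ in range(100_000)]

# Model B: the same total size, but a mixture of tasks. The only regularity
# shared by every example is "read the instruction, then satisfy it".
mixture = SUMMARIZATION + TRANSLATION + QA
model_b_data = [format_example(random.choice(mixture)) for _ in range(100_000)]
```

Under this setup, the one pattern common to all of Model B's training examples is mapping an arbitrary instruction to an appropriate response, so instruction-following itself becomes the learned skill and can transfer to an unseen task such as sentiment analysis; Model A, by contrast, has only ever learned to summarize.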


Updated 2025-10-05


Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy
