Case Study

Analyzing Dual-Task Model Training Performance

An engineer is training a large language model with a dual-task objective: the total training loss is the sum of the losses from the two individual tasks, Task A (predicting randomly hidden words in a text) and Task B (determining whether two sentences appeared consecutively in the original text). Analyze the training log below and explain which task the model appears to be mastering more quickly. Justify your answer by referencing the trends in the loss values.
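One way to frame the analysis is to compute, for each task, how much of its initial loss has been eliminated so far: the task with the larger relative drop is the one being mastered faster. The sketch below illustrates this with hypothetical loss values (the actual training log is not reproduced here); the function names `total_loss` and `relative_drop` are illustrative, not part of any library.

```python
def total_loss(loss_a: float, loss_b: float) -> float:
    """Dual-task objective: the total loss is the sum of the two task losses."""
    return loss_a + loss_b

def relative_drop(losses: list[float]) -> float:
    """Fraction of the initial loss eliminated so far; larger = faster mastery."""
    return (losses[0] - losses[-1]) / losses[0]

# Hypothetical per-epoch losses for illustration only (NOT the case study's log).
task_a = [5.20, 3.10, 2.00, 1.40]      # e.g. hidden-word prediction loss
task_b = [0.69, 0.55, 0.48, 0.45]      # e.g. consecutive-sentence classification loss

print(total_loss(task_a[0], task_b[0]))            # combined loss at epoch 0
print(relative_drop(task_a), relative_drop(task_b))
```

Note that comparing raw loss values across tasks can mislead, since the two losses live on different scales (a binary classification loss starts near 0.69, while a vocabulary-sized prediction loss starts much higher), which is why a relative measure is used here.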


Updated 2025-10-08


Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science