Case Study

Evaluating Model Training Progress

A machine learning engineer is pre-training a language model on a dataset. They evaluate the model's performance at two different stages of training (after 100 epochs and after 200 epochs) using a representative sample of three sequences from the dataset. The loss for each sequence is recorded in the table below.

Sequence ID | Loss at 100 Epochs | Loss at 200 Epochs
Sequence A  | 5.2                | 4.1
Sequence B  | 6.8                | 3.5
Sequence C  | 4.5                | 4.2

Based on the fundamental objective of the pre-training process, which version of the model (at 100 epochs or 200 epochs) is performing better? Justify your choice by referencing the data and the overall goal of training.
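The comparison the question asks for boils down to averaging each checkpoint's losses and picking the lower one, since pre-training minimizes the loss. A minimal sketch of that arithmetic, using the values from the table above (the dictionary names are illustrative, not from the source):

```python
# Per-sequence losses from the case-study table.
losses_100 = {"A": 5.2, "B": 6.8, "C": 4.5}
losses_200 = {"A": 4.1, "B": 3.5, "C": 4.2}

def mean_loss(losses):
    """Average loss over the evaluated sequences."""
    return sum(losses.values()) / len(losses)

avg_100 = mean_loss(losses_100)  # 16.5 / 3 = 5.5
avg_200 = mean_loss(losses_200)  # 11.8 / 3 ≈ 3.93

# Pre-training's objective is to minimize loss, so lower average is better.
better = "200 epochs" if avg_200 < avg_100 else "100 epochs"
```

Note that the 200-epoch model is also better or equal on every individual sequence, so the conclusion does not depend on the choice of average.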

Updated 2025-10-02

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science