Learn Before
Pre-training Objective Formula
The objective of pre-training a model on a set of sequences, denoted as $\mathcal{D}$, is formally expressed as finding the optimal parameters $\hat{\theta}$ that minimize the total loss across the dataset. This is represented by the equation: $\hat{\theta} = \arg\min_{\theta} \sum_{x \in \mathcal{D}} L_{\theta}(x)$, where $L_{\theta}(x)$ is the loss evaluated on a single sequence $x$.
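The summation in the objective can be made concrete with a minimal sketch. The unigram model, the token probabilities, and the helper names below are all illustrative assumptions, not part of the original card; the point is only that the dataset-level loss is the sum of per-sequence losses $L_{\theta}(x)$.

```python
import math

# Hypothetical toy "model": a unigram distribution over tokens,
# standing in for the parameters theta.
def sequence_loss(probs, sequence):
    # L_theta(x): negative log-likelihood of one sequence x
    return -sum(math.log(probs[token]) for token in sequence)

def dataset_loss(probs, dataset):
    # Pre-training objective: the total loss is the sum of
    # per-sequence losses over all sequences x in the dataset D.
    return sum(sequence_loss(probs, x) for x in dataset)

probs = {"a": 0.5, "b": 0.3, "c": 0.2}   # assumed toy parameters
dataset = [["a", "b"], ["a", "c", "a"]]  # assumed toy dataset
print(dataset_loss(probs, dataset))
```

Finding $\hat{\theta}$ then amounts to searching for the parameter values (here, the probability table) that make this total as small as possible.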
Tags
Foundations of Large Language Models
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Probability Computation with Pre-trained Language Models
A language model is being trained on a large dataset of text. After an initial training iteration, the model's performance is measured on three distinct sequences from the dataset, yielding the following loss values:
- Sequence 1: Loss = 8.4
- Sequence 2: Loss = 2.1
- Sequence 3: Loss = 5.5
Based on the fundamental objective of this training process, which of the following statements most accurately describes the model's overall goal?
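Under the objective above, the quantity the training process drives down is the sum of the per-sequence losses, not any one of them in isolation. A one-line check with the example's (illustrative) loss values:

```python
# Per-sequence losses from the example: the training objective
# minimizes their sum over the whole dataset, not each one separately.
losses = [8.4, 2.1, 5.5]
total_loss = sum(losses)
print(total_loss)  # 16.0
```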
Evaluating Model Training Progress
From Single Sequence to Full Dataset
The primary objective of pre-training a language model on a dataset is to find a single, optimal set of model parameters that minimizes the total loss summed over all text sequences in that dataset, rather than a separate parameter set for each individual sequence.