Essay

Pre-training Objective Selection

A research team is developing a language model for a complex task that requires both generating coherent, long-form text and accurately filling in missing information within existing drafts. They are considering three pre-training objectives:

  1. An objective that predicts the next word in a sequence based only on the words that came before it.
  2. An objective that predicts randomly masked words in a sentence by looking at all the other visible words, both before and after the mask.
  3. An objective that shuffles the order of words in a sentence and then predicts them one by one in that new shuffled order.

Evaluate which of these three objectives is most suitable for the team's dual requirements. Justify your choice by explaining its advantages and the primary limitations of the other two for this specific scenario.
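To make the comparison concrete, the sketch below shows which context each of the three objectives exposes when predicting a token. It is a minimal illustration rather than a training implementation. The function names, the `[MASK]` placeholder, and the choice to give the third objective's context as a map from original position to token are all assumptions made for this example.

```python
def causal_pairs(tokens):
    """Objective 1: predict each token from only the tokens before it."""
    return [(tokens[:i], tokens[i]) for i in range(len(tokens))]


def masked_pairs(tokens, mask_positions, mask_token="[MASK]"):
    """Objective 2: predict masked tokens from all visible tokens, left and right."""
    masked = set(mask_positions)
    # Replace the chosen positions with a placeholder; everything else stays visible.
    corrupted = [mask_token if i in masked else t for i, t in enumerate(tokens)]
    return [(corrupted, i, tokens[i]) for i in sorted(masked)]


def permuted_pairs(tokens, order):
    """Objective 3: predict tokens one by one in a shuffled order.

    Assumption: the context records the original position of each token
    already predicted, so it may include words from either side of the target.
    """
    pairs = []
    for step, pos in enumerate(order):
        context = {p: tokens[p] for p in order[:step]}
        pairs.append((context, pos, tokens[pos]))
    return pairs


if __name__ == "__main__":
    sentence = ["the", "cat", "sat", "down"]
    for name, pairs in [
        ("left-to-right", causal_pairs(sentence)),
        ("masked", masked_pairs(sentence, [2])),
        ("permuted", permuted_pairs(sentence, [2, 0, 3, 1])),
    ]:
        print(name)
        for pair in pairs:
            print("  ", pair)
```

In this toy setup, only the first objective produces contexts that match left-to-right generation exactly, only the second sees both sides of a gap in a single prediction, and the third sees a varying mix of left and right context depending on the sampled order.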

Updated 2025-10-06

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science