Learn Before
Choosing a Model Training Strategy
A research team wants to build a language model that excels at proofreading by understanding the grammatical and contextual relationships between all words in a sentence. They are considering two training methods:
Method 1: The model is given the first part of a sentence (e.g., 'The cat sat on the') and must predict the next word (e.g., 'mat').
Method 2: The model is given a full sentence with some words removed (e.g., 'The [BLANK] sat on the [BLANK]') and must predict the missing words (e.g., 'cat', 'mat').
Evaluate which method is more suitable for the team's goal. Justify your choice by explaining how the information available to the model during training differs between the two methods and why that difference is important for the proofreading task.
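To make the difference concrete, here is a minimal sketch (not part of the original question; the sentence, mask positions, and variable names are illustrative) showing what a single training example looks like under each method. The key contrast is what context the model sees: Method 1 exposes only the words to the left of the target, while Method 2 exposes words on both sides of each blank.

```python
# Illustrative sketch: training examples under the two methods.
sentence = "The cat sat on the mat".split()

# Method 1: next-word prediction (causal objective).
# The context for each target contains ONLY the words to its LEFT.
causal_examples = [(sentence[:i], sentence[i]) for i in range(1, len(sentence))]
# Last example: (['The', 'cat', 'sat', 'on', 'the'], 'mat')

# Method 2: masked prediction.
# The context for each blank contains words on BOTH sides of it.
masked_positions = [1, 5]  # hypothetical choice: mask 'cat' and 'mat'
masked_input = ["[BLANK]" if i in masked_positions else w
                for i, w in enumerate(sentence)]
masked_targets = [sentence[i] for i in masked_positions]
# masked_input  -> ['The', '[BLANK]', 'sat', 'on', 'the', '[BLANK]']
# masked_targets -> ['cat', 'mat']
```

For proofreading, this distinction matters because judging whether a word is correct usually requires the words that follow it as well as those that precede it, which only Method 2 provides during training.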
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Example of an Ordered Target Sequence for Masked Prediction
Example of Masked Prediction: Kitten Chasing
Example of Masked Language Modeling: Kitten Playing
Example of a Simple Target Sequence
Example of Masked Prediction with a Known Verb
Example of Masked Prediction with Distinct Placeholders
Choosing a Model Training Strategy
Selecting a Training Objective for a Grammar-Focused Model