Learn Before
Designing an Experiment to Select a Pre-training Objective
Imagine you are leading a project to pre-train a new encoder-decoder model for the specific task of translating complex legal documents from English to German. You are considering three different input corruption strategies for the pre-training phase: random token masking, token deletion, and text infilling. Describe the experimental methodology you would design to empirically determine which of these three strategies is most effective for your specific downstream task. Your description should include the key steps of the experiment and the evaluation metrics you would use to compare the outcomes.
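For reference, the three corruption strategies named in the prompt can be sketched as below. This is only a minimal illustration, not a prescribed implementation: it assumes whitespace tokenization and a placeholder `[MASK]` symbol, and the function names, corruption rate, and `max_span` parameter are hypothetical choices made for the example. In an actual experiment each strategy would be applied to the same pre-training corpus, with model size, data, and compute budget held fixed so that the corruption strategy is the only variable.

```python
import random

MASK = "[MASK]"  # placeholder mask symbol (an assumption for this sketch)


def token_masking(tokens, rate, rng):
    """Replace each selected token with its own MASK; sequence length is preserved."""
    return [MASK if rng.random() < rate else t for t in tokens]


def token_deletion(tokens, rate, rng):
    """Drop selected tokens entirely; nothing marks where tokens were removed."""
    return [t for t in tokens if rng.random() >= rate]


def text_infilling(tokens, max_span, rng):
    """Replace one contiguous span of 1..max_span tokens with a single MASK.

    The output does not reveal how many tokens the span contained.
    """
    span_len = rng.randint(1, min(max_span, len(tokens)))
    start = rng.randint(0, len(tokens) - span_len)
    return tokens[:start] + [MASK] + tokens[start + span_len:]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the three variants are reproducible
    sentence = "the lessee shall pay rent on the first day of each month".split()
    print("masking: ", " ".join(token_masking(sentence, 0.3, rng)))
    print("deletion:", " ".join(token_deletion(sentence, 0.3, rng)))
    print("infilling:", " ".join(text_infilling(sentence, 3, rng)))
```

In all three cases the decoder's training target is the original, uncorrupted sequence; only the encoder input differs between experimental arms.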
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Creation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Selecting a Pre-training Strategy for a Summarization Model
A research team is pre-training an encoder-decoder model specifically for the task of correcting complex grammatical errors and improving sentence structure in user-generated text. The team wants to select a pre-training objective that will best prepare the model for this downstream task. Which of the following input corruption strategies is most likely to be effective, and why?