Learn Before
Rationale for Categorizing Pre-training Tasks by Objective
Categorizing pre-training tasks purely by model architecture is problematic because a single training objective can be applied across different architectures. Masked language modeling, for example, works with both encoder-only and encoder-decoder models, so an architecture-based scheme would split what is essentially the same task into separate categories. A better approach is to classify pre-training tasks by their training objectives.
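As a rough illustration of this point, the sketch below applies one masking objective and then packages the result for two different architectures. It is a minimal sketch, not a reference implementation: the function names, the `[MASK]` string, the whitespace tokenization, and the dictionary layouts are all hypothetical choices made for the example, and the 15% default masking rate is simply a commonly used value.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder symbol for a hidden token


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Apply the masking objective: hide tokens at random and record the originals.

    Returns (corrupted, targets), where targets maps position -> original token.
    """
    rng = random.Random(seed)
    corrupted, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            corrupted.append(MASK)
            targets[i] = tok
        else:
            corrupted.append(tok)
    return corrupted, targets


def encoder_only_example(tokens, **kwargs):
    # Encoder-only packaging (assumed layout): predict the original token
    # at each masked position of the input.
    corrupted, targets = mask_tokens(tokens, **kwargs)
    return {"input": corrupted, "labels": targets}


def encoder_decoder_example(tokens, **kwargs):
    # Encoder-decoder packaging (assumed layout): the encoder reads the
    # corrupted text and the decoder generates the hidden tokens in order.
    corrupted, targets = mask_tokens(tokens, **kwargs)
    return {
        "encoder_input": corrupted,
        "decoder_target": [targets[i] for i in sorted(targets)],
    }
```

Calling both helpers on the same token list with the same `seed` yields the same corrupted text and the same hidden tokens; only the packaging differs. That is the sense in which the objective, rather than the architecture, identifies the task.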
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Contrastive Learning (CTL)
Extensions of PTMs
Applying and Adapting Pre-trained Models to Downstream Tasks
Unsupervised Pre-training
Supervised Pre-training
Self-Supervised Learning
Comparison of Pre-training Paradigms
Rationale for Categorizing Pre-training Tasks by Objective
Denoising Autoencoding
Comparability of Pre-training Tasks
Generality of Pre-training Tasks and Performance
Applying Pre-trained Models to Downstream Tasks
Identifying a Pre-training Strategy
Breadth of Pre-training Tasks
A research team is developing a new language model and is considering different pre-training approaches. Match each pre-training scenario below with the correct category of learning it represents.
A language model is being trained on a large corpus of text from the internet. The training process involves randomly hiding 15% of the words in each sentence and then tasking the model with predicting the original identity of these hidden words based on the surrounding context. Which category of pre-training task does this scenario best exemplify, and why?
Comparing Pre-training Task Categories
Comparison of Pre-training Tasks
Learn After
A research team trains two different models: one with an encoder-only structure and another with an encoder-decoder structure. Both models are trained using the exact same objective: predicting randomly masked words within a text. A colleague argues that because the model structures are different, they should be classified under separate pre-training categories. Why is this argument fundamentally flawed from a conceptual standpoint?
Classifying a Novel Pre-training Method
Justification for Pre-training Task Classification