Justification for Pre-training Task Classification
Explain why classifying pre-training tasks for language models by their objective (e.g., predicting masked tokens) is considered a more robust and conceptually coherent approach than classifying them by model architecture (e.g., encoder-only). Provide a specific example to support your reasoning.
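To make the distinction concrete, here is a minimal sketch (assuming PyTorch; the toy model, vocabulary size, and mask id are illustrative only and not from the source material) of a masked-token prediction objective that is written independently of any particular architecture: the same loss function applies whether the model underneath is encoder-only, encoder-decoder, or anything else that maps token ids to per-position logits.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB_SIZE = 1000   # illustrative vocabulary size
MASK_ID = 0         # illustrative id of the [MASK] token

def masked_lm_loss(model: nn.Module, token_ids: torch.Tensor,
                   mask_prob: float = 0.15) -> torch.Tensor:
    """Masked language modeling loss: predict randomly masked tokens.

    `model` can be any architecture that maps token ids of shape
    (batch, seq_len) to logits of shape (batch, seq_len, vocab_size);
    the objective itself does not change.
    """
    labels = token_ids.clone()
    mask = torch.rand_like(token_ids, dtype=torch.float) < mask_prob
    labels[~mask] = -100                        # ignore unmasked positions in the loss
    corrupted = token_ids.masked_fill(mask, MASK_ID)
    logits = model(corrupted)                   # (batch, seq_len, vocab_size)
    return F.cross_entropy(logits.view(-1, VOCAB_SIZE),
                           labels.view(-1), ignore_index=-100)

# One hypothetical architecture; an encoder-decoder model with the same
# input/output signature could be dropped in without touching the loss above.
class TinyEncoderOnly(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, 64)
        layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(64, VOCAB_SIZE)

    def forward(self, token_ids):
        return self.head(self.encoder(self.embed(token_ids)))

# Example usage:
# loss = masked_lm_loss(TinyEncoderOnly(), torch.randint(1, VOCAB_SIZE, (8, 32)))
```

Because the loss only sees token ids and logits, swapping in a different architecture changes nothing about what the model is trained to do, which is why the objective, rather than the architecture, is the coherent axis for classifying pre-training tasks.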
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A research team trains two different models: one with an encoder-only structure and another with an encoder-decoder structure. Both models are trained using the exact same objective: predicting randomly masked words within a text. A colleague argues that because the model structures are different, they should be classified under separate pre-training categories. Why is this argument fundamentally flawed from a conceptual standpoint?
Classifying a Novel Pre-training Method
Justification for Pre-training Task Classification