Multiple Choice

A startup is building a system to automatically categorize legal contracts into specific sub-types (e.g., 'lease agreement', 'employment contract', 'non-disclosure agreement'). They have a very small, private dataset of 500 labeled contracts. Their proposed strategy is to first train a large neural network on a massive, publicly available dataset of millions of labeled news articles, classifying them by topic (e.g., 'sports', 'politics', 'technology'). After this initial training, they plan to adapt the model to their legal contract categorization task. What is the most significant weakness of this proposed pre-training approach for their specific goal?

Updated 2025-10-01

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science