Case Study

Evaluating Pre-training Task Relevance

An NLP team is developing a model for a specific task: determining whether a given concluding paragraph is a logical follow-up to a preceding introductory paragraph. They are considering two pre-trained foundation models:

  • Model Alpha: Pre-trained by predicting randomly hidden words within sentences.
  • Model Beta: Pre-trained using the same method as Model Alpha, but with an additional objective: predicting whether two input sentences appeared consecutively in the original training text.

Which model, Alpha or Beta, is better suited for the team's specific task? Justify your choice by explaining how the pre-training objectives of the selected model align with the requirements of the downstream task.
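To make the pairing objective concrete, here is a minimal sketch (hypothetical, not from the course material) of how a BERT-style "consecutive sentences" objective frames its input: the two segments are packed into one sequence with special tokens, and the model predicts a binary label. The function name `build_nsp_example` and the toy sentences are illustrative assumptions.

```python
def build_nsp_example(segment_a: str, segment_b: str, is_next: bool):
    """Pack two text segments the way a BERT-style model sees them.

    [CLS] marks the position whose final hidden state feeds the
    sentence-pair classifier; [SEP] separates the two segments.
    """
    tokens = (["[CLS]"] + segment_a.split()
              + ["[SEP]"] + segment_b.split() + ["[SEP]"])
    # Segment ids tell the model which tokens belong to which paragraph.
    first_sep = tokens.index("[SEP]")
    segment_ids = [0] * (first_sep + 1) + [1] * (len(tokens) - first_sep - 1)
    # 1 = the segments were consecutive in the corpus, 0 = they were not.
    label = 1 if is_next else 0
    return tokens, segment_ids, label


tokens, segment_ids, label = build_nsp_example(
    "The study introduces a new method .",
    "In conclusion , the method works well .",
    is_next=True,
)
print(label)  # 1
```

Note how this input layout mirrors the downstream task almost exactly: two paragraphs packed as segment A and segment B, with a binary "does B follow A?" decision read off the [CLS] position.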


Updated 2025-10-06

Tags

  • Data Science
  • Foundations of Large Language Models Course
  • Computing Sciences
  • Ch.1 Pre-training - Foundations of Large Language Models
  • Foundations of Large Language Models
  • Ch.2 Generative Models - Foundations of Large Language Models
  • Analysis in Bloom's Taxonomy
  • Cognitive Psychology
  • Psychology
  • Social Science
  • Empirical Science
  • Science
