Multiple Choice

A research team is fine-tuning a language model for a text summarization task. The model uses a positional encoding scheme where a scalar hyperparameter, β, adjusts the strength of a distance-based bias in the attention mechanism. The team experiments with different values for β and records the model's performance on a validation set using the ROUGE score (higher is better). The results are as follows:

β ValueROUGE Score
0.010.35
0.10.42
1.00.38
10.00.29

Based on this data, what is the most reasonable conclusion?

0

1

Updated 2025-10-10

Contributors are:

Who are from:

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science