Multiple Choice

A language model's ability to generalize to new tasks is evaluated on a set of 5 new instruction-input pairs. Its performance on each pair is scored on a scale from 0 to 1, yielding the scores [0.9, 0.8, 0.3, 0.2, 0.7]. The formal condition for inter-task generalization requires the average performance over the new set to exceed a threshold ε. Does the model demonstrate this capability when the threshold is set at ε = 0.6?

0

1
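
The condition in the question reduces to a mean and a strict comparison, which can be sketched in a few lines of Python. The function name `demonstrates_generalization` and its parameter names are hypothetical, chosen only for illustration. The sketch assumes that "exceeding" means strictly greater than ε.

```python
from statistics import fmean

def demonstrates_generalization(scores, epsilon):
    """Return (mean, passed) for the average-performance condition.

    Assumption: the condition is mean(scores) > epsilon (strict inequality).
    """
    mean_score = fmean(scores)  # average performance over the new set
    return mean_score, mean_score > epsilon

# Scores and threshold taken from the question above.
scores = [0.9, 0.8, 0.3, 0.2, 0.7]
mean_score, passed = demonstrates_generalization(scores, epsilon=0.6)
print(f"mean = {mean_score:.2f}, condition met: {passed}")
```

The scores sum to 2.9, so the mean is about 0.58, which falls just below ε = 0.6. Under this reading the condition is not met, even though three of the five individual scores are above the threshold.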

Updated 2025-10-04

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science