1Cademy

Learn Before

Informal Definition of Intra-Task Generalization

Multiple Choice

An instruction-tuned model is evaluated on a specific task: summarizing legal documents. The goal is to achieve intra-task generalization, which is formally defined as the average performance on a set of new inputs (Z) exceeding a predefined threshold (ε).

The evaluation uses a set of 100 new legal documents the model has never seen before. The performance threshold (ε) is set to 0.85.

Model A correctly summarizes 92 of the 100 new documents.
Model B correctly summarizes 81 of the
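
The threshold criterion described above can be sketched as a short check. This is a minimal illustration, not a definitive implementation: the function name `meets_threshold` is hypothetical, and it assumes each document is scored 1.0 for a correct summary and 0.0 otherwise, so that "average performance" is simply accuracy. "Exceeding" is read as a strict inequality. Only Model A's figures are used, since they are stated in full.

```python
from typing import Sequence


def meets_threshold(scores: Sequence[float], epsilon: float) -> bool:
    """Return True when the mean score over the new inputs Z strictly exceeds epsilon.

    Hypothetical helper: assumes one score per evaluated input.
    """
    if not scores:
        raise ValueError("need at least one evaluated input")
    return sum(scores) / len(scores) > epsilon


# Model A: 92 correct summaries out of 100 new documents (figures from the question).
# Assumption: binary scoring, 1.0 = correct, 0.0 = incorrect.
model_a_scores = [1.0] * 92 + [0.0] * 8
model_a_mean = sum(model_a_scores) / len(model_a_scores)
model_a_generalizes = meets_threshold(model_a_scores, epsilon=0.85)

print(f"Model A mean = {model_a_mean:.2f}, exceeds 0.85: {model_a_generalizes}")
```

Under these assumptions, Model A's average of 0.92 is above ε = 0.85. A model scoring exactly 0.85 would not pass under the strict reading.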

Updated 2025-10-02
