Template Leakage Counts in Canonical Prerequisite QA Splits
The canonical prerequisite QA splits used by this paper are heavily templated, and train and test share substantial surface forms. On LectureBank-Full, the canonical split shares exact questions and target concepts between train and test. The same holds on MOOC-CS. These counts motivate reporting question-disjoint and target-concept-disjoint controls rather than relying on the canonical splits, because a system could otherwise succeed by matching memorized surface forms.
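The overlap counts described above could be computed along the following lines. This is a minimal sketch, not the paper's audit script: the record fields `question` and `target`, the function name `leakage_counts`, and the toy examples are all assumptions made for illustration, and "exact" is taken to mean plain string equality with no normalization.

```python
from typing import Dict, Iterable, Mapping


def leakage_counts(
    train: Iterable[Mapping[str, str]],
    test: Iterable[Mapping[str, str]],
) -> Dict[str, int]:
    """Count distinct questions and target concepts present in both splits.

    Assumes each example is a mapping with hypothetical keys
    "question" and "target"; matching is exact string equality.
    """
    train = list(train)
    test = list(test)

    train_questions = {ex["question"] for ex in train}
    test_questions = {ex["question"] for ex in test}
    train_targets = {ex["target"] for ex in train}
    test_targets = {ex["target"] for ex in test}

    return {
        "shared_questions": len(train_questions & test_questions),
        "shared_targets": len(train_targets & test_targets),
    }


# Toy data (hypothetical, not drawn from LectureBank-Full or MOOC-CS).
toy_train = [
    {"question": "What are the prerequisites of backpropagation?",
     "target": "backpropagation"},
    {"question": "What should I learn before gradient descent?",
     "target": "gradient descent"},
]
toy_test = [
    # Same question and same target as a training example.
    {"question": "What are the prerequisites of backpropagation?",
     "target": "backpropagation"},
    # New question wording, but a target already seen in training.
    {"question": "Which concepts come before gradient descent?",
     "target": "gradient descent"},
    # New question and new target.
    {"question": "What are the prerequisites of attention?",
     "target": "attention"},
]

counts = leakage_counts(toy_train, toy_test)
print(counts)
```

In the toy data, the second test example illustrates why a question-disjoint split can still leave target overlap: its wording is new, but its target concept already appears in training.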
Tags
Science
Auditable Strict-Parity Evaluation of Prerequisite-Graph Retrieval for RAG under Leakage Controls
Related
Target-Disjoint Reported in Main Paper, Question-Disjoint in Supplement
Residual Target Overlap on Question-Disjoint Prerequisite Splits (4 LB-Full, 16 MOOC-CS)
Target-Concept-Disjoint Control Preserves Graph, Encoder, and Retrieval Hyperparameters
Two Question Families Bound the External Validity of Curated Prerequisite QA