Question-Disjoint and Target-Concept-Disjoint Splits in Prerequisite-QA Leakage Control
Prerequisite-QA evaluation uses two leakage-controlled splits, defined here in plain terms:
- A question-disjoint split removes any test instance whose question string (or near-duplicate) appears in training. This blocks gains that come from memorizing question text rather than reasoning over the prerequisite graph.
- A target-concept-disjoint split is strictly stronger: in addition to removing question-string overlap, it removes test items whose gold target concept IDs appear as training supervision. This blocks gains that come from having seen the held-out target concept during training.
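The two definitions above can be sketched as filters over a test set. This is a minimal illustration, not the pipeline used in the evaluation. The `Item` record, its field names, and the `normalize` function are assumptions introduced here. In particular, `normalize` (lowercasing, punctuation removal, whitespace collapsing) is only a stand-in for near-duplicate detection, which the definitions do not specify, and treating a test item as leaked when any one of its gold target concept IDs was seen in training is one possible reading of the second definition.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical QA record; the field names are assumptions."""
    question: str
    target_concepts: frozenset  # gold target concept IDs


def normalize(question: str) -> str:
    """Crude near-duplicate key: lowercase, drop punctuation, collapse spaces.

    Assumption: stands in for whatever near-duplicate detection is really used.
    """
    text = re.sub(r"[^\w\s]", "", question.lower())
    return " ".join(text.split())


def question_disjoint(train, test):
    """Keep test items whose normalized question never occurs in training."""
    seen = {normalize(item.question) for item in train}
    return [item for item in test if normalize(item.question) not in seen]


def target_concept_disjoint(train, test):
    """Stricter split: start from the question-disjoint test set, then drop
    items sharing any gold target concept ID with training supervision."""
    seen_concepts = set()
    for item in train:
        seen_concepts |= item.target_concepts
    return [
        item
        for item in question_disjoint(train, test)
        if not (item.target_concepts & seen_concepts)
    ]


# Toy data (invented for illustration only).
train = [Item("What must I learn before calculus?", frozenset({"c_calculus"}))]
test = [
    Item("what must I learn before Calculus", frozenset({"c_calculus"})),
    Item("Which topics come before calculus?", frozenset({"c_calculus"})),
    Item("Which topics come before topology?", frozenset({"c_topology"})),
]

print(len(question_disjoint(train, test)))        # 2
print(len(target_concept_disjoint(train, test)))  # 1
```

In the toy data, the first test item is dropped by both splits because its question is a near-duplicate of a training question. The second survives the question-disjoint split but is dropped by the target-concept-disjoint split, since its target concept was supervised in training.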
The two splits operationalize the two overlap kinds tracked by the leakage audit: question-string overlap and target-concept overlap. They are used together in headline retrieval comparisons, so that any gains that survive both splits can be attributed to the retrieval policy rather than to overlap with training material.
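One way to check that a given train/test pair is free of both overlap kinds is to count them directly. The sketch below is illustrative only. It assumes records are `(question, concept_ids)` pairs and uses exact matching after lowercasing and whitespace collapsing as a simplified proxy for question-string overlap. The function name `overlap_audit` and the report keys are hypothetical.

```python
def _key(question):
    """Simplified question key: lowercase with collapsed whitespace (assumption)."""
    return " ".join(question.lower().split())


def overlap_audit(train, test):
    """Count test records exhibiting each overlap kind against training.

    Records are (question, concept_ids) pairs; this shape is an assumption.
    """
    train_questions = {_key(q) for q, _ in train}
    train_concepts = set().union(*(ids for _, ids in train))
    question_hits = sum(_key(q) in train_questions for q, _ in test)
    concept_hits = sum(bool(set(ids) & train_concepts) for _, ids in test)
    return {
        "n_test": len(test),
        "question_string_overlap": question_hits,
        "target_concept_overlap": concept_hits,
    }


# Toy data (invented for illustration only).
audit_train = [("What precedes calculus?", {"calculus"})]
audit_test = [
    ("what precedes  calculus?", {"calculus"}),
    ("What precedes topology?", {"topology"}),
    ("What should I know for calculus?", {"calculus"}),
]

report = overlap_audit(audit_train, audit_test)
print(report)
```

On a test set that has passed through a target-concept-disjoint split, both overlap counts should be zero. A nonzero count indicates that some apparent gain could still come from overlap with training material.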
Tags
Science
Auditable Strict-Parity Evaluation of Prerequisite-Graph Retrieval for RAG under Leakage Controls
Related
Complementary Contribution Scope Over the Graph-RAG Evaluation-First Lineage
SP+ Extension: Strict Parity Plus a Learned Reranker, Reported Separately
Strict Parity (SP): Matched-Interface Evaluation Contract for Graph-Aware RAG Comparisons
Token-Cap Diagnostic: Shared Token-Aware Serialization Comparison