Learn Before
Dense Seed Pool with L2-Normalized all-MiniLM-L6-v2 Embeddings and Inner-Product Search
MOOC-CS: Language-Matched Controls (Results) in Auditable Strict-Parity Evaluation of Prerequisite-Graph Retrieval for RAG under Leakage Controls
Method Part 1: Multilingual Encoder + CJK Query Rewrite as a MOOC-CS Control (Auditable Strict-Parity Graph-RAG Paper)
Method Part 2: MOOC-CS Prerequisite Benchmark (Auditable Strict-Parity Graph-RAG Paper)
Graph Effect on MOOC-CS Is Conditional on Dense Seed Pool Quality
On MOOC-CS, the paper concludes that the graph effect is conditional on the quality of the dense seed pool. Under MiniLM + English-template queries, the gap between flat-dense and hierarchical-baseline R@10 is small. Under multilingual + CJK-only queries, the same fixed-depth hierarchical policy widens that gap substantially, and the adaptive policy lands close to it. Because the graph policy is unchanged across rows, the much larger graph gain in the language-matched configuration is attributable to a better dense seed pool, not to a different traversal policy.
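The reasoning above can be illustrated with a small sketch. It is not the paper's implementation or data: the toy prerequisite graph, the concept names, the two dense rankings, and the helpers `recall_at_k`, `expand_fixed_depth`, and `graph_gain` are all hypothetical. It holds one fixed-depth expansion policy constant and changes only the dense ranking that supplies the seeds, so any difference in graph gain comes from the seed pool alone.

```python
from collections import deque


def recall_at_k(ranked, relevant, k):
    """Fraction of relevant items that appear in the top-k of a ranking."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def expand_fixed_depth(seeds, graph, depth):
    """Seeds first, then nodes reachable within `depth` hops (BFS order)."""
    ranked = list(dict.fromkeys(seeds))  # de-duplicate, keep order
    seen = set(ranked)
    frontier = deque((seed, 0) for seed in ranked)
    while frontier:
        node, hops = frontier.popleft()
        if hops == depth:
            continue
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                ranked.append(neighbor)
                frontier.append((neighbor, hops + 1))
    return ranked


def graph_gain(dense_ranking, graph, relevant, m, k, depth=1):
    """Return (flat recall, hierarchical recall, gap) for one query."""
    flat = recall_at_k(dense_ranking, relevant, k)
    expanded = expand_fixed_depth(dense_ranking[:m], graph, depth)
    hier = recall_at_k(expanded, relevant, k)
    return flat, hier, hier - flat


# Hypothetical toy prerequisite graph: concept -> linked concepts.
GRAPH = {
    "recursion": ["functions", "stack"],
    "html": ["markup"],
    "css": ["selectors"],
}
RELEVANT = {"recursion", "functions", "stack"}

# Assumed dense rankings for the same query under two seeding setups.
matched_ranking = ["recursion", "loops", "arrays", "strings"]
mismatched_ranking = ["html", "css", "fonts", "colors"]

matched = graph_gain(matched_ranking, GRAPH, RELEVANT, m=2, k=4)
mismatched = graph_gain(mismatched_ranking, GRAPH, RELEVANT, m=2, k=4)
print("matched seeds   :", matched)
print("mismatched seeds:", mismatched)
```

In this toy setup the expansion policy is identical in both calls. Only the seeds differ, which mirrors the claim that the graph gain depends on what the dense stage puts into the seed pool.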
Tags
Science
Auditable Strict-Parity Evaluation of Prerequisite-Graph Retrieval for RAG under Leakage Controls
Related
Fixed Top-m Dense Seed Pool as a Strict-Parity Control
Flat Dense Retrieval Baseline in Strict-Parity Prerequisite Retrieval
Template Stripping on MOOC-CS Raises Hierarchical R@10 from 23.1 to 26.5 (MiniLM Encoder)
Multilingual Encoder Alone Does Not Improve MOOC-CS Recall (Hierarchical R@10 = 22.3 vs 23.1)
Multilingual Encoder + CJK-Only Queries Jumps MOOC-CS Hierarchical R@10 to 68.1
MiniLM Encoder + CJK-Only Queries on MOOC-CS: Hierarchical R@10 Rises from 23.1 to 26.5 with Flat Dense at 21.7
MOOC-CS Error Taxonomy: Residual Failures Dominated by Distant Misses and Bilingual Aliasing
Language-Matched Seeding as a Prerequisite for Graph-Expansion Gains