Example

HotpotQA FullWiki-1k Boundary Probe: Flat 93.4, Hierarchical 92.9, Adaptive 94.0 R@10

An auxiliary HotpotQA boundary probe in the paper's supplement evaluates whether adaptive depth gating offers a distinctive advantage on open-domain multi-hop QA. On the aligned FullWiki-1k slice, Recall@1010 is 93.493.4 for Flat dense, 92.992.9 for the Hierarchical baseline, and 94.094.0 for Adaptive (heuristic) depth gating. The three systems are statistically close, so the probe finds no distinctive adaptive-depth advantage beyond an already-strong flat dense retriever. The Limitations section reads this as evidence that the curated prerequisite ranking does not transfer cleanly to a denser non-prerequisite graph, so the probe is treated as a boundary condition rather than as a headline result.

0

1

Updated 2026-05-17

Contributors are:

Who are from:

Tags

Science

Auditable Strict-Parity Evaluation of Prerequisite-Graph Retrieval for RAG under Leakage Controls