Example

Token-Cap Comparison on LectureBank-Full: Adaptive Loses More as Cap Tightens

Under a controlled token-cap protocol where both Adaptive + token-aware and Hierarchical + token-aware are reranked with the same token-aware serialization policy, the paired-bootstrap delta ΔR@10=\Delta\text{R@}10 = Adaptive − Hierarchical on LectureBank-Full (n=143n = 143) is negative and grows more negative as the cap is relaxed: 0.9-0.9 [2.4,+0.3][-2.4, +0.3] at a 256-token cap (p=0.191p = 0.191), 1.7-1.7 [3.6,0.1][-3.6, -0.1] at 512 tokens (p=0.045p = 0.045), and 3.4-3.4 [6.2,0.5][-6.2, -0.5] at 768 tokens (p=0.025p = 0.025). At the strictest 256-token cap the two systems are not individually distinguishable under the descriptive paired bootstrap; the individually significant negative deltas appear at the looser 512 and 768 caps. The reported pp-values are descriptive paired-bootstrap tail probabilities and are not adjusted across token caps, so the trend is directional rather than a multiplicity-corrected significance result.

0

1

Updated 2026-05-17

Contributors are:

Who are from:

Tags

Science

Auditable Strict-Parity Evaluation of Prerequisite-Graph Retrieval for RAG under Leakage Controls

Related