Definition

Leakage Audit: Train-Test Overlap Check in Prerequisite-QA Retrieval

A leakage audit is a methodological check that verifies whether the train and test partitions of a prerequisite-QA benchmark share material that would inflate retrieval scores. Specifically, it inspects two kinds of overlap: (1) question-string overlap, where the same or near-duplicate question text appears on both sides of the split, and (2) target-concept overlap, where held-out target concept IDs also appear as supervision targets in training. The audit produces an on-disk report so that headline retrieval numbers can be re-interpreted under explicit overlap controls.

0

1

Updated 2026-05-17

Contributors are:

Who are from:

Tags

Science

Auditable Strict-Parity Evaluation of Prerequisite-Graph Retrieval for RAG under Leakage Controls