Learn Before
Concept

CRUX: Controlled Retrieval-Augmented Context Evaluation for Long-Form RAG (Ju et al., 2025)

CRUX (Controlled Retrieval-aUgmented conteXt evaluation) is an evaluation framework introduced by Ju et al. (2025) for diagnosing the retrieval context supplied to long-form RAG systems. CRUX uses human-written summaries to control the information scope of the knowledge available to the retriever, then applies question-based, coverage-aware metrics (with explicit upper bounds) to measure how completely and how redundantly the retrieved context covers the information needed for long-form generation. Unlike standard relevance-ranking metrics, CRUX makes it possible to attribute differences in downstream generation to the retrieval context under matched conditions, and to identify whether a retriever leaves coverage gaps or returns redundant material. The framework is positioned as a reliable testbed for developing retrieval methods tailored to long-form RAG. Subsequent work uses it as a protocol-sensitivity reference, supporting the point that RAG comparisons benefit from controlled context evaluation rather than from end-to-end scores alone.
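To make the idea of question-based, coverage-aware scoring concrete, the sketch below computes a coverage score, a redundancy score, and a coverage value normalized by an oracle upper bound. It is an illustrative simplification and not the metric definitions from Ju et al. (2025): the function names, the representation of each passage as the set of question IDs it answers, and the specific redundancy formula are all assumptions made for this example.

```python
from typing import List, Set


def coverage(answered: List[Set[str]], questions: Set[str]) -> float:
    """Fraction of required questions answered by at least one passage.

    `answered[i]` is the (hypothetical) set of question IDs that passage i
    answers, e.g. as judged by an answerability model.
    """
    if not questions:
        return 0.0
    covered = set().union(*answered) & questions
    return len(covered) / len(questions)


def redundancy(answered: List[Set[str]], questions: Set[str]) -> float:
    """Share of answer hits that repeat an already-covered question.

    Assumed formula for illustration: 1 - distinct_hits / total_hits.
    """
    total_hits = sum(len(p & questions) for p in answered)
    if total_hits == 0:
        return 0.0
    distinct_hits = len(set().union(*answered) & questions)
    return 1 - distinct_hits / total_hits


def normalized_coverage(
    retrieved: List[Set[str]], oracle: List[Set[str]], questions: Set[str]
) -> float:
    """Coverage of the retrieved context relative to an oracle upper bound."""
    upper = coverage(oracle, questions)
    return coverage(retrieved, questions) / upper if upper > 0 else 0.0


# Toy data (made up): four questions derived from a reference summary.
questions = {"q1", "q2", "q3", "q4"}
retrieved = [{"q1", "q2"}, {"q2"}, {"q1"}]   # repeats q1 and q2, misses q3 and q4
oracle = [{"q1", "q2"}, {"q3"}, {"q4"}]      # best achievable context

cov = coverage(retrieved, questions)
red = redundancy(retrieved, questions)
norm = normalized_coverage(retrieved, oracle, questions)
```

In this toy case the retrieved context answers only half of the questions while half of its answer hits are repeats, which is the kind of coverage gap and redundancy that a ranking metric over individually relevant passages would not surface.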


Updated 2026-05-18


Tags

Science

Auditable Strict-Parity Evaluation of Prerequisite-Graph Retrieval for RAG under Leakage Controls

Related