1Cademy - CRUX: Controlled Retrieval-Augmented Context Evaluation for Long-Form RAG (Ju et al., 2025)

CRUX (Controlled Retrieval-aUgmented conteXt evaluation) is an evaluation framework introduced by Ju et al. (2025) for diagnosing the retrieval context supplied to long-form RAG systems. It uses human-written summaries to control the information scope of the knowledge available to the retriever, then applies question-based, coverage-aware metrics (with explicit upper bounds) to measure how completely, and how redundantly, the retrieved context covers the information needed for long-form generation. Unlike standard relevance-ranking metrics, CRUX lets differences in downstream generation be attributed to the retrieval context under matched conditions, and it shows whether a retriever leaves coverage gaps or returns redundant material. The framework is positioned as a reliable testbed for developing retrieval methods tailored to long-form RAG. Subsequent work uses it as a protocol-sensitivity reference, on the argument that RAG comparisons benefit from controlled context evaluation of this kind rather than from end-to-end scores alone.
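
To make the idea of question-based coverage and redundancy concrete, here is a minimal sketch in Python. It is an illustrative simplification, not the metric definitions from Ju et al. (2025). It assumes each retrieved passage has already been judged against a fixed set of questions derived from the reference summary. The function names `coverage` and `redundancy`, and the representation of judgments as sets of question indices, are hypothetical.

```python
from typing import List, Set


def coverage(answered: List[Set[int]], num_questions: int) -> float:
    """Fraction of the questions answered by at least one retrieved passage.

    answered[i] holds the indices of the questions that passage i answers
    (an assumed, pre-computed judgment; how it is obtained is out of scope).
    """
    if num_questions == 0:
        return 0.0
    covered = set().union(*answered)
    return len(covered) / num_questions


def redundancy(answered: List[Set[int]]) -> float:
    """Share of answer hits that repeat a question already covered elsewhere.

    A "hit" is one (passage, question) pair where the passage answers the
    question. Hits beyond the first for each question count as redundant.
    """
    total_hits = sum(len(hits) for hits in answered)
    if total_hits == 0:
        return 0.0
    covered = set().union(*answered)
    return (total_hits - len(covered)) / total_hits


# Hypothetical judgments: three retrieved passages, five questions (0..4).
judgments = [{0, 1}, {1, 2}, {1}]
print(coverage(judgments, num_questions=5))  # 3 of 5 questions covered
print(redundancy(judgments))                 # 2 of 5 hits are repeats
```

In this toy example the context answers questions 0, 1, and 2 but leaves 3 and 4 uncovered (a coverage gap), while question 1 is answered three times (redundant material). These are the two failure modes the paragraph above distinguishes.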

Updated 2026-05-18

References

Reference: arXiv:2506.20051 (arxiv.org/abs/2506.20051)

Learn After

Evaluation-First Framing of Graph-Aware RAG Comparisons (Adopted from Zeng 2025, Han 2025, Ju 2025)
