LightRAG Graph-Augmented Retrieval Framework (Guo et al., 2024)
LightRAG is a graph-based retrieval-augmented generation (RAG) framework introduced by Guo, Xia, Yu, Ao, and Huang (University of Hong Kong, 2024). At indexing time it builds a knowledge graph of entities and relations extracted from the source corpus and pairs each node and edge with a dense vector representation, so retrieval can combine graph structure with embedding similarity. Query-time retrieval is dual-level: a low-level path retrieves specific entities and their relationships from the graph, and a high-level path retrieves broader topics or themes that aggregate over multiple entities. The retrieved entity-and-relation context is then passed to an LLM generator. LightRAG also defines an incremental update procedure that folds new documents into the entity-relation graph without re-indexing from scratch. The framework is released as open-source software and is positioned as a simpler, faster alternative to community-summary-based graph-RAG pipelines, which makes it the canonical reference for the 'LightRAG-style graph retriever' family.
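The dual-level retrieval and incremental update described above can be illustrated with a small toy sketch. This is not the LightRAG library's API: `ToyGraphIndex`, `embed`, `cosine`, and the sample bee/hive/pesticide data are hypothetical names invented for illustration, a bag-of-words cosine stands in for the dense vector representations, and scoring the high-level path against a free-text theme stored on each edge is an assumption made to keep the example short.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a dense encoder (assumption)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


@dataclass
class ToyGraphIndex:
    """Hypothetical entity-relation index, for illustration only."""

    entities: dict = field(default_factory=dict)   # name -> description
    relations: dict = field(default_factory=dict)  # (src, dst) -> theme text

    def insert(self, entities: dict, relations: dict) -> None:
        """Incremental update: merge new nodes and edges into the graph."""
        for name, desc in entities.items():
            old = self.entities.get(name)
            self.entities[name] = f"{old} {desc}" if old else desc
        for edge, theme in relations.items():
            old = self.relations.get(edge)
            self.relations[edge] = f"{old} {theme}" if old else theme

    def low_level(self, query: str, k: int = 1):
        """Rank entities by similarity, then attach their incident edges."""
        q = embed(query)
        ranked = sorted(
            self.entities,
            key=lambda n: cosine(q, embed(f"{n} {self.entities[n]}")),
            reverse=True,
        )[:k]
        edges = [e for e in self.relations if e[0] in ranked or e[1] in ranked]
        return ranked, edges

    def high_level(self, query: str, k: int = 1):
        """Rank edges by theme text, then collect the entities they connect."""
        q = embed(query)
        ranked = sorted(
            self.relations,
            key=lambda e: cosine(q, embed(self.relations[e])),
            reverse=True,
        )[:k]
        nodes = sorted({n for e in ranked for n in e})
        return ranked, nodes


# Build the index from a first batch, then fold in a second batch
# without rebuilding what is already there.
index = ToyGraphIndex()
index.insert(
    {"bee": "insect that pollinates flowers",
     "hive": "structure where a colony lives"},
    {("bee", "hive"): "colony organization and housing"},
)
index.insert(
    {"pesticide": "chemical used to kill crop pests"},
    {("pesticide", "bee"): "agriculture environmental impact on pollinators"},
)

low_entities, low_edges = index.low_level("insect that pollinates")
high_edges, high_nodes = index.high_level("environmental impact of agriculture")
```

In this sketch the low-level query lands on a single entity and pulls in its neighbouring edges, while the high-level query lands on a theme and pulls in every entity that theme connects. A real pipeline would pass the combined context to an LLM generator, which is omitted here.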
Tags
Science
Related
Disciplinary Research
Legal Research
Historical Research
Scientific Research
Research References
Research Methods
Research Philosophy
Research Center
Experiment
Empirical Research Report
Reference: What Should I Learn First: Introducing LectureBank for NLP Education and Prerequisite Chain Learning
Reference: R-VGAE: Relational-variational Graph Autoencoder for Unsupervised Prerequisite Chain Learning
Reference: ojs.aaai.org
Reference: Prerequisite Relation Learning for Concepts in MOOCs
Reference: Course Prerequisite Relation (MOOC prerequisite dataset release page)
Reference: MOOCCube: A Large-scale Data Repository for NLP Applications in MOOCs
Reference: QASC: A Dataset for Question Answering via Sentence Composition
Reference: QASC: A Dataset for Question Answering via Sentence Composition (arXiv preprint)
Reference: ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction
Reference: ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT
Reference: arxiv.org
Reference: arxiv.org
Reference: Introduction to Information Retrieval
Reference: Evaluation measures (information retrieval)
Reference: REPLUG: Retrieval-Augmented Black-Box Language Models
Reference: arxiv.org
Reference: HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering
Reference: HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering (arXiv preprint)
Reference: HotpotQA Official Dataset and Leaderboard
Reference: Dense Passage Retrieval for Open-Domain Question Answering
Reference: Dense Passage Retrieval for Open-Domain Question Answering (arXiv preprint)
LectureBank Dataset
MOOC-CS Prerequisite Benchmark
QASC Question Answering Benchmark
Late-Interaction Neural Retrieval
Recall@k Retrieval Metric
RePlug Retrieval-Augmented Black-Box Language Model
HotpotQA Multi-Hop QA Benchmark
Single-Vector Dense Passage Retrieval
Reference: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Reference: How Significant Are the Real Performance Gains? An Unbiased Evaluation Framework for GraphRAG
Reference: arxiv.org
Unbiased GraphRAG Evaluation Framework (Zeng et al., 2025)
Reference: RAG vs. GraphRAG: A Systematic Evaluation and Key Insights
Reference: arxiv.org
RAG vs Graph-RAG Controlled Comparison (Han et al., 2025)
Reference: Controlled Retrieval-augmented Context Evaluation for Long-form RAG
Reference: Controlled Retrieval-augmented Context Evaluation for Long-form RAG (ACL Anthology)
CRUX Controlled RAG Context Evaluation (Ju et al., 2025)
Reference: Anytime Heuristic Search
Reference: The Anatomy of a Large-Scale Hypertextual Web Search Engine
Reference: Topic-Sensitive PageRank
Reference: Scaling Personalized Web Search
Reference: Local Graph Partitioning using PageRank Vectors
Reference: Monitoring and control of anytime algorithms: A dynamic programming approach
Reference: Monitoring and Control of Anytime Algorithms
Reference: jair.org
Reference: Topic-Sensitive PageRank: A Context-Sensitive Ranking Algorithm for Web Search
Reference: www.sciencedirect.com
Reference: dl.acm.org
Reference: ieeexplore.ieee.org
Reference: Monitoring and Control of Anytime Algorithms: A Dynamic Programming Approach
Anytime Algorithms
Reference: The anatomy of a large-scale hypertextual Web search engine
Reference: Recovering Concept Prerequisite Relations from University Course Dependencies
Reference: Measuring Prerequisite Relations Among Concepts
Reference: Inferring Concept Prerequisite Relations from Online Educational Resources
Reference: Inferring Concept Prerequisite Relations from Online Educational Resources (arXiv preprint)
Reference: PREREQ-IAAI-19 (official code release)
Reference: Heterogeneous Graph Neural Networks for Concept Prerequisite Relation Learning in Educational Data
Reference: aclanthology.org 2021.naacl-main.164
Pan et al. (2017) Prerequisite Relation Learning for Concepts in MOOCs
Reference: LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models
Reference: LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression
Reference: Packing the Meeting Summarization Knapsack
Reference: Resources for Brewing BEIR: Reproducible Reference Models and Statistical Analyses
Reference: Show Your Work: Improved Reporting of Experimental Results
Reference: We Need to Talk about Standard Splits
Reference: The Benchmark Lottery
Reference: Data Statements for Natural Language Processing: Toward Mitigating System Bias and Enabling Better Science
Reference: Datasheets for Datasets
Reference: Model Cards for Model Reporting
Reference: MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
Reference: Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Reference: sentence-transformers/all-MiniLM-L6-v2 (Model Card)
Reference: arxiv.org abs/2002.10957
MiniLM Deep Self-Attention Distillation (Wang et al., 2020)
Reference: arxiv.org abs/1908.10084
Sentence-BERT Siamese Sentence Embedding Framework (Reimers & Gurevych, 2019)
Reference: From Local to Global: A Graph RAG Approach to Query-Focused Summarization
Reference: arxiv.org abs/2404.16130
Reference: LightRAG: Simple and Fast Retrieval-Augmented Generation
Reference: arxiv.org abs/2410.05779
Reference: KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation
Reference: OpenSPG/KAG GitHub Repository
Reference: arxiv.org abs/2409.13731
KAG: Knowledge Augmented Generation Framework (Liang et al., 2024)
Reference: GraphRAG (open-source implementation)
GraphRAG Framework (Edge et al., 2024)
Reference: Scaling Personalized Web Search (Stanford InfoLab Technical Report)
Reference: Bootstrap Methods: Another Look at the Jackknife
Reference: An Introduction to the Bootstrap
Reference: Statistical Significance Tests for Machine Translation Evaluation
Reference: projecteuclid.org journals/annals-of-statistics/volume-7
Nonparametric Bootstrap Resampling (Efron, 1979)
Reference: A Simple Sequentially Rejective Multiple Test Procedure
Reference: Multiple Hypothesis Testing
Family-Wise Error Rate (FWER)
Reference: Introduction to Information Retrieval, Chapter 8: Evaluation in information retrieval
Reference: dl.acm.org doi/10.1145/3626772.3657862
Reproducible IR Benchmarking and Evaluation-Variance Control
Reference: Improving Reproducibility in Machine Learning Research (A Report from the NeurIPS 2019 Reproducibility Program)
Reference: aclanthology.org q18-1041
Auditable-Artifact Reporting Standards: Data Statements, Datasheets, Model Cards, Reproducibility Checklists
Research Paper: Advanced Prompting
Reference: Advanced Prompting
Reference: Identification of Causal Effects Using Instrumental Variables
Reference: Econometric Analysis of Cross Section and Panel Data, 2nd Edition
Reference: Mostly Harmless Econometrics: An Empiricist's Companion
Reference: www.tandfonline.com doi/abs/10.1080
Reference: Assessing the Effect of an Influenza Vaccine in an Encouragement Design
Reference: Causal Inference, Path Analysis, and Recursive Structural Equations Models
Reference: academic.oup.com biostatistics/article-abstract/1
Reference: A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity
Reference: Some Heteroskedasticity-Consistent Covariance Matrix Estimators with Improved Finite Sample Properties
Reference: Using Heteroscedasticity Consistent Standard Errors in the Linear Regression Model
Reference: www.jstor.org stable/1912934
Reference: Identification and Estimation of Local Average Treatment Effects
Reference: Accounting for No-Shows in Experimental Evaluation Designs
Reference: journals.sagepub.com doi/10.1177/0193841x8400800205
Reference: www.econometricsociety.org publications/econometrica/1994
Reference: Authorship Attribution
Reference: Explanation in Computational Stylometry
Reference: dl.acm.org doi/10.1561/1500000005
Stylometric Analysis
Reference: MTLD, vocd-D, and HD-D: A validation study of sophisticated approaches to lexical diversity assessment
Reference: Certain Language Skills in Children: Their Development and Interrelationships
Reference: Studies in Language Behavior: I. A Program of Research
Reference: link.springer.com article/10.3758/brm.42.2.381
Reference: A Vector Space Model for Automatic Indexing
Reference: Term-weighting approaches in automatic text retrieval
Reference: Introduction to Information Retrieval, Chapter 6: Scoring, term weighting and the vector space model
Reference: dl.acm.org doi/10.1145/361219.361220
Vector Space Model (Information Retrieval)
Reference: ROUGE: A Package for Automatic Evaluation of Summaries
ROUGE-L Metric
Reference: The Distribution of the Flora in the Alpine Zone
Reference: Introduction to Information Retrieval (Chapter 3: Tolerant retrieval; k-gram indexes and Jaccard coefficient)
Reference: On the resemblance and containment of documents
Reference: nph.onlinelibrary.wiley.com doi/10.1111/j.1469-8137.1912.tb05611.x
Jaccard Similarity Coefficient
Reference: An Empirical Investigation of Statistical Significance in NLP
Reference: Multiple Comparisons Among Means
Reference: Multiple Comparison Procedures
Reference: onlinelibrary.wiley.com doi/book/10.1002
Family-Wise Error Rate in Multiple Hypothesis Testing
Reference: www.jstor.org stable/2282330
Reference: Introduction to Algorithms, Third Edition — Chapter 22.2: Breadth-first search
Reference: The shortest path through a maze
Breadth-First Search (BFS) on Graphs
Reference: MOOCCubeX: A Large Knowledge-centered Repository for Adaptive Learning in MOOCs
Reference: THU-KEG/MOOCCubeX (official code and data release)
Reference: dl.acm.org doi/10.1145/3459637.3482010
MOOCCubeX Large-Scale MOOC Concept-Graph Resource (Yu et al., 2021)
Auditable Strict-Parity Evaluation of Prerequisite-Graph Retrieval for RAG under Leakage Controls
Reference: Auditable Strict-Parity Evaluation of Prerequisite-Graph Retrieval for RAG under Leakage Controls
Reference: arxiv.org abs/2506.20051
CRUX: Controlled Retrieval-Augmented Context Evaluation for Long-Form RAG (Ju et al., 2025)
Reference: Using Anytime Algorithms in Intelligent Systems
Reference: List-aware Reranking-Truncation Joint Model for Search and Retrieval-augmented Generation
Reference: A comparison of statistical significance tests for information retrieval evaluation
Reference: dl.acm.org doi/10.1145/1321440.1321528
Paired Bootstrap Significance Testing and Confidence Intervals for Retrieval Evaluation (Smucker, Allan, and Carterette, 2007)