Concept
Improved Concept Embeddings for Learning Prerequisite Chains: Training set
A corpus of 7,472 text files in total, consisting of two parts:
- LectureBank (Li et al., 2018): a manually collected dataset of 1,352 lecture slide presentations from 60 courses covering 5 domains: Natural Language Processing (nlp), Machine Learning (ml), Artificial Intelligence (ai), Deep Learning (dl), and Information Retrieval (ir).
- TutorialBank (Fabbri et al., 2018): a manually collected dataset of over 6,000 resources, ranging from HTML pages (.txt) to lecture slides and textbooks (.pdf, .pptx, and .ppt), mainly in the domain of NLP.
Updated 2020-08-04
Tags
Data Science