Reference

Leakage in Data Mining: Formulation, Detection, and Avoidance

This article offers a practical approach to the problem of data leakage in machine learning. The authors' two-step procedure, first identifying leakage and then preventing it, is essential for machine learning practitioners who want models that generalize well.

Kaufman, S., Rosset, S., & Perlich, C. (2011). Leakage in data mining: Formulation, detection, and avoidance. Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '11). doi:10.1145/2020408.2020496

http://www.cs.umb.edu/~ding/history/470_670_fall_2011/papers/cs670_Tran_PreferredPaper_LeakingInDataMining.pdf

Updated 2020-09-17

Tags

Data Science