Concept
Data leakage
Data leakage occurs when the training set contains information that should not be there, namely information the model would not have access to when it makes predictions in the future. The result is a model that looks overly accurate on the training data but is far less effective once it is tested. A large gap between training performance and test performance is a sign that leakage may be present. To prevent data leakage, check carefully which data goes into the training set and whether each piece of it would actually be available at prediction time.
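The two checks described above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive method: the dataset, the column names (`age`, `collections_calls`, `defaulted`), the helper names, and the thresholds are all hypothetical. It assumes that a feature correlating almost perfectly with the label deserves a closer look, and that a large train/test score gap is worth investigating. Neither check proves leakage on its own.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def suspicious_features(rows, label, threshold=0.95):
    """Return feature names whose absolute correlation with the label meets the threshold.

    The threshold is an assumed heuristic; a flagged feature still needs a manual
    check of whether it would be known at prediction time.
    """
    ys = [row[label] for row in rows]
    flagged = []
    for name in rows[0]:
        if name == label:
            continue
        xs = [row[name] for row in rows]
        if abs(pearson(xs, ys)) >= threshold:
            flagged.append(name)
    return flagged


def leakage_gap(train_score, test_score, tolerance=0.10):
    """Return True when the training score exceeds the test score by more than tolerance."""
    return (train_score - test_score) > tolerance


# Hypothetical loan data: "collections_calls" is only recorded after a default,
# so it would not be available when predicting for a new applicant.
rows = [
    {"age": 25, "collections_calls": 0, "defaulted": 0},
    {"age": 40, "collections_calls": 3, "defaulted": 1},
    {"age": 31, "collections_calls": 0, "defaulted": 0},
    {"age": 52, "collections_calls": 3, "defaulted": 1},
    {"age": 46, "collections_calls": 0, "defaulted": 0},
    {"age": 29, "collections_calls": 3, "defaulted": 1},
]

print(suspicious_features(rows, "defaulted"))
print(leakage_gap(train_score=0.99, test_score=0.70))
```

In this toy data the `collections_calls` column mirrors the label exactly, so it is the one flagged for review, while `age` is not. The correlation check only catches simple, linear leaks in numeric features; deciding whether a column is legitimately available at prediction time still requires knowing how the data was collected.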
Updated 2020-10-09
Tags
Data Science