Concept
What is data cleaning and why is it important?
Data cleaning is the process of identifying and removing errors from data sets. It matters because models follow the “garbage in, garbage out” principle: erroneous data leads to erroneous conclusions. Errors can be identified with rules, pattern matching, statistical techniques, and other methods. Common errors include incomplete rows, duplicate rows, and incorrectly labeled data.
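As a rough illustration of rule-based cleaning, the sketch below flags the three error types named above in a small made-up data set. The records, the field names (`id`, `height_cm`, `label`), the allowed label set, and the `clean` helper are all assumptions chosen for the example, not part of any particular library or workflow.

```python
# Hypothetical toy records; field names and the allowed label set are assumptions.
records = [
    {"id": 1, "height_cm": 170.0, "label": "adult"},
    {"id": 2, "height_cm": None, "label": "adult"},    # incomplete row
    {"id": 1, "height_cm": 170.0, "label": "adult"},   # duplicate of the first row
    {"id": 3, "height_cm": 95.0, "label": "chld"},     # label outside the allowed set
    {"id": 4, "height_cm": 102.0, "label": "child"},
]
ALLOWED_LABELS = {"adult", "child"}


def clean(rows, allowed_labels):
    """Return (kept, rejected); rejected pairs each dropped row with a reason."""
    seen = set()
    kept, rejected = [], []
    for row in rows:
        # A hashable fingerprint of the whole row, used to detect exact duplicates.
        key = tuple(sorted(row.items(), key=lambda kv: kv[0]))
        if any(value is None for value in row.values()):
            rejected.append((row, "incomplete"))
        elif key in seen:
            rejected.append((row, "duplicate"))
        elif row["label"] not in allowed_labels:
            rejected.append((row, "invalid label"))
        else:
            seen.add(key)
            kept.append(row)
    return kept, rejected


kept, rejected = clean(records, ALLOWED_LABELS)
print([row["id"] for row in kept])
print([reason for _, reason in rejected])
```

Keeping the rejected rows together with a reason, rather than silently dropping them, makes it easier to review whether a rule is too strict. In practice a mislabeled or incomplete row might be corrected instead of discarded.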
Updated 2020-10-09
Tags
Data Science